Investigating Racial Bias and Attribution Error in Grading Student Performance

doi:10.21203/rs.3.rs-4350386/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

Standards-based grading (SBG) is a more recent approach to grading that aims to reduce the impact of teacher biases that affect grading. This study investigates whether SBG effectively mitigates biases related to race and attribution errors that can distort traditional grading methods. To achieve this, a quantitative factorial vignette experiment was conducted to analyze the qualitative feedback given on student performance, shedding light on teachers' evaluative reasoning under SBG. Findings indicate that despite the structured framework of SBG, the evaluative process remains susceptible to a wide range of influencing factors, though there were no significant findings related to racial bias.

The results of the factorial vignette experiment underscored that while SBG aims to objectify the grading process, some biases may still affect teachers' evaluations, highlighting the intricate web of factors involved in the evaluation process. The paper contributes significantly to the ongoing discourse surrounding modern grading systems, emphasizing the need for continuing evolution in grading methodologies to ensure fairness and objectivity in student assessments.

Grading

Standards-based Grading

Competency-based Grading

Cognitive Bias

Racial Bias

Attribution Error

Heuristics

Findings from this study of standards-based grading suggest that teachers have a cognitive bias against underperforming students. When teachers assessed the work of a student who was underperforming, their feedback suggested that the student’s writing skills could not be improved; in contrast, after reading a positive profile of a student, teachers encouraged the student to grow as a writer. Altogether, though the authors believe standards-based grading has merit, it is not immune to educator bias.

Various grading systems have gained traction in recent years to make grading more useful for student learning. These systems are based on educators' professional assessments of student performance and claim to offer more detailed, timely, and customized evaluations of student work and progress (Buckmiller et al., 2017; Knight & Cooper, 2019; Muñoz & Guskey, 2015). They are often described as competency-based grading, portfolio-based grading, and standards-based grading (DeCastro-Ambrosetti & Cho, 2005), with standards-based grading receiving the most widespread application (Buckmiller et al., 2020; Erickson, 2011; Iamarino, 2014).

In standards-based grading, teachers consider student work in relation to a set standard to determine a student's proficiency. Supporters of standards-based grading argue that it is less susceptible to teacher subjectivity than conventional grading approaches, where teachers might assign grades like "A" or "C" based on personal feelings or perceptions about a student (Feldman, 2019; Quinn, 2020; Townsley & Wear, 2020). This research investigates this claim by assessing whether standards-based grading systems can effectively mitigate teacher biases, particularly those related to race and non-academic factors, like attribution error. Racial bias involves unconscious associations between a student's race and academic capabilities, while attribution error occurs when teachers unconsciously factor in unrelated student attributes when assessing their work.

Standards-based Grading, Defined

Standards-based grading refers to the collaborative process of collecting and interpreting student-produced evidence to give students an accurate perspective of their proficiency, interpersonal skills, and emotional wellness (Knight & Cooper, 2019; Link & Guskey, 2022). Standards-based grading is intended to assess how well a student's proficiency measures up to a standard and is represented by a categorical label such as exceeds or meets (Guskey, 2014).

Teachers review a body of evidence against a standard and consider the relevance of any recent evidence to determine a student's mastery of standards (Brookhart et al., 2019). When teachers think about how the existing body of evidence and recent scores predict future proficiencies in student development, teachers are better able to provide a more accurate grade based on the evidence of learning students produce.

Teachers communicate their judgment of a student's growth and proficiency when reporting their performance. Standards-based reporting presents grade predictions and learning trajectories based on visible evidence of learning. These trajectories evolve with learning, making the grade book a dynamic resource for reflection, calibration, and feedback for both students and teachers.

Heuristics

Often described as cognitive shortcuts or rules of thumb, heuristics serve as mental tools people employ to simplify complex decision-making tasks or problem-solving endeavors (Kahneman et al., 2002, 2021). These cognitive strategies provide the advantage of efficiency, allowing individuals to navigate the vast volume of information encountered daily more swiftly. However, they also carry certain drawbacks, as an increased reliance on such mental shortcuts heightens the risk of errors or susceptibility to cognitive biases (Kahneman et al., 2021).

In education, one crucial area where heuristic-driven biases can have a significant impact is the process of teachers evaluating students (Kahneman et al., 2021). Teachers may inadvertently employ various heuristics in their grading processes, potentially leading to errors or biases (Kahneman et al., 2016). Further, teachers might resort to heuristics to make speedy decisions due to time constraints or the ongoing demand for classroom attention from students (Gilovich et al., 2002).

As a case in point, the "shifting standards bias" may emerge when teachers unconsciously adjust their grading criteria based on students' characteristics or backgrounds. Additionally, the "expectation bias" might come into play when teachers' grading is influenced by their preconceived expectations of a student's performance rather than the objective quality of the work itself. As such, relying on heuristics can impact how teachers grade student work (Zhan et al., 2021). Both internal (e.g. personal dispositions, values, and expectations) and external (e.g. surroundings and occurrences) elements can influence teacher decision-making. Factors like the time of classes (morning or afternoon), sleep duration, classroom environment, or preceding educational experiences can influence a teacher's judgment (Kahneman et al., 2016).

When teachers sit down with a pile of student essays, internal and external factors may influence their grading. In this regard, Kahneman et al. (2021) state, "Performance evaluations are greatly inconsistent and rely more on the evaluator than on the actual performance being evaluated" (p. 7). This inconsistency leads to unfair grades for students, with implications for their future coursework and potential careers. White individuals tend to lean towards the belief that evaluations mirror the values of the system rather than the personal judgment of an individual, although this assumption doesn't consistently hold true (Kahneman et al., 2021). As depicted in Figure 1, numerous factors impact the judgment of teachers, implying that a student's assigned score may not invariably represent their comprehension of the subject matter.

Attribution Error

We now turn to a specific type of error that influences teacher judgment: attribution error, illustrated in the upper right of Figure 1. According to Harvey et al. (2014), Fritz Heider (1958), one of the early researchers in the field of attribution theory, depicted individuals as “naive psychologists, inherently intrigued by comprehending the roots of success and failure” (p. 128). In other words, people naturally seek to comprehend why events unfold as they do. Nevertheless, the reasons they attribute to why events occur are not always correct. Attribution error, or fundamental attribution error, refers to the tendency to over-emphasize personal characteristics and ignore situational factors when judging others' behavior (Weiner, 1980, 2012).

Additionally, Wang and Hall (2018) state that, "the actors [in an event], e.g. students, tend to attribute [behaviors, outcomes] more to situational forces or constraints, whereas the observers [of the same event], e.g. teachers, are more likely to attribute [behaviors, outcomes] to the actor’s capabilities" (p. 15). Such misattributions can result in a skewed understanding of the event, engendering biases and inconsistencies (Beckman & Rodriguez, 2021; Dweck, 2018; Kelley, 1967).

Looking closer at attribution error in the classroom environment, this tendency might translate into associating a student's subpar performance solely with their supposed lack of ability or effort, overlooking other potential factors like inadequate teaching methods or limited access to resources (Cohen, 2022; Graham, 2020). Within the grading process, attribution bias may occur when the teacher focuses more on who the student is than on the quality of their work, for example by giving high grades to a student who typically does well even if the current piece of work is low quality. Similarly, if a teacher scores a student’s work as high quality and provides feedback implying that the performance was due to the student’s innate intelligence (internal, stable factors), this could be an incorrect assumption (or attribution). Or if a teacher scores a student’s work low and attributes the outcome to lack of effort (internal, unstable), this misattribution may lead the teacher to provide erroneous feedback.

Overall, attribution error can be problematic when the teacher places more emphasis on either personal internal factors or external factors than on the actual quality of the student's work. Research on attribution error studies how individuals articulate their perceptions regarding behavior, outcomes, and the causality of incidents (Harvey et al., 2014; Heider, 1958; Kelley, 1967; Weiner, 1974, 1980; Carson, 2019). These studies elucidate how individuals interpret success or failure, even when such explanations might not be immediately apparent to the people themselves (Hollyforde & Whiddett, 2002).

Racial Bias

The second type of bias we investigate in this study is racial bias. Implicit racial biases are "associations made by individuals in the unconscious state of mind [that] cause individuals to unknowingly act in discriminatory ways" (Maryfield, 2018, p. 1), and activated racial bias often positions people of color as potentially vulnerable (Bodenhausen et al., 2010; Fiske, 1998, 2015; Rogers et al., 2020). In the presence of individuals of a specific race or ethnicity, implicit biases can automatically activate (Fazio, 2001; Herring, 2013; Koppehele-Gossel, 2020). Implicit bias research indicates that many White Americans possess biases that favor their own ethnicity and display biases against Black individuals (Greenwald & Farnham, 2000).

Despite some incremental progress, the prevailing image of a school student in the United States remains that of a White, middle-class student (Lewis & Diamond, 2015; Preston, 2007). This bias toward privileging White students is evident across various dimensions of the education system, including practices, policies, and personnel. Additionally, racial bias can manifest itself through policies that appear race-neutral on the surface, exclusionary teaching methods, and discriminatory grading practices (Lewis & Diamond, 2015; Paslay, 2021). It is worth noting that some studies (Babad, et al., 1982; Starck et al., 2020) suggest that educators exhibit racial biases similar to those of the average American.

Just like anyone else, a teacher's ability to suppress implicit biases can be compromised by various physical and psychological demands, including factors like time constraints, resource limitations, and stress. The lack of resources and time can lead teachers to create inaccurate representations of their students, which may affect their pedagogy (Spencer et al., 2016). For example, teachers may refer students from the racial majority to specific programs more often than their minority counterparts (Tenenbaum & Ruck, 2007); this is compounded since the majority of teachers are white (Howard, 2006). Jacoby et al. (2016) hypothesized "that greater implicit bias increases whites' anxiety when teaching Black students, and that the resultant distraction and depletion will diminish the quality of their instruction and, subsequently, student learning" (p. 52).

Several studies found substantial differences in the performance evaluation of students of color when the teacher subconsciously activated a general stereotype that African-American and Latino students do not score better than their White and Asian peers (Ready & Wright, 2011). Jacoby et al. (2015) recruited undergraduates for a study that included cross-race and same-race lesson pairs. The researchers assigned the White participant to the instructor role and a Black or White participant to the learner role. Instructors then taught a lesson to the learners. After the lesson, the participants received a five-minute discussion period. Immediately following this period, the participants were separated. The learner then had five minutes to complete an exam, while instructors were measured for explicit bias. Their findings suggest that White instructors displayed more observable anxiety-related behaviors when teaching Black learners, such as increased speech incoherence, lack of eye contact, and the instructors' choice of physical positioning in the classroom.

Warikoo et al. (2016) make three points. First, negative implicit associations toward low-achieving groups (stereotypes) often abound in classrooms "not only because they are automatic and difficult to control, but also because they are pervasive" (p. 2). Second, well-intentioned teachers may "sometimes act on unconscious biases towards students from stigmatized groups" (p. 3). Third, "implicit racial associations consistently correlate with problematic feelings and behaviors that emerge during interracial interactions" (p. 3). This may affect student performance and perpetuate problematic feelings between the teacher and the student.

Similarly, studies conducted by Fox (2015) and Joshi et al. (2018) explored how teacher-student racial congruence influenced a teacher's assessment of student performance. Oates (2003) identified disparities in teacher evaluations of student performance, particularly pronounced in situations where a white teacher assessed a Black student. Another study suggested that racially incongruent teacher-student relationships may indirectly contribute to poorer performance among minorities, as ethnic majority teachers appeared to have lower expectations for ethnic minority students (van Ewijk, 2011). According to Bonefeld et al. (2020), "the judgment made by teachers about students' performance did differ in terms of student characteristics that were unrelated to performance, such as immigration background and gender, in addition to differing on performance-related variables" (p. 198).

Although the literature demonstrates racial disparities in expectations and evaluation of students and academic achievements (e.g., Irizarry, 2015; Yates & Marcelo, 2014; Reardon et al., 2019), other studies exploring educator racial bias show inconclusive results. For example, Pigott and Cowen (2000) explored the extent to which perceived student race factors into teachers' judgment of student work. To illustrate, if a teacher is negatively biased toward students of color, they may see those students as less talented than White students, which may cause them to judge the student's work based on that perception (Ferman & Fontes, 2021; Jussim, 1989; Jussim et al., 2020). Indeed, "any internalized racial prejudice can activate biases and lead teachers to use discriminatory performance evaluations" (Wood & Graham, 2010, p. 177).

In this section we discuss our sample size, all manipulations, and all measures in the study. Analyzing teacher feedback to students alongside their standards-based assessments has offered one way to examine some of the claims of proponents of standards-based grading. This research centered on examining two types of bias, racial bias and attribution error, and their influence on teachers' grading decisions within the context of standards-based grading models, and it is part of a larger study of cognitive bias in rubric evaluation (Author, 2022). These two biases are of utmost importance to investigate since standards-based grading is designed to mitigate grading bias (Iamarino, 2014; Muñoz & Guskey, 2015). The overarching research question guiding this study was: To what extent are teachers influenced by their cognitive biases when using standards-based rubrics to evaluate student performance? Specifically, to examine potential attribution error processes: How does a student's implied race or learning profile relate to the content conveyed in teacher feedback?

Research Design: Factorial Vignette Experiment Study

Factorial survey experiments “ask participants to respond to hypothetical objects or situations (vignettes)” (Sauer et al., 2020, p. 196), and are a common tool used in the social sciences, perhaps because “they combine experimental design features (i.e. randomization) with the advantages of heterogeneous respondent samples” (Sauer et al., 2020, p. 196). Factorial designs featuring random assignment improve upon correlational regression analyses from observational data because they can establish causation between studied variables (Mutz, 2011).

In this study, one independent variable tested for its effect on rubric scores was profile information (positive tone, negative tone, and name only), and the other independent variable was implied student race (names likely to be attributed to White and Black individuals). The dependent variable was the teacher's score of the student's writing mastery. Since there were two independent variables, one with three options and one with two options, there were six resulting conditions (see Table 1). The 3 × 2 factorial design was used to estimate the following: the main effect of profile information on student scores; the main effect of implied race on student scores; the mean scores given to negative, positive, and name-only profiles, and Jamal (intended to be read as a Black student) and Jake (intended to be read as a White student) profiles. Further, interaction effects between the two main manipulated variables (profile type and student race) were estimated. Including interaction effects in the statistical model means that one independent variable's effect on the dependent variable is theoretically expected to depend on the other independent variable. For example, it helped to understand if the effect of profile tone on student grades depended on implied race or if the effect of implied race on student grades depended on the profile tone.
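To make the crossing of the two factors concrete, the six conditions can be enumerated as in the short sketch below. The variable and function names are illustrative assumptions and do not come from the study's materials.

```python
from itertools import product

# Factor levels as described in the text; the identifiers are hypothetical.
PROFILE_TYPES = ["name only", "positive tone", "negative tone"]
IMPLIED_NAMES = ["Jake", "Jamal"]


def build_conditions(profiles, names):
    """Cross every profile type with every name (a full factorial design)."""
    return [{"profile": p, "name": n} for p, n in product(profiles, names)]


conditions = build_conditions(PROFILE_TYPES, IMPLIED_NAMES)

# Print one line per condition, numbered from 1.
for index, condition in enumerate(conditions, start=1):
    print(index, condition["name"], "/", condition["profile"])
```

Each participant sees exactly one of these cells, which is what allows the main effects and the interaction to be estimated separately.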

To comprehensively examine the grading process, including the analysis of feedback, a mixed-method approach has been employed. This approach involves quantitatively analyzing the frequency of various types of teacher feedback, shedding light on the most common comments made by teachers regarding student writing. This method aligns with the concept described by Small (2011) as quantized qualitative data analysis, which allows for a nuanced exploration of the feedback process. This study’s design and its analysis were not pre-registered.

Name Selection

To ensure that the names evoked the intended reaction, names were selected using the same criteria as in audit studies: drawing on names used in previous audit studies or on government population statistics (Butler & Crabtree, 2017; Gaddis, 2017a). Student names were selected from lists containing first names historically associated with White males (Jake) (Levitt & Dubner, 2014) or Black males (Jamal) (Bertrand & Mullainathan, 2004). To avoid overcomplicating the results, only first names were used in this study.

Profile Type Design

Similar to Pager's (2003) famous audit study, Mark of a Criminal Record, this experiment used three potential profiles (see Figure 2): the first was the student's name only; the second included the name and information about prior performance with a positive tone; the third included the name and information about prior performance with a negative tone. This additional piece of profile information was included so that it could activate various biases as the teacher scored the essay. Thus, this experiment is designed to examine several potential sources of bias as well as how they might interact with race.

Scoring Rubric Design

Each participant received the same standards-based grading rubric, developed by Stevenson High School, Lincolnshire, Illinois, to score the hypothetical student essay (see Figure 3). It is a four-point mastery rubric using the A.P. curriculum for Grade 11 Language Arts. During the A.P. junior English curricular team meetings, the team collaboratively developed the mastery scale and associated success criteria. The first author of the study was part of this team.

In this study, participants were asked to fill out the rubric for one essay and then comment on how the student might do on a hypothetical second essay (see Appendix D). This study's rubric contained the critical components of effective rubrics (Brookhart & Chen, 2015; Hasan, 2022; Marzano, 2011). Further, the essay used in this study was written by an actual student from the 2019 AP English Language course.

Sampling

The study involved a sample of 219 educators who self-identified as White, possessed teaching experience, and were within the typical employment age range. The determination of the target sample size, which was set at 200 educators, was based on a G*Power power analysis. This analysis considered a small minimum detectable effect size of 0.20, a power level of 0.80, a single numerator degree of freedom (attributable to the study's design with two levels), six distinct groups (associated with assignment to one of six scenarios), and no covariates. The calculated minimum required sample size was n = 199, and I chose to maintain a sample size close to this figure due to financial constraints.
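As a rough cross-check on the order of magnitude of this figure, the sketch below uses a large-sample normal approximation for a test with one numerator degree of freedom. It rests on assumptions not stated in the text: that the effect size of 0.20 is Cohen's f and that the test is two-sided at alpha = .05. Because it ignores the finite denominator degrees of freedom that an exact noncentral F calculation accounts for, it should be expected to land slightly below the reported n = 199. It is not a reproduction of the authors' calculation.

```python
import math
from statistics import NormalDist


def approx_total_n(effect_size_f, alpha=0.05, power=0.80):
    """Approximate total N for a 1-df effect via the normal approximation.

    Assumes effect_size_f is Cohen's f and the test is two-sided.
    The required noncentrality is (z_{1-alpha/2} + z_{power})^2, and for
    Cohen's f the noncentrality equals f^2 * N.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    required_noncentrality = (z_alpha + z_power) ** 2
    return math.ceil(required_noncentrality / effect_size_f ** 2)


n_total = approx_total_n(0.20)
print("Approximate total sample size:", n_total)
```

The point of the sketch is only that a target near 200 is consistent with a small effect, 80% power, and a single-degree-of-freedom contrast.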

The decision to specifically include White educators was made because the study in part focused on racial bias, particularly exploring potential racial bias from White individuals towards Black individuals. Given the prevalent White middle-class ideology often regarded as the norm in U.S. educational institutions (Emdin, 2021), the dynamics of prejudice and racial oppression predominantly flow from White to Black individuals. Consequently, I limited the study to White participants to investigate whether standards-based grading might not be as equitable a grading system as it is often promoted to be, partly due to the predominance of White teachers in the U.S. (Dee, 2004; Williams, 2008; Leonardo & Boas, 2021).

For data collection, the first author designed a survey utilizing Qualtrics software and implemented a survey flow that automatically assigned respondents to one of six randomly allocated groups. Participant recruitment and survey distribution were facilitated by the online platform Prolific, where informed consent was obtained. The platform recruited educators from all subject areas and levels of teaching experience, including both primary and secondary education. We complied with APA ethical standards in the treatment of our sample and received IRB approval.

Procedure, Instrumentation, and Assignment

The participants were randomly assigned one of the three profiles to read (positive tone, negative tone, or name only) and one of the two names, but all were given the same essay. The data system Qualtrics balanced the random assignment, resulting in six sub-groups with approximately 35 participants in each group. Random assignment and balancing aimed to control the effects of gender, rubric experience, subject taught, and age on how a participant would grade these essays. A balanced sample would indicate that the randomization process was effective and helped rule out variances among respondents (individual differences/characteristics/demographics) as the reason for their responses.
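Balanced random assignment of this kind can be sketched as block randomization: conditions are shuffled within blocks of six so that group sizes never differ by more than one. This is an illustration of the general technique under that assumption, not a description of how Qualtrics implements its randomizer. The function name and seed are hypothetical.

```python
import random
from collections import Counter

# The six name-by-profile cells of the design.
CONDITIONS = [
    (name, profile)
    for name in ("Jake", "Jamal")
    for profile in ("name only", "positive tone", "negative tone")
]


def balanced_assignment(n_participants, conditions, seed=None):
    """Assign participants to conditions in shuffled blocks.

    Each block contains every condition exactly once, in random order,
    so cell counts differ by at most one after truncation.
    """
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_participants:
        block = list(conditions)
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_participants]


assignments = balanced_assignment(219, CONDITIONS, seed=0)
counts = Counter(assignments)
for condition, count in sorted(counts.items()):
    print(condition, count)
```

Because assignment is independent of participant characteristics, differences between cells in gender, age, or subject taught arise only by chance.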

Table 1 illustrates that the sample was almost evenly divided by gender and by experience with rubrics. Specifically, 105 participants were male (48%) and 112 were female (51%), and roughly half had prior experience with rubrics. Regarding age, 91 participants fell into the 26–35 age group (42%), and the next largest age category was 18–25, comprising 75 participants (34%). Regarding subject expertise, the largest share of participants had experience teaching English, accounting for nearly 27% (59 participants), followed by math (15% or 33 participants), science (14% or 30 participants), and social science (8% or 17 participants). Additionally, 66 participants listed other subjects as their teaching experience.

The subgroups assigned to the Jake and Jamal scenarios were similar in composition, indicating that the random assignment effectively balanced these variables. For instance, among the 110 participants in the subgroup assigned to the Jake profile, 46 were male (41%), and 54 had previous rubric experience (49%). Furthermore, a majority of this subgroup (71%) were under age 35, with 31 participants aged 18–25 and 48 participants aged 26–35. In comparison, within the group assigned the Jamal profile, there were 59 males (54% of the 109 participants in this subgroup), 61 participants had rubric experience (56% of this subgroup), and the majority (70%) were under the age of 35, including 44 participants aged 18–25 and 43 participants aged 26–35. Across the Jake and Jamal profiles combined, 59 participants had experience teaching English, 33 had experience in math, 30 had experience in science, and 17 had experience in social studies, while 66 participants listed other subjects. In summary, the largest difference between the subgroups was gender, and this disparity arose solely from the random assignment process.

Measures

Independent variables were student race, signaled using two student names (Jake or Jamal), and profile type with the student’s previous performance information (name only, negative tone, or positive tone).

There were four quantitative ways the original study measured student performance: mastery scores on writing (on a scale of four, exceeds to still developing), thesis (on a scale of two, developed or not developed), evidence (on a scale of two, sufficient or not sufficient), and sophistication (on a scale of two, yes or no).

The study also measured student performance with feedback. There was an area on the rubric that asked participants to provide feedback to students. The feedback was used to determine to what the participating teacher attributed success or non-success.

Data were analyzed using Qualtrics (2022). We first discuss analysis of the scores and then analysis of the teacher feedback.

Quantitative Data Analysis

Ordered logistic regression (UCLA ARC, 2022) was deemed appropriate for this study’s ordinal dataset. The following significance levels were adopted:

1) A p-value of less than .05 for the implied race indicates that student race has a statistically significant effect on teachers’ assigned scores.

2) A p-value of less than .05 for profile type indicates that profile type has a statistically significant effect on teachers’ assigned scores.

3) A p-value of less than .05 for the interaction between implied race and achievement profile indicates that the effect of student race depends on the profile type.

Using ordinal regression analysis, we could assess the effect of implied race and profile information on teachers’ assigned scores while detecting any interaction between implied race and profile information.
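The model just described can be written out as a proportional-odds (ordered logit) specification. The form below is a sketch under assumptions: the dummy coding (Jake and the name-only profile as reference categories), the sign convention, and the symbols are illustrative choices, not the parameterization reported by the authors.

```latex
\[
\operatorname{logit} P(Y_i \le j)
  = \alpha_j - \bigl(
      \beta_1\,\mathrm{Jamal}_i
    + \beta_2\,\mathrm{Pos}_i
    + \beta_3\,\mathrm{Neg}_i
    + \beta_4\,\mathrm{Jamal}_i\,\mathrm{Pos}_i
    + \beta_5\,\mathrm{Jamal}_i\,\mathrm{Neg}_i
    \bigr),
  \qquad j = 1, 2, 3
\]
```

Here $Y_i$ is the four-level mastery score assigned by teacher $i$, the $\alpha_j$ are ordered thresholds between adjacent score categories, $\beta_1$ captures the implied-race effect, $\beta_2$ and $\beta_3$ capture the profile-type effects, and $\beta_4$ and $\beta_5$ capture the interaction tested in criterion 3.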

Qualitative Data Analysis

The qualitative data were derived from open-ended responses collected from participating teachers, typically consisting of 2–3 sentences each. This data collection occurred as the final part of the student feedback process, where participants were requested to provide feedback to the student regarding a hypothetical upcoming essay. We utilized qualitative coding techniques to capture the core and essential elements of the feedback segments (Saldana, 2021). The data were interpreted within Weiner's Attribution Model (as depicted in Table 2). Weiner's Model of Attribution (1974) is a widely cited framework in discussions related to attributions, with Weiner frequently employing it in educational contexts (1972, 1974, 2012, 2018). Weiner proposes "that not only do the success or failure of activities induce feelings of pride or shame, but also the interpretations that individuals assign to the reasons for success or failure" (Weiner, 1980, as cited in Hollyforde & Whiddett, 2002, p. 31). Weiner's model encompasses two overarching dimensions: the locus of causality and stability. The locus of causality shows whether the cause of success or failure is attributed to internal or external factors. Stability, in turn, evaluates whether the cause of success or failure is consistent (such as skills or task difficulty) or variable (like effort or luck). Table 2, provided in the Appendix, delineates the key characteristics of Weiner's model. Two rounds of coding were conducted to ensure a thorough analysis, allowing for the identification and emphasis of the most pertinent features within the feedback (Coffey & Atkinson, 1996).
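The two dimensions above form a 2 × 2 grid, and quantizing the coded feedback amounts to counting how often each cell appears. The sketch below illustrates that tallying step only. The example codes are hypothetical placeholders, not data from the study, and the cell labels follow the examples given in the text (skills, task difficulty, effort, luck).

```python
from collections import Counter

# Weiner's two dimensions: (locus of causality, stability) -> typical example.
WEINER_MODEL = {
    ("internal", "stable"): "ability or skill",
    ("internal", "unstable"): "effort",
    ("external", "stable"): "task difficulty",
    ("external", "unstable"): "luck",
}


def quantize(coded_feedback):
    """Return (count, proportion) for every cell of the attribution grid."""
    counts = Counter(coded_feedback)
    total = sum(counts.values())
    return {
        cell: (counts[cell], counts[cell] / total if total else 0.0)
        for cell in WEINER_MODEL
    }


# Hypothetical codes assigned to four feedback comments (illustration only).
example_codes = [
    ("internal", "unstable"),
    ("internal", "stable"),
    ("internal", "unstable"),
    ("external", "stable"),
]

summary = quantize(example_codes)
for cell, (count, share) in summary.items():
    print(cell, WEINER_MODEL[cell], count, round(share, 2))
```

Running the same tally separately for each of the six conditions would give the kind of per-condition attribution distribution that the findings compare.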

Limitations

This study adopted a factorial vignette survey to explore bias susceptibility in a standards-based grading system and its impact on teachers' student performance assessments, representing a simplified real-world teaching and grading scenario. However, it has several limitations worth discussing.

First, the essay employed is a retired A.P. exam student sample, not tailored for this study, potentially misaligning with some rubric criteria and affecting final grades. Also, the language in the rubric criteria might not reflect teachers' typical proficiency descriptions regarding thesis, evidence, and sophistication. There was limited time to address inter-rater reliability issues, and a lack of prior collaboration or common mastery definition for participants regarding the rubric. The inability to ensure consistent rubric or mastery scale use by evaluators is a significant limitation. As Kahneman et al. (2021) highlighted, evaluators might agree on performance proficiency but assign varied scores due to differing rubric interpretations. Such differences might have subtly influenced essay scoring, with evaluators possibly prioritizing different writing aspects. Generally, the standards-based rubric may not resonate with participating teachers' experiences.

Another limitation was the choice of student names. Participants' interpretation of the student names might have varied despite efforts to select racially indicative ones (Jake and Jamal) (Conaway & Bethune, 2015; Gaddis, 2017a; Staats, 2016). Additionally, unlike in a true audit study, participants knew they were being studied, which might have influenced outcomes. Budget constraints and the need for prompt data collection might have limited the sample size and characteristics, affecting generalizability; for example, only White participants were recruited, some with limited rubric experience.

Lastly, the factorial vignette design itself has limitations when used to examine bias in educational settings. It simplifies complex human interactions and offers fewer constraints and incentives than real school-based decision-making, so it may not accurately reflect a real-world context (Fish, 2017, p. 331).

Findings

When we review the findings in Table 3, we first look at the most common attribution, which in all cases was internal unstable. This finding may not be surprising, as the field of education at its core is about growth and development. Next, we examine how the other attributions were distributed, and this is where we see some variance. Interestingly, when participants received the positive profile and neutral profile, the second most common attribution was internal stable, but with the negative profile, the second most common attribution for Jamal was external stable. One interpretation of this distribution is that the teacher was using feedback to protect Jamal from the realizations and feelings associated with an internal stable attribution, implicitly saying, “writing is just a hard thing to do, it’s not about you.” Another interpretation is that the teacher sought to subtly communicate that the standard of writing may be out of reach for a student like Jamal.

Moreover, when Jamal was assumed to be a good student (positive profile, neutral profile), the attribution shifted to internal stable, perhaps communicating, “you are a naturally good writer.”

These results also raise the question of whether participating teachers who received a positive profile were vulnerable to the halo effect heuristic (Dobelli, 2013; Kahneman et al., 2021), in effect saying, “I knew Jamal/Jake was a good student and therefore they must be a good writer.” For more details, see Table 3.

Racial Bias

Across the three profile types, the average mastery score was 2.52 for name only, 2.93 for the positive profile, and 2.43 for the negative profile. On a descriptive level, then, positive profiles were associated with higher mastery scores. In the Jake subsample, the overall mean score was 2.65, and the name-only, positive, and negative profiles received 2.57, 2.91, and 2.45, respectively. In the Jamal subsample, the overall mean score was 2.60, with 2.47 for name only, 2.94 for the positive profile, and 2.40 for the negative profile (see Table 1). Teacher scores for the essays assigned to Jamal were compared with those assigned to Jake, and no statistically significant differences were found.
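The text does not state which test was used for the Jake and Jamal comparison. As a rough illustration only, the sketch below assumes a Welch two-sample t-test computed from the Table 1 summary statistics (mean, SD, and N of the overall mastery level); the function name `welch_t` and the normal approximation of the p-value are assumptions of this sketch, not the authors' procedure.

```python
import math

# Summary statistics for overall mastery level (range 1-4), from Table 1.
jake = {"mean": 2.65, "sd": 0.67, "n": 110}
jamal = {"mean": 2.60, "sd": 0.86, "n": 109}


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite df from summary stats."""
    va = a["sd"] ** 2 / a["n"]  # squared standard error of group a's mean
    vb = b["sd"] ** 2 / b["n"]  # squared standard error of group b's mean
    t = (a["mean"] - b["mean"]) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (a["n"] - 1) + vb ** 2 / (b["n"] - 1))
    return t, df


t_stat, df = welch_t(jake, jamal)

# With roughly 200 degrees of freedom the t distribution is close to the
# standard normal, so a two-sided normal approximation is used for p.
p_approx = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"t = {t_stat:.2f}, df = {df:.0f}, approximate p = {p_approx:.2f}")
```

Under these assumptions the statistic is about 0.48, far from conventional significance thresholds, which is consistent with the null result reported above. Because the inputs are rounded to two decimals, the output is approximate.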

Attribution Error

Most commonly, educators attributed the performance to internal, unstable factors (students can get better at X); their comments suggested that the students could grow and learn the right practice (see Table 4). However, a second important pattern occurred when students were scored below mastery overall (1 or 2 on the mastery rubric). When teachers scored an essay lower, they were more likely to attribute the performance to external, stable factors, noting that a fixed standard of writing quality must be met. Their comments to students seemed to suggest that the writing did not meet these high standards, and they did not offer suggestions or hope for improvement. In contrast, when students scored mastery or above, the teachers were more likely to attribute the performance to internal, stable factors. Their comments suggested that writing is a skill that some people have, “…and you have it!!” These comments seemed to imply that students had a fixed talent for writing. A final important pattern is that when participants received a positive profile (Jake/Jamal are good students), they tended to attribute the student's success to internal, stable conditions, e.g., that Jake and Jamal are naturally smart.

Analyzing the feedback by profile type yields findings similar to those for the feedback overall: the largest share of study participants attributed the performance to internal, unstable factors, using language in their feedback that signaled they felt the student could learn and grow (see Tables 4 and 5). One statistically significant difference, however, occurred when the participants received a positive profile. In these cases, participants' feedback was more likely to attribute the performance to internal, stable factors, or fixed ability. For example, one wrote, “See I knew you were a smart kid!” In contrast, when participants received a negative profile or name-only profile, they were more likely to attribute the performance to external, stable factors, noting that “writing an essay is hard,” without suggesting ways to improve.
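The text does not specify the test behind this significant difference. One plausible reading, sketched here as an assumption and not as the authors' method, is a 2×2 Pearson chi-square comparing internal-stable attributions under the positive profile with those under the other two profiles. Table 5 reports percentages only, so the cell counts below are reconstructed from those percentages and the total N of 219; the code checks that they reproduce the reported figures.

```python
import math

# Counts reconstructed from Table 5 percentages (an assumption: the source
# reports percentages only). Column order: internal-stable, external-stable,
# internal-unstable, external-unstable.
counts = {
    "name_only": [10, 16, 38, 10],  # n = 74
    "positive": [18, 14, 29, 12],   # n = 73
    "negative": [9, 17, 35, 11],    # n = 72
}
reported = {
    "name_only": [13.51, 21.62, 51.35, 13.51],
    "positive": [24.66, 19.18, 39.73, 16.44],
    "negative": [12.50, 23.61, 48.61, 15.28],
}

# Sanity check: reconstructed counts reproduce the reported percentages
# to within two-decimal rounding.
for profile, row in counts.items():
    n_row = sum(row)
    for count, pct in zip(row, reported[profile]):
        assert abs(100 * count / n_row - pct) < 0.006

# 2x2 table: positive profile vs. other profiles,
# internal-stable attribution vs. any other attribution.
a = counts["positive"][0]
b = sum(counts["positive"]) - a
c = counts["name_only"][0] + counts["negative"][0]
d = sum(counts["name_only"]) + sum(counts["negative"]) - c
n = a + b + c + d

# Pearson chi-square for a 2x2 table, without continuity correction.
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# With one degree of freedom, the upper-tail probability has a closed form.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

Under these assumptions the statistic is about 4.70, above the 3.84 critical value for one degree of freedom at the 0.05 level. A different test or a continuity correction would give a different value, so this is a consistency check on the reported pattern, not a replication.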

Standards-based grading has been touted as providing useful feedback to students on their growth as writers while also minimizing bias, such as attribution error (Muñoz & Guskey, 2015). In this study, roughly half of the participants (46.58%) provided feedback attributing student ability to internal, unstable factors (such as malleable skill), while the remainder ascribed students' performance to the other three factors (inherent ability, external fixed standard, or circumstantial factors). Regardless of the student's name (Jake or Jamal), the largest share of the feedback attributed performance success or failure to internal, unstable reasons, implying that students could enhance their skills through further practice. This type of attribution was the most prevalent in the results, which aligns with the educational goal of nurturing students' abilities and skills and with the belief that students can improve their writing.

While less prominent, other findings showed that a teacher's preconceived expectations of their students could influence the implementation of standards-based grading, as some teachers’ feedback seemed to be related to having read a positive profile rather than a negative or name-only profile. This could lead to distortions such as over-consistency (past performances influence the current evaluation) and halo effects (good impressions lead to increased scores), which are particularly relevant here because participants in this study were given a positive, neutral, or negative profile of Jake or Jamal.

Importantly, the study found no evidence of racial bias, with all Jake and Jamal profiles receiving similar scores and feedback. This finding is not in line with the results of many audit studies, which found racial bias triggered by names (Gaddis, 2019; Ghoshal, 2018; Neumark, 2012). In educational research on racial bias specifically, some studies have found limited or inconclusive effects on student grades (Quinn, 2020), while several others suggest its presence (Bergh et al., 2010; Jacoby-Senghor et al., 2016).

Based on our research findings, it is crucial to emphasize that disparities in factors such as race, student attitudes, pre-performance information, expectations, and backgrounds can lead teachers to make incorrect assumptions about the underlying reasons for a student's performance (Daneshzadeh & Sirrakos, 2018; Riegle-Crumb & Grodsky, 2010). Importantly, a teacher's interpretation of a student's performance, whether viewed as stemming from internal or external factors, has the potential to shape the teacher's future expectations for that student, potentially influencing a shift in grading standards (Peterson, et al., 2016). For example, if a student's essay does not align with a teacher's preconceived notions about their learning capabilities and attitudes, the teacher may be inclined to seek evidence that either supports this narrative or explains its deviation.

Furthermore, attribution errors can occur when a teacher attributes a student's low exam score to a lack of interest (internal and unstable) rather than the difficulty of the exam questions (external and stable). Similarly, in the case of high exam scores, a teacher might erroneously attribute a student's success to perceived mastery (internal, stable), easy exam questions (external, stable), or even luck (external, unstable).

Adding to the complexity, our research should be read alongside the broader body of literature on race and attribution error, which indicates that White teachers are more inclined to attribute academic shortcomings of White students to situational factors (external), such as a challenging test, while attributing the failures of Black and Latinx students to personal factors (internal), like laziness or lack of interest (Lewis & Diamond, 2015; Schmalor et al., 2021; Steele, 2018).

Future Research: Can attribution error be minimized?

When considering a change to a grading system, educators often scrutinize the grade book format and any additional time needed to assess students against course standards (Berns, 2020; O'Connor et al., 2018; Wormeli, 2020). Although these are essential considerations, we believe that educators often overlook the efficacy of teacher judgment. No grading model can completely mitigate the effects of human bias, but preparing teachers to be aware of these biases might help address the issue significantly. Thus, future research investigating the impact of cognitive bias on standards-based grading practices is needed.

Even with the need for future research, these preliminary findings suggest that standards-based grading does not eliminate all bias; schools need to spend time working with teachers on the issue of bias in grading. Reducing biases, including attribution errors, requires a multifaceted approach that involves self-awareness, education, and strategic measures. To begin with, it is important to encourage teachers to engage in self-reflection regarding their assessments and to question whether they may be attributing student performances to personal traits or external factors without substantial evidence. By acknowledging the potential for falling into the attribution error trap, educators can take the initial step in mitigating its impact (Cooper & Burger, 1980; Hattie, 2012, 2023; Michael et al., 2016).

According to Kahneman et al. (2002, 2021), psychological biases are universal and produce variability in judgments because of individual differences such as behaviors, experiences, values, and backgrounds. Their studies claim that anything that reduces psychological biases can improve judgment. Individuals possess values, beliefs, and assumptions that aid in interpreting their experiences and navigating their interactions with others, ideally steering them towards positive and constructive pathways. However, intertwining these values with social interactions can engender implicit biases that skew their decisions, viewpoints, and actions. It is imperative to recognize, observe, and control these distortions to attain the objective of equity. The same is true in our classrooms, where pedagogy, evaluation, and values interact in myriad ways. We often see this interplay in the grading process, and many see standards-based grading as the way to manage this interaction reliably.

While standards-based grading did not eliminate bias in this research study, we still consider it an important type of grading that can support students in strengthening their writing. As schools invest time and resources in implementing standards-based grading, it is important to consider how heuristics and mental models related to race, student past performance, and other factors influence the grading process. If teachers continue to be inclined to prioritize internal aspects, perceived motivations, assumed attitudes, or personalities to assess students’ work (Dobelli, 2013; Wilson, 2011), students’ growth as writers will likely be constrained.

Author Contribution

All authors whose names appear on the submission made substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data; or the creation of new software used in the work; drafted the work or revised it critically for important intellectual content; approved the version to be published; and agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Acknowledgement

The authors would like to acknowledge Dr. Jennifer Nelson for her invaluable counsel on the methodology and data analysis of this study.

Data Availability

The authors confirm that all data generated or analyzed during this study are included in this published article. In addition, the primary and secondary sources and the data shown in this study's findings were all publicly available at the time of submission.

References

Author. (2022).
Babad, E. Y., Inbar, J., & Rosenthal, R. (1982). Pygmalion, Galatea, and the Golem: Investigations of biased and unbiased teachers. Journal of Educational Psychology, 74(4), 459.
Bandura, A. (1997). Self-efficacy: The exercise of control. W H Freeman/Times Books/ Henry Holt & Co.
Bandura, A. (2023). Social Cognitive Theory: An Agentic Perspective on Human Nature. John Wiley & Sons.
Beckman, L., & Rodriguez, N. (2021). Race, Ethnicity, and Official Perceptions in the Juvenile Justice System: Extending the Role of Negative Attributional Stereotypes. Criminal Justice and Behavior, 00938548211004672.
Bertrand, M., & Marsh, J. A. (2015). Teachers' sensemaking of data and implications for equity. American Educational Research Journal, 52(5), 861–893.
Bodenhausen, G. V., & Richeson, J. A. (2010). Prejudice, stereotyping, and discrimination.
Brookhart, S. M., & Chen, F. (2015). The quality and effectiveness of descriptive rubrics. Educational Review, 67(3), 343–368.
Brookhart, S. M., Guskey, T. R., Bowers, A. J., McMillan, J. H., Smith, J. K., Smith, L. F., … Welsh, M. E. (2016). A century of grading research: Meaning and value in the most common educational measure. Review of Educational Research, 86(4), 803–848.
Carson, J. E. (2019). External relational attributions: Attributing cause to others' relationships. Journal of Organizational Behavior, 40(5), 541–553. https://doi.org/10.1002/job.2360
Cohen, G. L. (2022). Belonging: The science of creating connection and bridging divides. WW Norton & Company.
Conaway, W., & Bethune, S. (2015). Implicit Bias and First Name Stereotypes: What Are the Implications for Online Instruction?. Online Learning, 19(3), 162–178.
Cooper, H. M., & Burger, J. M. (1980). How teachers explain students’ academic performance: A categorizing free response academic attributions. American Educational Research Journal, 17(1), 95–109.
Daneshzadeh, A., & Sirrakos, G. (2018). Restorative justice as a double-edged sword: Conflating restoration of black youth with the transformation of schools. Taboo: The Journal of Culture and Education, 17(4), 2.
DeCastro-Ambrosetti, D., & Cho, G. (2005). Synergism in learning: A critical reflection of authentic assessment. The High School Journal, 89(1), 57–62.
Dee, T. S. (2004). The race connection: Are teachers more effective with students who share their ethnicity?. Education Next, 4(2), 52–60.
Dobelli, R. (2013). The art of thinking clearly: better thinking, better decisions. Hachette UK.
Dweck, C. S. (2018). Reflections on the legacy of attribution theory. Motivation Science, 4(1), 17–18.
Fazio, R. H. (2001). On the automatic activation of associated evaluations: An overview. Cognition & Emotion, 15(2), 115–141.
Feldman, J. (2019). Beyond standards-based grading: Why equity must be part of grading reform. Phi Delta Kappan, 100(8), 52–55.
Ferman, B., & Fontes, L. F. (2021). Discriminating Behavior: Evidence of Teachers' Grading Bias. Available at SSRN 3797725.
Fiske, S. T. (1998). Stereotyping, prejudice, and discrimination.
Fiske, S. T. (2015). Intergroup biases: A focus on stereotype content. Current Opinion in Behavioral Sciences, 3(April), 45–50.
Gaddis, S. M. (2017a). How Black are Lakisha and Jamal? Racial perceptions from names used in correspondence audit studies. Sociological Science, 4, 469–489. https://doi.org/10.15195/v4.a19
Gaddis, S. M. (2019). Understanding the “how” and “why” aspects of racial-ethnic discrimination: A multimethod approach to audit studies. Sociology of Race and Ethnicity, 5(4), 443–455.
Gilovich, T., Griffin, D., & Kahneman, D. (Eds.). (2002). Heuristics and biases: The psychology of intuitive judgment. Cambridge University Press.
Graham, S. (2020). An attributional theory of motivation. Contemporary Educational Psychology, 61, 101861.
Greenwald, A. G., & Farnham, S. D. (2000). Using the implicit association test to measure self-esteem and self-concept. Journal of Personality and Social Psychology, 79(6), 1022.
Guskey, T. R. (2014). On your mark: Challenging the conventions of grading and reporting. Solution Tree Press.
Harvey, P., Madison, K., Martinko, M., Crook, T. R., & Crook, T. A. (2014). Attribution theory in the organizational sciences: The road traveled and the path ahead. Academy of Management Perspectives, 28(2), 128–146.
Hasan, A. A. A. (2022). Effect of Rubric-Based Feedback on the Writing Skills of High School Graders. Journal of Innovation in Educational and Cultural Research, 3(1), 49–58.
Hattie, J. (2012). Visible learning for teachers: Maximizing impact on learning. Routledge.
Hattie, J. (2023). Visible learning: The sequel: A synthesis of over 2,100 meta-analyses relating to achievement. Taylor & Francis.
Heider, F. (1958). The naive analysis of action.
Herring, D. R., White, K. R., Jabeen, L. N., Hinojos, M., Terrazas, G., Reyes, S. M., … Crites Jr, S. L. (2013). On the automatic activation of attitudes: a quarter century of evaluative priming research. Psychological Bulletin, 139(5), 1062.
Hollyforde, S., & Whiddett, S. (2002). The Motivation Handbook. Cromwell Press, Trowbridge, Wiltshire, UK.
Jacoby-Senghor, D. S., Sinclair, S., & Shelton, J. N. (2016). A lesson in bias: The relationship between implicit racial bias and performance in pedagogical contexts. Journal of Experimental Social Psychology, 63, 50–55.
Jussim, L. (1989). Teacher expectations: Self-fulfilling prophecies, perceptual biases, and accuracy. Journal of Personality and Social Psychology, 57(3), 469.
Jussim, L., Careem, A., Honeycutt, N., & Stevens, S. T. (2020). Do IAT Scores Explain Racial Inequality?. In Applications of Social Psychology (pp. 312–333). Routledge.
Kahneman, D., Rosenfield, A. M., Gandhi, L., & Blaser, T. (2016). Noise: How to overcome the high, hidden cost of inconsistent decision making. Harvard Business Review, 94(10), 38–46.
Kahneman, D., Sibony, O., & Sunstein, C. R. (2021). Noise: a flaw in human judgment. Little, Brown.
Kelley, H. H. (1967). Attribution theory in social psychology. In Nebraska symposium on motivation. University of Nebraska Press.
Knight, M., & Cooper, R. (2019). Taking on a new grading system: The interconnected effects of standards-based grading on teaching, learning, assessment, and student behavior. NASSP Bulletin, 103(1), 65–92.
Koppehele-Gossel, J., Hoffmann, L., Banse, R., & Gawronski, B. (2020). Evaluative priming as an implicit measure of evaluation: An examination of outlier-treatments for evaluative priming scores. Journal of Experimental Social Psychology, 87, 103905.
Leonardo, Z., & Boas, E. (2021). Other kids' teachers: What children of color learn from White women and what this says about race, whiteness, and gender. In Handbook of critical race theory in education (pp. 153–165). Routledge.
Levitt, S. D., & Dubner, S. J. (2014). Freakonomics. B de Bolsillo.
Lewis, A. E., & Diamond, J. B. (2015). Despite the best intentions: How racial inequality thrives in good schools. Oxford University Press.
Link, L. J., & Guskey, T. R. (2022). Is standards-based grading effective?. Theory Into Practice, 61(4), 406–417.
Maryfield, B. (2018). Implicit racial bias. Justice Research and Statistics Association, 1–10. McConaughey, 1986
Marzano, R. J. (2011). Formative assessment & standards-based grading. Solution Tree Press.
Michael, R. D., Webster, C., Patterson, D., Laguna, P., & Sherman, C. (2016). Standards-based assessment, grading, and professional development of California middle school physical education teachers. Journal of Teaching in Physical Education, 35(3), 277–283.
Muñoz, M. A., & Guskey, T. R. (2015). Standards-based grading and reporting will improve education. Phi Delta Kappan, 96(7), 64–68.
Paslay, C. (2021). Exploring White Fragility: Debating the Effects of Whiteness Studies on America's Schools. Rowman & Littlefield Publishers.
Preston, J. (2007). Whiteness and class in education. Springer Science & Business Media.
Rogers, L. O., Rosario, R. J., & Cielto, J. (2020). The role of stereotypes: Racial identity and learning. In Handbook of the cultural foundations of learning (pp. 62–78). Routledge.
Sauer, C. G., Auspurg, K., & Hinz, T. (2020). Designing multi-factorial survey experiments: Effects of presentation style (text or table), answering scales, and vignette order.
Schmalor, A., Cheung, B. Y., & Heine, S. J. (2021). Exploring people’s thoughts about the causes of ethnic stereotypes. PLoS ONE, 16(1), e0245517.
Spencer, S. J., Logel, C., & Davies, P. G. (2016). Stereotype threat. Annual Review of Psychology, 67, 415–437.
Staats, C. (2016). Understanding implicit bias: What educators should know. American Educator, 39(4), 29.
Starck, J. G., Riddle, T., Sinclair, S., & Warikoo, N. (2020). Teachers are people too: Examining the racial bias of teachers compared to other American adults. Educational Researcher, 49(4), 273–284.
Tenenbaum, H. R., & Ruck, M. D. (2007). Are teachers' expectations different for racial minorities than for European American students? A meta-analysis. Journal of Educational Psychology, 99(2), 253.
Townsley, M., & Wear, N. L. (2020). Making Grades Matter: Standards-Based Grading in a Secondary PLC at Work®. Solution Tree. 555 North Morton Street, Bloomington, IN 47404.
UCLA Advanced Research Computing, CHOOSING THE CORRECT STATISTICAL TEST IN SAS, STATA, SPSS AND R https://stats.oarc.ucla.edu/other/mult-pkg/whatstat/. Accessed 6/15/2022.
Van Ewijk, R. (2011). Same work, lower grade? Student ethnicity and teachers’ subjective assessments. Economics of Education Review, 30(5), 1045–1058.
Warikoo, N., Sinclair, S., Fei, J., & Jacoby-Senghor, D. (2016). Examining racial bias in education: A new approach. Educational Researcher, 45(9), 508–514.
Weiner, B. (1972). Attribution theory, achievement motivation, and the educational process. Review of Educational Research, 42(2), 203–215.
Weiner, B. (Ed.). (1974). Achievement motivation and attribution theory. General Learning Press.
Weiner, B. (2004). Attribution theory and organizational psychology. Attribution theory in the organizational sciences: Theoretical and empirical contributions, 1.
Weiner, B. (2012). An attribution theory of motivation. Handbook of theories of social psychology, 1, 135–155.
Weiner, B. (2018). Attribution theory in organizational behavior: A relationship of mutual benefit. In Attribution Theory (pp. 3–6). Routledge.
Williams, J. K. (2008). Unspoken realities: White, female teachers discuss race, students, and achievement in the context of teaching in a majority Black elementary school. Oregon State University.
Wood, D., & Graham, S. (2010). Why race matters: Social context and achievement motivation in African American youth. In T. C. Urdan & S. A. Karabenick (Eds.), The decade ahead: Applications and contexts of motivation and achievement (Advances in Motivation and Achievement, Vol. 16, Part B, pp. 175–209). Emerald Group Publishing Limited. https://doi.org/10.1108/S0749-7423(2010)000016B009
Zhan, X., Yu, X., Peng, C., & Xi, J. (2021, March). Interpreting Teacher Expectation in Teacher Feedback and Finding the Relationship with Student Learning. In Society for Information Technology & Teacher Education International Conference (pp. 991–998). Association for the Advancement of Computing in Education (AACE).

Table 1 - Sample Descriptives

	Full Sample Mean (SD)	Jake Mean (SD)	Jamal Mean (SD)
Respondent Characteristics
Male	.48 (.50)	.41 (.49)	.54 (.50)
Rubric Experience	.53 (.50)	.49 (.50)	.56 (.50)
Age Range
18-25 years old	.34 (.48)	.28 (.45)	.40 (.49)
26-35 years old	.42 (.49)	.44 (.50)	.39 (.49)
36-50 years old	.20 (.40)	.21 (.41)	.18 (.38)
51+ years old	.05 (.21)	.07 (.26)	.02 (.13)
Subject Taught¹
ELA	.27 (.44)	.31 (.46)	.23 (.42)
Math	.15 (.35)	.09 (.29)	.21 (.41)
Science	.14 (.34)	.15 (.36)	.12 (.33)
Social Studies	.08 (.27)	.07 (.26)	.08 (.28)
Other	.30 (.46)	.30 (.46)	.30 (.46)
Outcome Variables
Mastery Level (range 1-4)	2.63 (.77)	2.65 (.67)	2.60 (.86)
Mastery Level: Name Only	2.52 (.69)	2.57 (.64)	2.47 (.74)
Mastery Level: Positive Profile	2.93 (.80)	2.91 (.64)	2.94 (.95)
Mastery Level: Negative Profile	2.43 (.73)	2.45 (.66)	2.40 (.78)
Criteria Scores
All Criteria	.46 (.50)	.46 (.50)	.46 (.50)
Criterion 1 – Thesis	.60 (.49)	.61 (.49)	.58 (.49)
Criterion 2 – Evidence	.51 (.50)	.51 (.50)	.51 (.50)
Criterion 3 – Sophistication	.28 (.45)	.26 (.44)	.29 (.46)
N	219	110	109

Notes. ¹ Subject taught does not add up to 100 due to an N/A response category (e.g., these respondents may teach in an elementary school).

Table 2 - Common Attributions

Weiner’s Attributions (adapted from Weiner 1974)

	Stable	Unstable
Internal	Fixed Ability, Inherent Skill, Typical Effort. Teacher feedback: refers to static characteristics, fixed ability, and inherent skill. (“You are a natural writer.”)	Malleable Ability, Mood, Emotion. Teacher feedback: refers to malleable ability and growth. (“You can be a good writer.”)
External	Rigor and Difficulty of Task/Standard. Teacher feedback in the study: refers primarily to a standard of writing. (“There is a standard for writing that you must achieve.”)	Occasion Noise (Kahneman et al., 2021), Luck. Refers to uncontrollable, varied external factors. (“It wasn’t your day.”)

Table 3: Profile Type by Implied Race and Attribution in Feedback

	Internal, Stable	External, Stable	Internal, Unstable	External, Unstable
	Fixed Ability	Rigor of Task/Standard	Malleable Ability	Luck
Overall	16.89%	21.46%	46.58%	15.07%
White Implied Name (Jake): All Profile Types	13.64%	21.82%	44.55%	20.00%
Black Implied Name (Jamal): All Profile Types	21.10%	20.18%	48.62%	10.09%
White Implied Name (Jake): Name Only	13.16%	21.05%	47.37%	18.42%
Black Implied Name (Jamal): Name Only	13.89%	22.22%	55.56%	8.33%
White Implied Name (Jake): Positive	18.92%	27.03%	32.43%	21.62%
Black Implied Name (Jamal): Positive	30.56%	11.11%	47.22%	11.11%
White Implied Name (Jake): Negative	8.57%	17.14%	54.29%	20.00%
Black Implied Name (Jamal): Negative	16.22%	29.73%	43.24%	10.81%

Note: Bolded number = highest percentage; red and italics = second highest percentage of scores.

Table 4 - Attribution Error and Mastery Scores

Attribution error by mastery scores that teachers assigned alongside their open-ended feedback to the student. Similar patterns found with criteria scores.

	Internal, Stable	External, Stable	Internal, Unstable	External, Unstable
Weiner Attribution	Fixed Ability	Rigor of Task/Standard	Malleable Ability	Luck, Occasion
Example Feedback	Writing is a skill some people just have. You have it or you don't.	Writing includes…[these fixed standards of quality]	Writing takes work; keep trying.	It was just one of those days.
Attribution By Mastery Scores - Overall	16.89%	21.46%	46.58%	15.07%
Mastery Score of 1: No Signs of Mastery	0.00%	38.46%	46.15%	15.38%
Mastery Score of 2: Approaching Mastery	12.35%	28.40%	43.21%	16.05%
Mastery Score of 3: Mastery	18.18%	16.16%	49.49%	16.16%
Mastery Score of 4: Exceeds Mastery	34.62%	11.54%	46.15%	7.69%
Mastery Scores of 3 and 4: Mastery	21.60%	15.20%	48.80%	14.40%
Mastery Scores of 1 and 2: Non-Mastery	10.64%	29.79%	43.62%	15.96%

Note: Bolded number = highest percentage; red and italics = second highest percentage of scores.

Table 5 - Profile Type, Attribution, and Feedback

	Internal, Stable	External, Stable	Internal, Unstable	External, Unstable
	Fixed Ability	Rigor of Task/Standard	Malleable Ability	Luck
Name Only	13.51%	21.62%	51.35%	13.51%
Profile: Positive	24.66%	19.18%	39.73%	16.44%
Profile: Negative	12.50%	23.61%	48.61%	15.28%

Note: Bolded number = highest percentage; red and italics = second highest percentage of scores.

Table 6 - Profile Type, Attribution, and Feedback

Unknown Ability (Name Only Profile)
	Stable	Unstable
External	21.62%	13.51%
Internal	13.51%	51.35%

Implied High Performing Student (Positive Profile)
	Stable	Unstable
External	19.18%	16.44%
Internal	24.66%	39.73%

Implied Low Performing Student (Negative Profile)
	Stable	Unstable
External	23.61%	15.28%
Internal	12.50%	48.61%

Note: Bolded number = highest percentage; red and italics = second highest percentage of scores.

Table 7 - Examples of Feedback and Attribution Coding

The following examples are included to highlight the coding process in this study.

Internal Stable (Fixed Ability)

External Stable (Task/Standard Difficulty)

External Unstable (Occasion Noise: Luck, Situation Circumstances)

Internal Unstable (Effort, Growth)

“I would try to calm the student down and then explain that he doesn't have to be nervous about the next essay. I would point out that he has sufficient skills plus I would give him a few extra tips based on the last easy to make the next essay eould be even better.”

“Have a structure in your essay, something like this:

- topic sentence

- introductory sentence

- argument or counterargument no 1 followed by details and examples

same for argument no 2 or 3

conclusion, by summarising + offering concluding remarks and reinforcing your position

tips: do not write an argument without offering details, it adds 0 value”

“You don't have to be afraid of the next essay. Try to get to grips with the subject matter and give valid arguments for and against it.”

“It is part of life to be nervous. You need to believe in yourself. If you worked hard you will get a good grade!”

“Try to research your topic better and find more convincing reasoning. Summarize your ideas in a conclusion to once again support your stance.”

No competing interests reported.

Analysis

Data were analyzed using Qualtrics (2022). We first discuss analysis of the scores and then analysis of the teacher feedback.
