Do Fairly-Decided Maltreatment Determinations Significantly Reduce Recidivism? A Quasi-Experimental Evaluation of a System-Level Intervention Implementation

Two studies examined the impact of the implementation of the Field-tested Assessment, Intervention-planning, and Response (FAIR) system, a system-level intervention for determining whether allegations of family maltreatment meet threshold for abuse or neglect, on alleged recidivism. Data were collected at the 10 U.S. Army installations with the largest family maltreatment caseloads. Participants were family members who had an allegation of family maltreatment (i.e., child maltreatment or partner abuse) during one of the two study periods. Data were collected when Family Advocacy Program staff used the then-in-place system (Case Review Committee) and later the FAIR system. In Study 1, cases were followed for 6 months following the initial maltreatment allegation to measure the occurrence of subsequent allegations of any type. Additionally, at five installations, alleged victims of partner abuse were recruited into a study (Study 2) in which they anonymously reported on intimate partner violence via telephone. In Study 1, the advantage for the FAIR condition was concentrated in cases with unsubstantiated initial determinations; the mean relative risk reduction for recidivism was 0.48. In Study 2, FAIR extended median time to recidivism by approximately 170%. These results replicate and extend earlier findings that employing the FAIR system can result in decreased family maltreatment re-offense.

where the criteria and procedures were refined. These studies established that the system worked well in real-world conditions (e.g., agreement between local decisions and master reviewers >90%), even when disseminated broadly with non-volunteer sites. In addition, FAIR resulted in substantially lower rates of re-offense (Snarr et al., 2011). On the basis of these findings, FAIR's criteria were adapted for the International Classification of Diseases, 11th Edition (ICD-11; World Health Organization, 2021) and the Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5; American Psychiatric Association, 2013). The FAIR system has also been adopted by Alaska's child welfare system.
FAIR's results regarding reliability and consistency in decision making are unique in healthcare and child welfare. First, most behavioral disorder determinations have poor inter-rater agreement (Rettew et al., 2009). In contrast, FAIR resulted in 90% agreement with master reviewers. Second, determinations under FAIR are perceived as fair to alleged offenders and victims by stakeholders (Heyman et al., 2010). Third, FAIR shows evidence of negligible systematic bias (Heyman et al., 2016). Finally, compared with the military's previous system, FAIR cut 1-year re-offense rates for those who met criteria in half (Snarr et al., 2011), suggesting a tertiary preventive effect of a fairer and more consistent system.
The preventative effect of the FAIR system is perhaps the most important evidence supporting its further dissemination. This effect was found in a quasi-experimental study in the Air Force that compared cases nested within sites during the final window of use of the prior system with cases nested within sites during a comparable window after the switch to FAIR (Snarr et al., 2011). Although not directly testable, we interpret the preventative impact as a function of all stakeholders perceiving the FAIR system to result in fairer determinations. We believe that this, along with the clarity of the definitional criteria in FAIR, made it easier for commanders and others to present the behavioral standards clearly to service members and families and to send more consistent messages about the need not to cross the line that the definitions draw. This multiparty clarity of communication is a key element of the coordinated community response model employed by the military. However, because only official allegations were examined in that study (Snarr et al., 2011), it cannot be ruled out that another factor, such as decreased reporting of maltreatment to authorities, contributed to the reduction in re-offense.
Study 1 replicated and extended the evaluation of recidivism in the U.S. Air Force by Snarr et al. (2011). The study was conducted in the U.S. Army, where the definitional criteria used by FAIR had been implemented; however, FAIR's recommended training, structured assessments, and reorganized committee determination had not been adopted. In this study, we compared the Army's existing system (Case Review Committee [CRC]) with the fully implemented FAIR system at 10 large Army installations. We hypothesized that cases processed through the fully implemented FAIR system, compared with those processed through the CRC system, would have lower rates of alleged re-offense. Following Snarr et al. (2011), we further explored the moderation of these effects by victim age and substantiation status of the initial allegation. In Study 1, recidivism was based solely on official allegations. To address this limitation, in Study 2 we supplemented initial official partner abuse records with anonymous victim reports of subsequent intimate partner physical, psychological, and sexual abuse.

Study 1
Permission to collect data was obtained from the Army's Human Research Protection Office, and New York University Institutional Review Board approved the study protocol.

Method
Participants. There were 14,611 FAP incident records during the study period from 10 Army installations with the largest FAP caseloads. FAP cases include parent-child and partner physical, emotional, and sexual abuse, as well as child neglect. All Army sites directed by Army leadership to participate in the research project (and full FAIR implementation) did so. Records randomly selected for coding by research staff (8,231 records nested in 4,673 families) were used for the present analysis; all records that fell in the designated time frame were randomly ordered using a random number generator, and records were coded in that order as long as the study period permitted.
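The random ordering of records described above can be illustrated with a minimal Python sketch. The function names, the `capacity` parameter (standing in for however many records the study period allowed to be coded), and the seed are hypothetical; the source does not state which random number generator or software was used.

```python
import random


def coding_order(record_ids, seed=None):
    """Return the record IDs in a random order (hypothetical helper)."""
    rng = random.Random(seed)   # seeded generator so the order is reproducible
    order = list(record_ids)    # copy so the caller's list is left untouched
    rng.shuffle(order)
    return order


def records_coded(record_ids, capacity, seed=None):
    """Code records in random order until the available capacity is used up."""
    return coding_order(record_ids, seed)[:capacity]


# Illustrative IDs only; these are not study data.
example_ids = list(range(1, 21))
print(records_coded(example_ids, capacity=8, seed=1))
```

Because every record in a phase has the same chance of appearing early in the ordering, the coded subset is a simple random sample of that phase's records regardless of where coding stops.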
Design. Using a quasi-experimental design, we compared recidivism in the CRC (5,143 incidents between November 2016 and August 2017) versus FAIR (3,088 incidents between October 2017 and July 2018) implementation phases. In the CRC phase, maltreatment decisions were made via the Case Review Committee, the system in place in the Army at the outset of the study. In the CRC, assessing social workers presented their own cases, commanders did not vote, and the committee members outside of the FAP were not trained in the definitions. Discussion included incident-related information and treatment recommendations. Additional details can be found in Heyman et al. (2022). In the FAIR phase, maltreatment decisions were made via the updated committee structure and procedures, in meetings that separated substantiation decisions from clinical treatment discussions. The committee was chaired by an installation leader; a supervising social worker presented cases after review with the assessing social worker (who was not present); all members were trained in the definitions and procedures and passed a quiz certifying mastery; and each involved Soldier's commander was trained and voted along with the rest of the committee. At all Army installations, CRC implementation immediately preceded FAIR implementation. Because 6 months had to pass from the meeting date to allow for recidivism to occur, and because of the study time frame cutoff, the CRC phase allowed more records to be reviewed. Because records were reviewed in random order in each phase, this difference in the number of records does not introduce bias.
Measures. All data were drawn from official FAP records. To protect anonymity, a very limited set of data lacking characteristics (e.g., race/ethnicity) that might, in combination, inadvertently identify individuals was provided to the research team.
Alleged initial maltreatment and recidivism. The presence/absence of parent-child and partner physical, emotional, and sexual abuse, as well as child neglect, was judged from official allegations received by FAP personnel. Upon receiving an initial allegation, FAP intake workers categorize it into one of the above categories using domestic abuse definitions as specified in the Decision Tree Algorithm (Department of Defense, 2016). To operationalize recidivism, we coded whether new allegations of any type were recorded in the 6 months following the initial report (yes/no; 1/0). The 6-month follow-up period was chosen as the longest period possible within the practical constraints of the study length.
Initial substantiation determination. Each initial allegation in the study period was categorized as meeting (1) versus not meeting (−1) criteria for maltreatment.
Military status. The alleged perpetrators' and victims' military statuses were coded as service member (1) or civilian (−1).

Analytic Method
Regression models were estimated using Mplus version 8 (Muthén & Muthén, 2017). The nesting of participants within families was handled via the robust pseudo maximum likelihood estimation method ("type = complex" in the Mplus analysis specifications), which utilizes a sandwich estimator to adjust parameters' SEs for the non-independence of clustered observations (Asparouhov & Muthén, 2005).
We first evaluated differences in the CRC and FAIR phases to screen for potential confounds on available incident characteristics: offender and victim gender and military status (service member vs. civilian), allegation substantiation, maltreatment severity, maltreatment victim age group (child vs. adult partner), and maltreatment type (emotional abuse, physical abuse, sexual abuse, and neglect). In each case, nine effects coded (i.e., 1 vs. −1) dummy variables for installation (reference installation is the 10th) were included as covariates. These analyses are summarized briefly in the "Results" section and more fully presented in the Supplemental Material to this article.
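As a minimal sketch of the installation coding described above, the following Python function builds nine indicators coded 1 versus −1, with the 10th installation as the reference. The function name and the integer labels 1 to 10 are illustrative assumptions. The models themselves were fitted in Mplus, and the exact contrast coding used there may differ (classical effects coding, for example, assigns 0 rather than −1 to non-members of non-reference groups).

```python
def installation_codes(installation, n_installations=10):
    """Return n_installations - 1 indicators coded 1 (member) or -1 (non-member).

    Hypothetical helper: installations are assumed to be labeled 1..n, and the
    last one is the reference, so it receives -1 on every indicator.
    """
    if not 1 <= installation <= n_installations:
        raise ValueError("installation label out of range")
    return [1 if installation == k else -1 for k in range(1, n_installations)]


# Phase follows the same 1 / -1 scheme in the text: -1 = CRC, 1 = FAIR.
PHASE_CODES = {"CRC": -1, "FAIR": 1}

print(installation_codes(3))   # 1 in the third position, -1 elsewhere
print(installation_codes(10))  # reference installation: -1 in every position
```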
To test the main effect for phase, we evaluated a logistic regression model in which alleged recidivism (1 = present; 0 = absent) was regressed on effects coded variables for phase (−1 = CRC; 1 = FAIR), as well as covariates representing installation and the two variables identified in the confound screening (maltreatment type and victim age group).
To test the Phase × Victim Age Group × Allegation Substantiation interaction, we evaluated a logistic regression model that included all lower order main effects, as well as two-way and three-way interaction terms among phase, victim age group, and allegation substantiation. The covariate set described for the main effect model was also included in this model. The interaction was probed graphically and simple slopes (SS) were calculated per Preacher et al. (2006).
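The simple slopes for phase can be written out explicitly. The following is a sketch under assumed notation, none of which appears in the original: P is phase, A is victim age group, and S is allegation substantiation (each coded −1 or 1), the β terms are the corresponding logistic regression coefficients, and the covariates are collected in the vector c with coefficients γ.

```latex
\begin{align*}
\operatorname{logit}\,\Pr(Y = 1)
  &= \beta_0 + \beta_P P + \beta_A A + \beta_S S
   + \beta_{PA} PA + \beta_{PS} PS + \beta_{AS} AS
   + \beta_{PAS} PAS + \mathbf{c}^{\top}\boldsymbol{\gamma} \\[4pt]
\mathrm{SS}_P(A, S)
  &= \frac{\partial \operatorname{logit}\,\Pr(Y = 1)}{\partial P}
   = \beta_P + \beta_{PA} A + \beta_{PS} S + \beta_{PAS} A S
\end{align*}
```

Evaluating the second expression at the four combinations of A and S in {−1, 1} gives one simple slope per victim age group and substantiation status. Under this coding, the FAIR versus CRC difference in log odds within a cell is twice the simple slope, because P moves from −1 to 1.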
Due to missing data, Ns varied slightly from analysis to analysis, ranging from 8,192 to 8,213 incidents nested in 4,657 to 4,666 families.

Results
Descriptive statistics are presented in Table 1 for the incident characteristics and in Table 2 for the recidivism rates by initial substantiation decision per condition. The sample-wide recidivism rate was 10.9%.
Of the variables evaluated as potential baseline confounds, there were significant differences by phase for maltreatment victim age group and emotional abuse (Table 3; Supplemental Tables S3 and S4). Compared to the CRC phase, there were proportionately more adult partner than child victims and an overrepresentation of emotional abuse incidents in the FAIR phase. Accordingly, variables representing installation (see "Analytic Method") were supplemented with effects coded variables for age category (1 = adult; −1 = child) and maltreatment type (emotional, physical, and sexual abuse; reference category is neglect; 1 = present, −1 = absent) as covariates in the following tests of main and interactive effects.
Note. (see Supplemental Tables S2-S5); FAIR = Field-tested Assessment, Intervention-planning, and Response; CRC = Case Review Committee. a 1 = FAIR, −1 = CRC; b 1 = male, 0 = female; c 1 = service member, 0 = civilian; d 1 = substantiated, 0 = unsubstantiated; e 0 = did not meet criteria, 1 = mild, 2 = moderate, and 4 = severe; f 1 = child, 0 = adult partner; g 1 = present, 0 = absent.
The main effect of phase was statistically significant per the results of logistic regression (Table 4). Based on this model, the predicted probabilities of alleged recidivism were lower in the FAIR (.072) than the CRC (.097) phase.
The Phase × Victim Age Group × Allegation Substantiation interaction was significant as well (Table 5; Figure 1). SS analyses showed that the lower recidivism rate in the FAIR, relative to the CRC, phase was observed for both child (

Study 2
Permission to collect data was obtained from the Army's Human Research Protection Office, and New York University Institutional Review Board approved the study protocol.

Methods
Participants. FAP social workers identified Soldiers or partners (N = 88) with allegations indicating victimization by intimate partner physical, psychological, or sexual abuse, referring these individuals to an on-site researcher. The Study 2 participants were distinct from those of Study 1.
Design. Utilizing a quasi-experimental design, partner abuse recidivism was compared in the CRC versus FAIR implementation phases (see Study 1 description). Four of the sites in the full study were assigned to this protocol. Of the 88 participants, n = 46 were recruited during the CRC phase; n = 42 were recruited during the FAIR phase.
Measures. Alleged partner abuse recidivism was assessed via victim-report using an automated telephone assessment system administered by Northern Illinois University. All responses involved pushing "1" for "yes" and "2" for "no" and were anonymous. The automated assessment included eight items measuring partner abuse (emotional, physical, or sexual) based on the definitions of partner abuse in FAIR. Participants were asked to call weekly for 4 weeks; then monthly for 6 months. They were paid $10 to hear about the study and an additional $10 was uploaded to a prepaid debit card each time they made a call to the assessment system (up to $110). Partner abuse, of any type, was scored as present (1) or absent (0) across all waves of assessment.
To protect anonymity, no additional data (e.g., demographics) were provided to the research team.

Analytic Method
Survival and logistic regression analyses were conducted with Mplus version 8, using the robust pseudo maximum likelihood estimation method described for Study 1. To test the effect of phase on time to recidivism, a Cox proportional hazards model (Singer & Willett, 2003) was estimated in which time to recidivism was regressed on phase. To test the effect of phase on the presence versus absence of recidivism, a logistic regression model was estimated in which abuse was regressed on phase (CRC = 1; FAIR = 2).
Note. a Installation 10 is the reference installation; b neglect is the reference maltreatment type; c phase is coded CRC (−1) and FAIR (1); d victim age category is coded as child (−1) and adult partner (1); e allegation substantiation is coded as unsubstantiated (−1) and substantiated (1).

Figure 1.
Phase × Victim Age Category × Allegation Substantiation interaction (Study 1). SS for the effect of phase (CRC vs. FAIR) are statistically significant for both child (p = .001) and adult (p = .034) victims for incidents with an unsubstantiated baseline allegation. Neither SS is significant for incidents with substantiated baseline allegations.
Note. SS = simple slopes; CRC = Case Review Committee; FAIR = Field-tested Assessment, Intervention-planning, and Response.

Results
Time to recidivism was significantly shorter for cases determined via CRC versus FAIR, B = −0.337, SE = 0.064, p < .001, 95% CI: −0.462 to −0.212 (Figure 2). Median time to recidivism was 33.48 (CRC) versus 90.10 (FAIR) days. The likelihood of recidivism at any time was also significantly lower in the FAIR than in the CRC phase, OR = 0.361 (SE = 0.033, p < .001, 95% CI: 0.311-0.432). Based on predicted probabilities, the recidivism rate was 69.55% in the CRC phase and 45.21% in the FAIR phase; RRR = 0.35.
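The summary quantities above follow from simple arithmetic on the reported values. The Python sketch below recomputes the relative risk reduction and odds ratio from the two predicted probabilities, and the percentage increase in median time to recidivism from the two medians. The function names are illustrative, and the calculation uses only the rounded figures reported in the text, so it is a consistency check and not a reproduction of the fitted models.

```python
def relative_risk_reduction(p_comparison, p_intervention):
    """Proportional drop in risk relative to the comparison condition."""
    return (p_comparison - p_intervention) / p_comparison


def odds_ratio(p_comparison, p_intervention):
    """Odds of the outcome under intervention divided by odds under comparison."""
    def odds(p):
        return p / (1 - p)
    return odds(p_intervention) / odds(p_comparison)


def percent_increase(baseline, new):
    """Percentage change from baseline to new."""
    return 100 * (new - baseline) / baseline


# Values as reported in the Results (CRC = comparison, FAIR = intervention).
p_crc, p_fair = 0.6955, 0.4521
median_crc, median_fair = 33.48, 90.10

print(round(relative_risk_reduction(p_crc, p_fair), 2))   # 0.35
print(round(odds_ratio(p_crc, p_fair), 3))                # 0.361
print(round(percent_increase(median_crc, median_fair)))   # 169
```

The recomputed values match the reported RRR of 0.35 and OR of 0.361, and the roughly 169% increase in median time corresponds to the "approximately 170%" stated in the abstract.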

Discussion
Results replicate and extend prior findings (Snarr et al., 2011) that the FAIR system for making maltreatment determinations results in notable tertiary preventative effects (i.e., 65% reduction in recidivism in child maltreatment and 30% to 35% reduction in adult maltreatment among unsubstantiated incidents). Snarr et al. (2011), using base-level U.S. Air Force FAP partner and child maltreatment allegation data, could not isolate whether the decrease in reporting under the FAIR system was due to (a) reduced subsequent maltreatment or (b) reduced reporting of maltreatment. The current study included anonymous victim reporting; findings suggest that the FAIR system does, indeed, reduce risk of subsequent maltreatment incidents.
Whereas the Snarr et al. (2011) study found that the reduction in recidivism was driven by the families with previously substantiated cases, the current study finds this reduction in families with previously unsubstantiated cases. It is important to note that the original substantiation rates in the Snarr et al. (2011) sample (56%, which moved to 47% after the shift to FAIR) differed from those in this sample (47%, which moved to 48.5%). In addition, in this study, the same definitions of maltreatment were used in both conditions, which was not the case in Snarr et al. (2011). Therefore, the patterns of substantiation changes and the slightly different interventions may explain the different interactions found in each study, although the inconsistency does make the interaction less interpretable.
Military communities' responses to maltreatment use a coordinated community response approach, which translates ecological theories (Bronfenbrenner, 1979) into action. Such theories suggest individuals' behaviors are impacted by nested social systems (e.g., couple/family, workplace, installation, larger civilian community, subculture, culture) at multiple levels. Prior studies have found that both Air Force (Heyman et al., 2010) and Army unit commanders and first sergeants are more likely to attend FAIR-system meetings and perceive the system as fair. Thus, it is plausible that the expected positive impact on the coordinated community response elicited by including the chain of command directly in the maltreatment decision-making process resulted in lower recidivism.
No other field-tested maltreatment definitional criteria are in the literature, and other systems for maltreatment determinations have notable shortcomings. The FAIR system appears to improve upon standard practice based on prior research (Snarr et al., 2011) and the findings in this report. Although designed for use in the U.S. military, the FAIR system has been ported over to one state child welfare system with some evidence of success. Translation to very large maltreatment systems might require tailoring of the FAIR system. However, this study took place in the 10 largest FAPs at U.S. Army installations, which care for nearly 50,000 Soldiers and their families. These settings are likely as demanding as many local civilian child welfare settings. Furthermore, the structure of FAIR includes training in assessment and the structured system, criteria, and computerized decision-making tools, which could reasonably be expected to improve social work practice in non-military settings.
As with all studies, this study has several limitations. First, this study, as with the prior evaluation, used a quasi-experimental design. Participants did not choose, or have awareness of, the condition to which they belonged, thus avoiding a major threat to the validity of causal interpretations in quasi-experimental designs: selection effects (Cook et al., 2002). Furthermore, we controlled for potential confounds that were available in the dataset. However, at all sites, the CRC condition preceded the FAIR condition; thus, other threats cannot be decisively ruled out (e.g., history). Yet, in the absence of a specific identifiable threat to the causal validity of our findings, we believe the differences in recidivism were most plausibly caused by the change from the CRC to the FAIR system. Second, the sample size in Study 2 was relatively small, likely due to participation demands, and it may not be representative of all victims. Third, we did not have access to information about participants' racial and ethnic identity or many other demographic factors, so we cannot speak to the generalizability of our findings across those subgroups nor rule out the possibility that participants in the two groups had demographic differences that might confound group differences in recidivism, although this seems unlikely. Fourth, because these data are anonymous, we cannot evaluate whether Study 2 participants' initial maltreatment cases differed from those not invited to participate or those who chose not to participate. However, the consistency of the findings with those of Study 1, despite the small sample size, makes the generalizability of its findings likely. Finally, these results cannot be considered indicative of the potential prevention impacts of a FAIR implementation in civilian contexts without the necessary evaluation.
The military's coordinated community response includes the workplace as an important element in fostering change. Because civilian systems do not have this subsystem of the family's ecology, implementing FAIR there involves less leverage. Alternatively, if the active ingredient in FAIR is the family's understanding of the criteria's bright line indicating maltreatment, it may be that workplace representatives merely emphasized this line, something that could be undertaken by those in the civilian system with whom families interact. Further research is necessary to understand these pathways.
To date, the FAIR system has been implemented across the U.S. military and Alaska (Mitnick et al., 2021). Given the current studies' replication that the FAIR system reduces subsequent incidents of maltreatment, wider dissemination of the FAIR system to make substantiation determinations in U.S. and international jurisdictions should be considered.

Conclusion
Clear and consistent decisions about whether alleged incidents of child maltreatment or intimate partner abuse are founded contribute to the perceived fairness of the process and support bright-line distinctions about what constitutes maltreatment, which, in turn, support healthy community norms around parenting and intimate relationships. Although not all families who encounter formal systems are able to learn and apply the distinction between maltreating and non-maltreating behaviors, some can. Once implemented, FAIR reduces re-offense without added time or expense to the system (see Heyman et al., 2022) or the families. Thus, the cost effectiveness for reduction in injury, mental health and developmental consequences, and mortality is potentially substantial.
The FAIR system is implemented via policy throughout the U.S. Department of Defense and the state of Alaska. Standardized trainings, automation systems, and quality assurance systems have been developed and are being routinely used in most settings, with no support from the system developers. This suggests the potential for sustainability is high. Research to date, including that presented here, suggests the FAIR system would represent a notable improvement over current practice for making substantiation decisions.

Disclaimer
The views expressed are solely those of the authors and do not reflect the official policy or position of the U.S. Army.

Funding
The author(s) disclosed receipt of the following financial support for the research and/or authorship of this article: US Department of Agriculture 2015-48783-24394.

Supplemental Material
Supplemental material for this article is available online.
anger, conflict, and violence in families and its prevention. She is interested in leveraging implementation science to further system improvements to prevent violence.

Richard E. Heyman, PhD, is professor in the Department of Cariology and Comprehensive Care at New York University. He earned a BS from Duke University and a PhD in clinical psychology from the University of Oregon. He's a licensed psychologist. Dr. Heyman has received over 60 grants/contracts from major U.S. funding agencies on a variety of topics, from dental fear to social determinants of health to couples communication to community-level prevention of family maltreatment, substance problems, and suicidality. Dr. Heyman has published over 200 scientific articles/chapters on these topics. At the core of Dr. Heyman's research is translating basic knowledge into prevention and treatment and improving adoption of evidence-based practices.
Michael F. Lorber, PhD, is a senior research scientist with the Family Translational Research Group in the College of Dentistry at New York University. His primary research interests are centered on externalizing behaviors-their form, development, etiology, consequences, and prevention-from infancy through adulthood, primarily in relational (e.g., parent-child; couple) contexts.
Sara R. Nichols, PhD, is a developmental and clinical psychologist and an adjunct assistant professor at New York University College of Dentistry. Her research interests include early social development, developmental psychopathology, and implementation of evidence-based interventions.
Daniel F. Perkins, PhD, is a professor of Family and Youth Resiliency and Policy at the Pennsylvania State University. As Principal Scientist of an applied research center, the Clearinghouse for Military Family Readiness, he translates science into action through science-based programs and technical assistance to professionals serving military/veterans families. His work involves implementation, hybrid evaluations of interventions, and implementation science.