This scoping review contributes uniquely to research on peripheral and neural correlates of self-harm by summarizing data on children and adolescents ages 3 to 19 years, a demographic with social, developmental, and psychological characteristics of self-harm that can differ from those found in young adults [17, 28, 30, 31, 34–37]. This contrasts with previous reviews, which combined data from this age group with adult data [26–33]. Our work also advances knowledge on this topic by reviewing 79 studies in 76 publications, substantially more than in earlier reviews, and by covering 35 years, from 1985 to 2020.
Publication timelines differed by the type of self-harm studied. Studies on suicidality in this age group have been published consistently since 1985, but those on any self-harm did not appear until 2005, and the first paper on correlates of NSSI in children and adolescents was not published until 2012. Publication of studies on correlates of all three sub-types of self-harm has been increasing since 2009. Our finding is similar to that reported in a review of MRI studies in adolescents and young adults, although publications in that dataset did not start until 2008 and the review did not separately report findings in adolescents.
The research was conducted in nine countries, with most of the studies done in the US. Two-thirds of the studies investigated suicidality, with the remaining third evenly divided between studies of subjects with NSSI or any type of self-harm. Of the 79 studies, 48 focused on peripheral and 31 on neural correlates. Studies were organized into seven correlate sub-categories under peripheral correlates (Stress Response System; Serotonin System; Sleep; Immune System; Lipid Metabolism; Neuromodulators; and Pituitary Hormones) and three within the neural correlates (Brain Function, Imaging; Brain Function, Non-Imaging; and Brain Structure).
A total of 28 specific correlates were investigated. All studies were observational in design and mainly used convenience samples of adolescent patients, 12-19 years of age, who were disproportionately female and about half of whom were recruited based on specific disorders (primarily MDD). From the limited information available, the samples may have consisted predominantly of white participants.
Over a quarter of the specific correlates (29%, 8/28) were investigated with only one study: circadian rhythm of growth hormone, reactivity of growth hormone, functional imaging and response inhibition, functional imaging and decision-making, ERPs and attention capture, brain wave asymmetry, and intracortical inhibition. For the remaining 21 specific correlates, inter-study agreement on findings was often low, even within one type of self-harm. We found no replication studies.
Resolution of inter-study divergence in findings is challenging because of heterogeneity in studies on multiple levels. For example, we found considerable variability in findings from HPA axis reactivity studies. But these studies defined self-harm in myriad ways, using a diagnostic instrument, one of two different clinician-rated instruments, or one of two different self-reports. Moreover, these studies variously used subjects who were patients with MDD or patients not recruited on the basis of a diagnosis, and compared them to an assortment of healthy, psychiatric, or sibling controls. Finally, four different methods were used to activate the HPA axis, only one of which has been demonstrated to be a reliable and valid paradigm for measuring HPA axis reactivity in children and adolescents. Similar study heterogeneity characterizes the functional imaging studies of social interaction, which we also described in the results section as another example of low inter-study agreement in findings.
Most of the studies reviewed used case-control designs, which are vulnerable to selection and information biases, as well as confounding [126–128]. These problems all need to be mitigated by adequate sample size, minimization of classification error (including self-harm classification), and attention to the representativeness of participants. As can be seen in the examples described in the ‘Summary of study findings by type of self-harm’ sub-section of the Results section and in the previous paragraph, many of the studies fall short on one or more of these features. Consequently, only 37% of our studies were deemed at relatively low risk of bias.
Studies that had similar methods and were rated as Good did report similar findings, whether positive associations of the specific correlate with self-harm [86, 87] or no association with self-harm [81, 83]. It is also important to note that these were four independent samples from four different research groups.
Although the patient samples from earlier studies may have been representative of self-harming children and adolescents in the 1980s and early 1990s, current information suggests that is no longer the case. Up to 60% of adolescents with NSSI in the general population do not seek care, and half of the adolescents with suicidality or NSSI in a population study do not present for help. Moreover, the ability to access care can be compromised by low socioeconomic status, rural geographic location, or minority race/ethnicity, thus reducing the generalizability of some of the findings for those who experience healthcare disparities. Similarly, recruiting participants based on a psychiatric disorder limits the applicability of results to the sub-population of self-harming children or adolescents with that disorder, despite evidence that self-harm can be transdiagnostic or exist independent of psychiatric disorders [134–136].
Numerous diagnostic interviews, clinician-rated instruments, or self-report instruments were used to identify self-harm, which may have produced information bias. The use of multiple instruments is also troubling because recent reviews of child and adolescent self-harm instruments have questioned their psychometric properties [137–139]. The reviews also point out possible threats to validity if an instrument is used in samples different from those used in its development or for purposes other than those for which it was initially designed. Unfortunately, these were common practices in the research we reviewed. Moreover, data from adults reporting self-harm show that 40% of those responding affirmatively to a question about attempting suicide denied intention to die when asked a follow-up question, suggesting that single questions may be misleading and that clarifying behavior definitions for respondents may be critical. Many of our studies classified self-harm based on one question taken from an instrument not even designed to measure self-harm.
Similar issues arose in the measurement of correlates or outcomes. Confounding was rarely handled by standard methods such as stratification or propensity scores, and researchers sometimes measured unique outcomes of specific correlates, making inter-study comparisons or interpretation of different findings challenging.
We identified four research gaps: 1) the absence of replication studies; 2) a paucity of studies on children younger than 11 years of age; 3) relatively few studies on non-patient children or adolescents; and 4) disproportionate representation of girls. A possible gap is the lack of data on non-white children and adolescents, but we could not confirm this.
If left unfilled, these gaps will significantly impede progress in this field. Replication studies can help verify that an association between self-harm and a specific correlate is not a spurious finding, and they are a critical step in the development of all types of biomarkers. Replication studies are not always easy or fruitful, but several strategies have recently been proposed to determine which studies in a body of research should undergo replication [143–145]. Future research on correlates in child and adolescent self-harm should attempt to replicate some of the findings described in this review, using these new strategies to select which ones.
More studies on correlates of self-harm in younger children are needed, as self-harm is increasing in younger age groups [146, 147]. For example, presentations to US emergency departments for suicidality increased substantially from 2007 to 2015 (the most recent data available), and 43% of those visits were from children 5-10 years of age. Moreover, suicide was the third leading cause of death in the US for younger children (10-12 years of age) (https://webappa.cdc.gov/sasweb/ncipc/leadcause.html) and the characteristics of younger children with self-harm are different from those of adolescents.
Balanced gender representation is essential in this body of research. Girls are more likely to engage in suicidality and NSSI, so samples composed mostly or entirely of females can be appropriate, depending on the research goals. But results from such studies cannot be generalized to boys. Furthermore, given gender differences in help-seeking, it is unlikely that many boys with self-harm will be found in clinical settings.
New research studies must increase the number of non-patient children and adolescents under investigation. It is also essential that samples are more diverse not only with regard to gender, but also with regard to race and ethnicity, as recent data show that from 1991 to 2017 suicide attempts among black adolescents in the US rose 73%, compared to a decrease of 7.5% in white adolescents.
There were several strengths in this body of research, including a larger number of studies and more specific correlates under investigation than we expected based on previous reviews. In addition, cohort studies using peripheral or neural correlates to predict changes in or new development of self-harm, and studies using correlates to better understand responses to treatments both lay the groundwork for prognostic and treatment biomarkers.
The original goal for our scoping review was to prepare for a systematic review and meta-analysis in the service of identifying correlates with potential to move to the next stage of biomarker development. Clinical biomarker development requires accurate selection of the clinical population, a feasible and standardized process for biomarker data collection and processing, and replicability of results in the appropriate sub-populations [149, 150]. After these criteria have been satisfied, characteristics of the marker such as sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) must be established.
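As an illustration of the four marker characteristics named above, the following minimal sketch computes them from a 2x2 table of marker result against self-harm status. The counts, the class name `ConfusionCounts`, and the function name `marker_metrics` are hypothetical and are not drawn from any study in this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a 2x2 table of marker result versus self-harm status."""
    true_pos: int   # marker positive, self-harm present
    false_pos: int  # marker positive, self-harm absent
    false_neg: int  # marker negative, self-harm present
    true_neg: int   # marker negative, self-harm absent


def marker_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV, and NPV for one 2x2 table."""
    return {
        # Proportion of self-harm cases that the marker flags
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        # Proportion of non-cases that the marker correctly clears
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
        # Proportion of marker-positive subjects who are true cases
        "ppv": c.true_pos / (c.true_pos + c.false_pos),
        # Proportion of marker-negative subjects who are true non-cases
        "npv": c.true_neg / (c.true_neg + c.false_neg),
    }


# Hypothetical counts for illustration only.
example = ConfusionCounts(true_pos=40, false_pos=30, false_neg=10, true_neg=120)
metrics = marker_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on how common self-harm is in the sample, which is one reason the representativeness of study samples matters for biomarker development.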
Progression to the next stage of biomarker development is not possible for any of the peripheral or neural correlates identified in our review, due to the small numbers of studies, concerns about self-harm classification, variability of findings, and methodologic weaknesses for each specific correlate. These features also mean that it is unwise to use the current research to generate pathoetiologic theories or develop new treatments.
But this body of work could serve as an excellent platform for biomarker discovery if four improvements are made in future research. The first and most important improvement pertains to the classification of self-harm. In the early to mid-2000s, there was widespread discussion of whether suicidality and NSSI lay on a continuum, i.e., with a predictable pattern of progression from NSSI to suicidal behavior, or on a spectrum, i.e., co-occurring disorders that partially overlapped in characteristics and etiology but were distinct clinical syndromes. The concept of a spectrum gained momentum, culminating in the US with the designation of NSSI and suicidal behavior disorder as separate disorders in need of further study in psychiatry’s Diagnostic and Statistical Manual (DSM)-5.
However, concerns remain that this approach may be difficult to use, based as it is on self-report about intention to die when engaging in self-harm. Some researchers assert that this two-category conceptualization of self-harm has been inadequately validated, with concerns that investigations based on this schema will lead to invalid phenotyping [154–158].
To continue to acquire knowledge about correlates of self-harm in children and adolescents, despite disagreements about the phenomenology, we suggest several changes in methods used to classify child and adolescent subjects with self-harm. Studies should collect comprehensive information about all types of self-harm, even if the study aim is to focus on one type. This will optimize the chance that homogeneous samples will be created. To increase the validity of classification, instruments with good psychometric properties in children and adolescents should be used. Approaches using one or two questions from instruments measuring other constructs such as MDD or BPD are ill-advised. Furthermore, as the type of instrument, e.g., self-report checklist vs. clinician-rated instruments, can produce different prevalence rates of self-harm, we recommend classifying self-harm in subjects based on a transparent integration of data from several types of instruments. We also recommend more research on the use of cognitive tasks (instead of, or in addition to, self-report) to classify self-harm in children and adolescents, especially in younger children.
The second set of improvements involves minimizing bias in future correlate studies. All the methodologic issues in design, sample construction, and correlate and outcome measurement discussed in this review are well-described in the epidemiologic literature. However, to ensure that bias and measurement errors are maximally mitigated, we suggest that researchers use one of the risk of bias instruments in the study development phase as a planning guide.
Advancement in this field will be stalled unless measurement of specific correlates and outcomes is standardized across studies using validated and reliable methods. The third improvement needed for future research is that researchers working in each specific correlate area should agree on how to best measure those correlates and associated outcomes to minimize chances that low inter-study agreement is due to information bias or measurement error.
Most of the 28 specific correlates investigated in our dataset were derived from research on adults. Our fourth recommendation is to encourage future researchers to search for new potential correlates specifically for child and adolescent self-harm. One possible source of new correlates is post-mortem studies of completed suicide in the pediatric age range. No post-mortem studies met our inclusion criteria for this review. But studies modeled after the pioneering work of Pandey and colleagues on  need to be conducted in 3- to 19-year-olds who have completed suicide. Another source of possible correlates is genome-wide association studies, not just of persons who have completed suicide, but of those with other features of suicidality or NSSI. Most of this work to date has been in adults, but studies specifically examining children and adolescents in large datasets could be very useful.
A final approach to identifying new potential correlates or biomarkers is the use of machine learning, either with electronic health record (EHR) data or in analyzing neural signatures in response to cognitive tasks. There are many reservations about the use of these new approaches [169, 170], but as the machine learning field matures, strategies such as these may provide promising leads.
While having several strengths, our review also has limitations. First, we only obtained papers written in English, so we may have missed important studies on the topic written in other languages. Second, our search used only two databases, PubMed and Embase. However, these two cover medical and biomedical research from 1947 to the present, including Medline, conference abstracts, ebooks, and citations in non-medical journals. PubMed has 25 million records, while Embase has 29 million. Therefore, we do not think this search strategy missed studies, but it is possible. Third, we did not search the grey literature, nor did we write to prominent authors looking for unpublished studies, especially those with negative findings. Publication bias thus might explain why nearly every study in our dataset reported some association between self-harm and the specific correlate under investigation. Fourth, our categorization of self-harm studies was based on how investigators described their populations of interest or samples. Our classification system was too high-level for us to report on the more nuanced features of suicidality, e.g., suicidal plans, ideation, or attempts, or on specific NSSI behaviors, e.g., cutting or burning. Future researchers will likely want more detail on specific behavioral manifestations; details on the individual studies are supplied in Supplements 3 and 4.