Our study develops and applies a set of complementary methods for predicting outcome variability in aneuploidy and CNV carriers from family data. By using these methods to model data from families with male probands carrying an extra Y chromosome, we (i) refine insights into the impact of XYY on several aspects of early cognitive and behavioral development, and (ii) show the general utility of family-based analyses for resolving interdependencies between neurobehavioral outcomes in aneuploidy or CNV carriers and neurobehavioral profiles in their first-degree, non-carrier relatives. We address the implications of our results below and consider important steps for future work.
The correlation analyses in this study indicate that the degree of coherence between trait variability in aneuploidy-carrying probands and their unaffected family members can vary greatly across traits. The strongest positive proband-family trait correlation was seen for FSIQ (r = 0.63). Similar magnitudes of proband-family FSIQ correlations have also been reported in several other aneuploidies and CNVs (22q11.2 deletion syndrome, 16p11.2 deletion syndrome, Down syndrome, and Klinefelter syndrome), as well as in the general population [10,37]. This suggests that although most neuropsychiatric aneuploidies and CNVs significantly impact FSIQ, many do so without disrupting the other sources of variance that underpin the well-documented familiality of this trait in the general population. In other words, the causal pathways that mediate the negative impact of many aneuploidies and CNVs on FSIQ may be partly distinct from those that explain intrafamilial FSIQ correlations. This principle may not hold equally for vocabulary and matrix reasoning, however, although formal tests of this notion must await direct comparisons, in larger samples than ours, of family-proband correlations for different traits. In our present study of XYY syndrome, and in past studies of XXY syndrome and 16p11.2 deletion disorder, the correlation between probands and first-degree relatives was stronger for verbal than for non-verbal subcomponents of general cognitive ability [4,36].
Further, in contrast to the robustly positive proband-family correlations for FSIQ, we observe near-zero or negative proband-family correlations for several SRS-2 traits in XYY syndrome (-0.26 < r < 0.088). This finding is surprising given that previous reports have found strong and positive proband-family correlations for social impairment in 16p11.2 deletion syndrome, and in the general population (e.g., parent-child r ~ 0.35). One potential explanation for these findings is a differential rater effect across studies or syndromes. For example, variation in parental social functioning may influence how parents rate the social behavior of affected offspring (e.g., parents with greater social awareness may be more sensitive to social impairments), and these sorts of effects may vary with the nature of social dysfunction seen in different neurogenetic disorders. Another potential explanation is that the uncoupling of child from parent SRS-2 scores is driven in part by aspects of the underlying biology of XYY syndrome, such that carriage of an extra Y chromosome introduces independent sources of variance in social functioning that disrupt or overwhelm the influence of familial factors. For example, given that psychopathology in XYY syndrome has been shown to affect the specificity of the SRS-2 screening form, the specificity of SRS-2 ratings may differ between genetic disorders with dissimilar profiles of psychopathology, thereby leading to differing proband-parent correlations in SRS-2 scores. The variability of the relationship between family and XYY probands across different traits, and the differences between findings in XYY and gene dosage disorders like 16p11.2 deletion syndrome, suggest that the accuracy of family data in predicting proband outcomes is likely to be both trait- and disorder-specific.
Taken together, our findings suggest that knowing family scores for certain traits, such as FSIQ and vocabulary scores, is useful for predicting proband trait variation across aneuploidy and CNV disorders, but that other traits, such as social impairment, can show highly variable proband-family correlations in different aneuploidy and CNV disorders.
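The trait-by-trait comparison described above can be illustrated with a minimal sketch that computes a Pearson correlation between proband scores and family scores for each trait. The data layout, the trait names, and the toy scores are hypothetical and are not values from our cohort; the sketch also assumes that each family has already been summarized by one score per trait.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


def proband_family_correlations(scores):
    """Map each trait to its proband-family correlation.

    `scores` (hypothetical layout): trait -> {"proband": [...], "family": [...]},
    with one entry per family, in the same family order in both lists.
    """
    return {
        trait: pearson_r(values["family"], values["proband"])
        for trait, values in scores.items()
    }


# Toy, made-up scores for five families (illustration only).
toy_scores = {
    "FSIQ": {"family": [95, 105, 110, 100, 120], "proband": [80, 88, 97, 84, 101]},
    "SRS-2 total": {"family": [45, 52, 48, 60, 50], "proband": [72, 66, 80, 69, 75]},
}

for trait, r in proband_family_correlations(toy_scores).items():
    print(f"{trait}: r = {r:.2f}")
```

Keeping the correlation per trait, rather than pooling traits, mirrors the point that coupling between probands and relatives appears to be trait-specific.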
Our findings also emphasize the added value of modeling proband-family trait interrelationships within a regression framework [4,10,36]. Specifically, by using regression to estimate proband offsets relative to family members rather than the general population, we derived more refined estimates of the penetrance of XYY syndrome. This refinement is achieved because (i) recruitment biases may enrich probands for particular background genetic and environmental factors that influence traits that are also impacted by XYY syndrome, and (ii) family-based offsets estimate the penetrance of XYY syndrome while controlling for the background genetic and environmental factors that probands share with their family members. For some traits, such as IQ and ADHD, our family-based models estimated offsets that differ from those of prior studies in which trait scores in XYY syndrome were compared to those from recruited controls or standardized instrument distributions [33,39]. We observed the largest offsets for ADHD-related traits, followed by ASD-related traits and FSIQ/vocabulary scores.
In addition to allowing offset estimation, regression approaches provide a quantitative framework for estimating proband scores from known family scores. However, the effect sizes for the regression slopes were considerably smaller than those for proband offsets, and given the small size of our cohort, results from these slope analyses must be considered provisional. With that caveat, the slope analyses suggest that the average magnitude of FSIQ reduction in XYY probands relative to their first-degree relatives is expected to be the same (Cohen’s d ≈ 1.3) across different levels of proband and family IQ. This finding indicates that any ascertainment biases that enrich for recruitment of families with unusually high or low FSIQs should not bias estimation of the penetrance of XYY for FSIQ reduction (although they may still bias estimation for other neurobehavioral measures).
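The distinction between an offset and a slope can be made concrete with a minimal sketch, assuming a simple linear model of the form proband = intercept + slope × family. The function names, the choice to scale the offset by the standard deviation of the family scores, and the toy scores are all illustrative assumptions, not the exact specification used in our analyses.

```python
from statistics import mean, stdev


def fit_line(family, proband):
    """Ordinary least squares fit of proband = intercept + slope * family."""
    mean_f, mean_p = mean(family), mean(proband)
    sxx = sum((f - mean_f) ** 2 for f in family)
    if sxx == 0:
        raise ValueError("family scores must vary to estimate a slope")
    sxy = sum((f - mean_f) * (p - mean_p) for f, p in zip(family, proband))
    slope = sxy / sxx
    intercept = mean_p - slope * mean_f
    return intercept, slope


def proband_offset(family, proband):
    """Mean paired proband-minus-family difference, raw and standardized.

    Assumption: the standardized offset divides by the SD of family scores,
    which is one of several reasonable ways to express a d-like effect size.
    """
    raw = mean(p - f for f, p in zip(family, proband))
    return raw, raw / stdev(family)


# Toy, made-up FSIQ scores for three families (illustration only).
family_fsiq = [90, 100, 110]
proband_fsiq = [70, 80, 90]

intercept, slope = fit_line(family_fsiq, proband_fsiq)
raw, standardized = proband_offset(family_fsiq, proband_fsiq)
print(f"slope = {slope:.2f}, intercept = {intercept:.1f}")
print(f"offset = {raw:.1f} points ({standardized:.2f} family SDs)")
```

Under this model the expected proband-minus-family difference is intercept + (slope − 1) × family, so a slope near 1 corresponds to an offset that stays constant across family score levels, whereas a slope that departs from 1 implies an offset that changes with family score.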
Regression analysis also provides a framework for multivariate modeling of proband outcomes from family data, which we harnessed to test for potential moderating effects of family FSIQ and perinatal variables on other proband-family trait interrelationships. These analyses revealed a negative association between family IQ and proband SRS-2 scores, suggesting that higher family IQ may be protective for some aspects of social responsiveness. This finding demonstrates the need to examine cross-measure, family-to-proband relationships to better predict proband outcomes in the future. Multiple linear regression models also suggested that variability in XYY proband outcomes is not significantly related to variability in familial SES and perinatal variables. Further expansion of such multiple linear regression methods would help to determine more systematically the limits of our capacity to predict variation in proband outcomes from family phenotypic data.
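A cross-measure model of this kind can be sketched as an ordinary least squares fit of a proband outcome on more than one family predictor. The sketch below is a hedged illustration only: the predictor set (family SRS-2 and family FSIQ), the function names, and the toy scores are assumptions, and the toy outcome is constructed to follow an exact linear rule so that the recovered coefficients are easy to check by eye.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_multiple_regression(predictors, outcome):
    """OLS via the normal equations.

    `predictors` is a list of columns (one list per predictor).
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    columns = [[1.0] * len(outcome)] + [[float(v) for v in col] for col in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, outcome)) for ci in columns]
    return solve_linear(xtx, xty)


# Toy, made-up scores for five families (illustration only).
family_srs = [50, 60, 55, 65, 70]
family_fsiq = [90, 100, 110, 95, 105]
# Constructed as 100 + 0.2 * family_srs - 0.5 * family_fsiq.
proband_srs = [65.0, 62.0, 56.0, 65.5, 61.5]

intercept, b_srs, b_fsiq = fit_multiple_regression([family_srs, family_fsiq], proband_srs)
print(f"intercept = {intercept:.2f}, family SRS-2 = {b_srs:.2f}, family FSIQ = {b_fsiq:.2f}")
```

In this toy example the negative coefficient on family FSIQ plays the role of the cross-measure association discussed above. A real analysis would also need standard errors and significance tests, which this sketch omits.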
Finally, our framework applies clustering analysis to resolve the effect of an additional Y chromosome on coherence within and between family members across several cognitive and behavioral measures. In this context, clustering provides an analytically efficient means of describing the architecture of trait variation within families. Cluster composition can indicate the extent to which the clustering of traits within a family is governed by carrier status (e.g., clusters separate carriers from others regardless of trait) as opposed to the phenotype being considered (e.g., clusters separate IQ from other traits regardless of the family member being measured). In application to XYY probands and their family members, we observe co-clustering of traits into four broad groups: (i) Family IQ and Proband ADHD-traits, (ii) Parent Psychopathology and Sibling ADHD-traits, (iii) Sibling Psychopathology, and (iv) Proband Psychopathology. All of the SRS-2 measures for a given family member cluster together, which is expected given the high internal consistency of the SRS-2. The clustering of ADHD-traits in parents and unaffected siblings was also found in previous reports of children with ADHD. Notably, proband ADHD-traits clustered with family IQ measures, suggesting that the sources of ADHD-trait variation may differ between unaffected siblings and XYY probands. While overall clustering patterns suggest greater coherence between parent and sibling psychopathology than between parent and proband psychopathology, all of the family IQ measures cluster together, suggesting that carriage of an extra Y chromosome may disrupt coherence between probands and family members for psychopathology, but not for IQ measures.
We anticipate that similar applications of clustering to family-based phenotypic data may help to reveal traits in relatives that are most closely coupled to trait variation in carriers, and to specify the effects of applying different dimension reduction techniques to familial phenotypic data.
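As an illustration of how such clustering might be applied to family-based phenotypic data, the sketch below groups measures (each a family member and trait pair) by how strongly they co-vary across families. It assumes average-linkage agglomerative clustering on a distance of 1 − Pearson r, which is one common choice and is not necessarily the procedure used in our analyses; the measure names and toy scores are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cluster_measures(measures, k):
    """Average-linkage agglomerative clustering of measures into k groups.

    `measures` maps a measure name to its scores, one per family, in a
    shared family order. Distance between two measures is 1 - Pearson r.
    """
    names = list(measures)
    if not 1 <= k <= len(names):
        raise ValueError("k must be between 1 and the number of measures")
    dist = {
        frozenset((a, b)): 1.0 - pearson_r(measures[a], measures[b])
        for a, b in combinations(names, 2)
    }
    clusters = [{name} for name in names]
    while len(clusters) > k:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            pairs = [dist[frozenset((a, b))] for a in clusters[i] for b in clusters[j]]
            avg = sum(pairs) / len(pairs)
            if best is None or avg < best[0]:
                best = (avg, i, j)
        _, i, j = best
        clusters[i] = clusters[i] | clusters[j]  # merge the closest pair
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Toy, made-up scores for five families (illustration only).
example_measures = {
    "father_FSIQ": [95, 105, 110, 100, 120],
    "mother_FSIQ": [97, 103, 112, 101, 118],
    "proband_SRS": [80, 62, 75, 90, 66],
    "proband_ADHD": [78, 65, 72, 88, 70],
}

for group in cluster_measures(example_measures, k=2):
    print(sorted(group))
```

Inspecting which measures end up in the same group then indicates whether grouping follows the family member (carrier versus non-carrier) or the trait being measured.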
Our findings should be considered in light of several limitations and caveats. First, because the families in our cohort were not identified through a population-based sampling frame, they may not be representative of the full range of outcomes and background factors seen in XYY probands and their first-degree relatives. Second, our study is cross-sectional in design and therefore cannot resolve potentially age-varying proband-family interrelationships. Third, observed correlations between rating scale scores can be influenced by methodological aspects that our study design does not directly model, such as parent versus child, or mother versus father, rater effects. Relatedly, our study design cannot disambiguate the many potential sources of observed correlations between parent and child traits, which could include highly contrasting mechanisms, e.g., shared genetic determinants of IQ variation between probands and parents versus high parental psychopathology being driven by high caregiver strain, which in turn relates to proband psychopathology. In a similar vein, the variable contribution of unaffected relatives to estimated trait scores across different families means that the observed trait correlations between probands and families reflect a composite of different degrees of genetic relatedness. However, our findings from naturalistic estimation of trait correspondence between probands and their family members provide a valuable reference point when contemplating how such approaches might be used in practice to improve prediction of proband outcomes from family data. Finally, our modest sample size necessarily limits our power to confidently detect small effect size phenomena, such as those modeled by the slope terms in family-proband regression analyses. Future analyses in larger cohorts will help to address this limitation and also to directly test the reproducibility of our findings.