We leveraged a MRI dataset involving participants recruited through the UC Davis MIND Institute between 2009 and 2011. The recruitment process as well as clinical and imaging procedures have been described in detail previously (16).
Participants
Between 2009 and 2011, participants from a clinical longitudinal cohort (29) were asked to also take part in an MRI acquisition protocol. Invitation was made through phone screening and led to the recruitment of 64 participants. For the current longitudinal analyses, we first selected the 41 participants who were categorized as HR (13 females). In this study, HR was defined as having an older sibling with a confirmed diagnosis of ASD. Having a sample exclusively constituted of HR participants allows to study the continuum of symptom severity and the emergence of ASD amongst participants who share a similar risk to develop the disorder (10). For our analysis, we included only the HR participants who underwent 2 MRI scans (first at 12-15 months and second at 18-24 months of age). From the initial 41 HR participants, 6 were excluded (14.6% from initial sample) due to failure of one scan acquisition, 1 (2.4%) due to failure of both scan acquisitions, 4 (9.8%) for not coming to one MRI session, 7 (17.1%) due to dropping out from the study, 1 (2.4%) due to lost data and 2 (4.9%) because of poor image quality (described below). From the overall 61 attempted scans, there were 9 failures resulting in a success rate of 85.2%. This resulted in a final sample of 20 HR children (5 females).
Demographic characteristics of the sample are displayed in Table 1. We divided our sample into three groups according to their diagnostic outcome at age 3: HR with typical development (HR-TD, n=8, 3 females), HR with ASD (HR-ASD, n=9, 1 female) and HR with atypical development (HR-non-TD, n=3, 1 female). One HR-TD child did not undergo the 18-month clinical assessment but was not excluded from the study. As an additional exploratory analysis, we further separated HR-ASD participants into two subgroups according to age of first established diagnosis. HR-ASD who were diagnosed at 18 months or before were classified as early onset autism (EOA; n=4, 1 female). HR-ASD participants whose diagnosis was established later than 18 months of age were labelled as later onset autism (LOA; n=5, 0 female). EOA and LOA sample characteristics are detailed in supplementary table.
One could notice that the proportion of ASD in our HR sample (45%) is greater than the prevalence of ~20% which is reported in the literature (10). Nevertheless, one must take into account the fact that our population is constituted by a majority of male HR in which the prevalence of ASD has been reported to be around 32% (30). Another explanation could rely in the fact that parents who were more worried about their child’s development were more motivated to participate to the scan acquisition, thus leading to a recruitment bias.
It should be noted that HR-TD do not share similar developmental trajectories with low-risk children with a typical development (LR-TD) as HR-TD exhibit higher ASD traits and increased risk for other conditions such as anxiety and attention-deficit/hyperactivity disorders (31,32). As a supplementary analysis, we included the 10 LR-TD (2 females) from the cohort who completed two scans at 12-15 months and 18-24 months for comparison with HR-TD and HR-ASD. The LR-TD mean age between two scans was 17.4 ±1.9 months (15.4 - 20.1). Time interval between both scans was 7.1 ± 1.3 months (5.7 – 9.5).
Behavioral measures and outcome classification
Clinical assessments were conducted with each participant at 6, 12, 18, 24 and 36 months.
The Mullen Scales of Early Learning (MSEL) was used to assess development in cognitive (expressive and receptive language, visual reception) and motor (fine and gross) areas (33). Developmental quotient scores (DQ) were used instead of standard scores in order to limit truncation of very low performing participants (34). Individual DQs were obtained by dividing age-equivalent developmental age output from MSEL by chronological age and multiplying by 100.
ASD-related symptom severity was quantified with the Autism Diagnostic Observation Schedule (ADOS) (35). The ADOS is a semi-structured observational evaluation with cut-offs to guide diagnostic decisions, appropriate for ambulatory children of 12 months and older. Either module 1 (intended for non-verbal children or those using only isolated words) or module 2 (intended for children with phrase speech) was conducted at ages 18, 24 and 36 months. To allow comparison of ADOS total scores across ages and modules, the calibrated severity score (CSS) was used. ADOS CSS ranges from 1 to 10 (with 10 being the most severe) (36)(37).
At each visit from 18 months and later, ASD diagnosis outcome was established by a licensed clinician according to ADOS diagnosis cut-offs and DSM-IV criteria (38). Children who did not meet the criteria for a diagnosis of ASD were categorized as having typical development (TD) or non-typical development (non-TD). TD was defined as having an ADOS CSS equal to or less than 2, a total DQ of at least 85, no DQ subtest less than 80 and no more than one DQ subtest less than 85. If one or more of these criteria were not met, participants without ASD were classified as non-TD.
Image acquisitions
All children were scanned during natural sleep following previously published procedures (39), at the UC Davis Imaging Research Centre on a 3 Tesla Siemens TIM Trio MRI system with an eight-channel head coil. Structural T1-weighted 3D MP-RAGE images were acquired with 1 mm3 isometric voxels, repetition time=3200 ms, echo time=5.08 ms, field of view=176 mm and 192 sagittal slices. The success rate of these MRI acquisitions was 78%. A 3D image distortion map (Image Owl) was acquired at the end of each scan with a calibration phantom (Phantom Laboratory, Inc.). Distortion correction was carried out as described in (40).
Participants had a first MRI scan at 6-9 months of age which was not evaluated in the present analyses because of the difficulty to obtain accurate 3D white matter surface reconstructions at this age. Accordingly, we utilized the participant’s second scan, acquired between 12 and 15 months of age, and third scan, acquired between 18 and 24 months of age. This third scan was acquired an average of 1.5 ± 2.0 months after the 18-months ADOS.
Image processing and quality control
We used the automated pipeline provided by FreeSurfer v6.0 to process the T1-weighted cerebral MRIs (http://surfer.nmr.mgh.harvard.edu/). The successive steps of this automated procedure are described in detail elsewhere (41)(42)(43)(44). Briefly, non-cerebral tissues are removed, signal intensity is normalized, and the image is segmented using a connected components algorithm. Then, a single filled volume of white matter is generated for each hemisphere. For each volume of white matter, a triangular surface tessellation is created by fitting a deformable template. Through deformation of this tessellated surface, a cortical mesh is created that defines the boundary between white and cortical gray matter (called the outer white matter surface) as well as the boundary between the gray matter and the extra-axial fluid (called the pial surface). This surface deformation process is calculated through an energy minimization function that determines the sharpest shift in intensity between voxels to define the transition between tissue categories. This process is independent of absolute intensity values and can delineate boundaries at a subvoxel resolution. The described pipeline has previously been applied within toddlers (28) as well as children with ASD (25). Importantly, FreeSurfer delineation of white matter and gray matter is solely based on intensity shift and does not rely on an age specific template.
A trained operator (M.G.), blind to any clinical outcome, visually inspected images obtained with the described automated pipeline. First, they attributed a subjective score ranging from one to ten for every image denoting the level of motion artifact. A motion rating (MR) was then estimated for every participant by averaging these scores between their two scans. Second, they implemented manual corrections when required following recommended procedures described in the FreeSurfer manual (http://freesurfer.net/fswiki/). All final cortical surfaces were visually validated by a second trained independent operator (M.S.) who was also blind to all clinical outcomes.
Gray-white matter intensity contrast
We first sampled white matter intensity at each vertex v at 1mm beneath the white matter (WM) outer surface (Fig.1). A distance of 1mm was chosen to facilitate comparisons with previous literature since it is the most commonly used value in GWC studies, including all existing studies exploring GWC in ASD (25) (26). Gray matter (GM) intensity value was sampled at each vertex at a distance of 30% of cortex width (defined as the distance between outer white matter surface and pial surface) starting from the white matter outer surface. The value of 30% was set because it is the most commonly used in the GWC literature (25) and is the default value provided by FreeSurfer. In addition, a previous study of ASD found diagnostic differences to be greatest when GM intensity sampled between 30 and 40% was used to compute GWC (26). GWC at each vertex v was computed by dividing the difference between GM and WM intensities by the mean between GM and WM intensities and multiplying by 100 to get a ratio expressed in [%]. This was performed for each individual scan at time T.
GWC values were then registered on an average template provided by FreeSurfer to allow vertex-wise inter-participants comparison. During this process, GWC values were smoothed with a full-width at half-maximum (FWHM) surface-based Gaussian kernel of 10 mm.
Then, for each participant two different GWC longitudinal values were computed. Longitudinal neuroimaging designs allow many advantages over cross-sectional designs, including reduction of within-participant variability and the possibility to analyze the effect of time on the variable of interest (45). First, we estimated the individual average GWCv values between 12 and 24 months of age by computing the mean of GWCv values between the two scans.
Where GWCv1 is GWCv at 12-15 months and GWCv2 is GWCv at 18-24 month.
Second, we computed the individual GWC rate of change between two scans (ΔGWC) at each vertex. ΔGWC represents the effect of time on GWC between the age of 12 and 24 months and was computed through the symmetrical percentage of change (SPC) formula (45). SPC consists of calculating, for each participant at each vertex v, the GWC difference between two scans divided by the age difference between the two scans, giving a rate in [%/month]. This rate is divided by mean GWC at each vertex v and multiplied by 100, giving a result expressed in [%]. Using symmetrized measures of change (such as ) over absolute differences (such as that would not be scaled to the mean) is recommended in longitudinal analyses as they allow many advantages such as increased statistical robustness, higher reliability in small samples and balanced consideration of both measures and (46). SPC expresses the rate at which GWC changes in each vertex between two scans relative to mean GWC.
Where age1 is participant’s age at 12-15 months scan and age2 is their age at 18-24 months scan.
Cortical thickness
Cortical thickness (CT) alterations have been found to influence GWC values (47). CT is defined by the distance in mm between WM outer surface and pial surface and is automatically computed by the standard FreeSurfer processing pipeline (44). To control for this possible confound, we sampled CT at each vertex v. We then computed the individual average CTv across the two scans and the CTv rate of change (ΔCTv) with the same formulas we described for longitudinal GWC parameters. Spatial overlap between significant effects on GWC and CT were explored.
Motion Rating
Motion artifacts are a significant confound can influence GWC (26). While we found no difference in motion rating (MR) across main outcome groups (HR-TD, HR-ASD and HR-non-TD, we also performed vertex-wise analyses to explore potential local correlations between MR and GWC. We found one cluster in the right parietal superior region with positive correlation between MR and GWC (CWP = .026, cluster size of 848.6mm2), no significant results of interest where observed within this identified region.
Statistical analysis
Sample characteristics analyses
Our primary outcomes are symptom severity (ADOS CSS) at 18 and 36 months of age and discrete diagnosis at age 3 (HR-TD, HR-ASD, HR-non-TD). Mean age at scan acquisitions was 16.5 months (see Table 1). Accordingly we considered the evaluations performed at 18 months as the clinical correlate that was the closest in time to the neuroimaging data and thus refer to this as the phenotype at the time of scan acquisitions. Data derived from assessments at 36 months of age were used to test for associations between early neuroimaging parameters and later clinical outcomes.
We further subdivided the HR-ASD into late and early symptom onset (LOA and EOA). Associations between clinical outcome (HR-TD, HR-ASD and HR-non_TD) and individual characteristics that could represent potential confounding factors in GWC analysis were explored. These parameters included gender, age at scanning (which was calculated as mean age between two scans) and time interval between scans. According to the nature of the tested variables (i.e. discrete or continuous), we used either Pearson correlation, one-way ANOVA, Student T-test (or Mann-Whitney U test when non-parametric distribution of variables was found), or chi-square test.
For descriptive purposes, we tested for differences in behavioral scores (DQ, ADOS CSS at 18 and 24 months) between diagnosis groups (HR-TD, HR-ASD and HR-non-TD) using one-way ANOVA. We also tested for potential differences in behavioral scores between the two HR-ASD subgroups (EOA and LOA) using either Student T-test or Mann-Whitney U test. Statistics described in this section were performed with Prism® v.8.3.0 software with significance threshold set at alpha = 0.05. Behavioral and demographic characteristics of the sample are displayed in Table 1.
Surface-based analyses
We used the general linear model (GLM) command implemented in FreeSurfer to perform vertex-wise whole-brain surface-based analysis of GWC.
First, to determine the effect of time on GWC in typical development between the age of 12 and 24 months, we performed vertex-wise parametric comparison of ΔGWC values versus zero in our HR-TD group. We then extracted vertex-wise ΔGWC values from all significant clusters and computed an average ΔGWC value for each hemisphere. This mean hemispheric ΔGWC value was compared between right and left to test for any asymmetry in the effect of age on GWC.
Then, we fit a GLM to test whether GWC and ΔGWC at age 12-24 months are associated with discrete diagnostic outcome at age 3 (HR-ASD or HR-TD). HR-non-TD were not included in this analysis in order to limit the number of group comparisons. Also, HR-non-TD does not represent a clearly distinct group with pure developmental delay but also includes children with autistic traits without a confirmed diagnosis. As such, HR-non-TD can be considered as an intermediate group in the symptom continuum between TD and ASD.
We further tested if symptom severity at 18 and at 36 months of age were associated with GWC and ΔGWC. All participants were included in this analysis. The following GLM was conducted:
Given that GWC before the age of 24 months is influenced by age (28), age at scanning (calculated for each participant as the mean age between two scans) was regressed out in all GLM analyses. The p-value for each voxel was calculated using two-tailed testing with significance threshold set at alpha = 0.05. Cluster-wise analyses were corrected for multiple comparisons using Monte-Carlo simulation (MCS) with a significance threshold for cluster-wise p-value (CWP) of alpha = 0.05. We used cluster-wise and MCS analysis pipelines implemented in FreeSurfer (48).
The same GLM methods were utilized for analyses of CT and ΔCT values.
We then wanted to determine if potential alterations of GWC and ΔGWC found with GLM analysis were associated with the age at which the first ASD-related symptoms emerged. As a further post-hoc exploratory analysis, for each cluster exhibiting significant GWC or ΔGWC alterations, we computed the average of all vertex-wise GWC or ΔGWC values respectively across all vertices in the cluster for each participant. These individual cluster-averaged GWC values were then compared between HR-TD, EOA and LOA using an ANCOVA with age at first scan and age at second scan as regressors. Significance threshold was set at alpha = 0.05. Whenever an ANCOVA reached significance, post-hoc comparisons with multiple T-tests applying Bonferroni correction was performed.