We used an MRI dataset from participants recruited through the UC Davis MIND Institute between 2009 and 2011. The recruitment process as well as clinical and imaging procedures have been described in detail previously (16).
Participants
Between 2009 and 2011, participants from a clinical longitudinal cohort (29) were asked to also take part in an MRI acquisition protocol. Invitation was made through phone screening and led to the recruitment of 64 participants. For these longitudinal analyses, we first selected the 41 participants who were categorized as HR (13 females). In this study, HR was defined as having an older sibling with a confirmed diagnosis of ASD. Having a sample exclusively constituted of HR participants allows to study the continuum of symptom severity and the emergence of ASD amongst participants who share a similar risk to develop the disorder (10). For our analysis, we included only the 22 HR participants who underwent 2 MRI scans (first at 12-15 months and second at 18-24 months of age). From this longitudinal sample, 2 children were excluded because the image quality was too bad to be processed by the FreeSurfer automated pipeline described below. This resulted in a final sample of 20 HR children (5 females).
Demographic characteristics of the sample are displayed in Table 1. We divided our sample into three groups according to their diagnostic outcome at age 3: HR with typical development (HR-TD, n=8, 3 females), HR with ASD (HR-ASD, n=9, 1 female) and HR with atypical development (HR-non-TD, n=3, 1 female). One HR-TD child did not undergo the 18-month clinical assessment but was not excluded from the study. We further separated HR-ASD participants into two subgroups according to age of first established diagnosis. HR-ASD who were diagnosed at 18 months or before were classified as early onset autism (EOA; n=4, 1 female). HR-ASD participants whose diagnosis was established later than 18 months of age were labelled as later onset autism (LOA; n=5, 0 female).
One could notice that the proportion of ASD in our HR sample (45%) is greater than the prevalence of ~20% which is reported in the literature (10). Nevertheless, one must take into account the fact that our population is constituted by a majority of male HR in which the prevalence of ASD has been reported to be around 32% (30). Another explanation could rely in the fact that parents who were more worried about their child’s development were more motivated to participate to the scan acquisition, thus leading to a recruitment bias.
Behavioral measures and outcome classification
Clinical assessments were conducted with each participant at 6, 12, 18, 24 and 36 months.
The Mullen Scales of Early Learning (MSEL) was used to assess development in cognitive (expressive and receptive language, visual reception) and motor (fine and gross) areas (31). Developmental quotient scores (DQ) were used instead of standard scores in order to limit truncation of very low performing participants (32). Individual DQs were obtained by dividing age-equivalent developmental age output from MSEL by chronological age and multiplying by 100.
ASD-related symptom severity was quantified with the Autism Diagnostic Observation Schedule (ADOS) (33). The ADOS is a semi-structured observational evaluation with cut-offs to guide diagnostic decisions, appropriate for ambulatory children of 12 months and older. Either module 1 (intended for non-verbal children or those using only isolated words) or module 2 (intended for children with phrase speech) was conducted at ages 18, 24 and 36 months. To allow comparison of ADOS total scores across ages and modules, the calibrated severity score (CSS) was used. ADOS CSS ranges from 1 to 10 (with 10 being the most severe) (34)(35).
At each visit from 18 months and later, ASD diagnosis outcome was established by a licensed clinician according to ADOS diagnosis cut-offs and DSM-IV criteria (36). Children who did not meet the criteria for a diagnosis of ASD were categorized as having typical development (TD) or non-typical development (non-TD). TD was defined as having an ADOS CSS equal to or less than 2, a total DQ of at least 85, no DQ subtest less than 80 and no more than one DQ subtest less than 85. If one or more of these criteria were not met, participants without ASD were classified as non-TD.
Image acquisitions
All children were scanned during natural sleep following previously published procedures (37), at the UC Davis Imaging Research Centre on a 3 Tesla Siemens TIM Trio MRI system with an eight-channel head coil. Structural T1-weighted 3D MP-RAGE images were acquired with 1 mm3 isometric voxels, repetition time=3200 ms, echo time=5.08 ms, field of view=176 mm and 192 sagittal slices. The success rate of these MRI acquisitions was 78%. A 3D image distortion map (Image Owl) was acquired at the end of each scan with a calibration phantom (Phantom Laboratory, Inc.). Distortion correction was carried out as described in (38).
Participants had a first MRI scan at 6-9 months of age which was not evaluated in the present analyses because of the difficulty to obtain accurate 3D white matter surface reconstructions at this age. Accordingly, we utilized the participant’s second scan, acquired between 12 and 15 months of age, and third scan, acquired between 18 and 24 months of age.
Image processing and quality control
We used the automated pipeline provided by FreeSurfer v6.0 to process the T1-weighted cerebral MRIs (http://surfer.nmr.mgh.harvard.edu/). The successive steps of this automated procedure are described in detail elsewhere (39)(40)(41)(42). Briefly, non-cerebral tissues are removed, signal intensity is normalized, and the image is segmented using a connected components algorithm. Then, a single filled volume of white matter is generated for each hemisphere. For each volume of white matter, a triangular surface tessellation is created by fitting a deformable template. Through deformation of this tessellated surface, a cortical mesh is created that defines the boundary between white and cortical gray matter (called the outer white matter surface) as well as the boundary between the gray matter and the extra-axial fluid (called the pial surface). This surface deformation process is calculated through an energy minimization function that determines the sharpest shift in intensity between voxels to define the transition between tissue categories. This process is independent of absolute intensity values and can delineate boundaries at a subvoxel resolution.
A trained operator (M.G.), blind to any clinical outcome, visually inspected images obtained with the described automated pipeline. First, he attributed a subjective score going from one to ten to every image relating to the level of motion artifact. For every participant, the average between the scores of the two scans was computed. There was no significant association between this averaged score and any of the primary clinical outcome described below. Second, he implemented manual corrections when required following recommended procedures described in the FreeSurfer manual (http://freesurfer.net/fswiki/). All final cortical surfaces were visually validated by a second trained independent operator (M.S.) who was also blind to all clinical outcomes.
Gray-white matter intensity contrast
We first sampled white matter intensity at each vertex v at 1mm beneath the white matter (WM) outer surface (Fig.1). A distance of 1mm was chosen to facilitate comparisons with previous literature since it is the most commonly used value in GWC studies, including all existing studies exploring GWC in ASD (25) (26). Gray matter (GM) intensity value was sampled at each vertex at a distance of 30% of cortex width (defined as the distance between outer white matter surface and pial surface) starting from the white matter outer surface. The value of 30% was set because it is the most commonly used in the GWC literature (25) and is the default value provided by FreeSurfer. In addition, a previous study of ASD found diagnostic differences to be greatest when GM intensity sampled between 30 and 40% was used to compute GWC (26). GWC at each vertex v was computed by dividing the difference between GM and WM intensities by the mean between GM and WM intensities and multiplying by 100 to get a ratio expressed in [%]. This was performed for each individual scan at time T.
GWC values were then registered on an average template provided by FreeSurfer to allow vertex-wise inter-participants comparison. During this process, GWC values were smoothed with a full-width at half-maximum (FWHM) surface-based Gaussian kernel of 10 mm.
Then, for each participant two different GWC longitudinal values were computed. Longitudinal neuroimaging pipelines allow many advantages over cross-sectional designs, including reduction of within-participant variability and the possibility to analyze the effect of time on the variable of interest (43). First, we estimated the individual average GWCv values between 12 and 24 months of age by computing the mean of GWCv values between the two scans.
Where GWCv1 is GWCv at 12-15 months and GWCv2 is GWCv at 18-24 month.
Second, we computed the individual GWC rate of change between two scans (ΔGWC) at each vertex. ΔGWC represents the effect of time on GWC between the age of 12 and 24 months. It was computed through the symmetrical percentage of change (SPC) formula (43). SPC consists of calculating for each participant at each vertex v the GWC difference between two scans divided by the age difference between the two scans, giving a rate in [%/month]. This rate is divided by mean GWC at each vertex v and multiplied by 100, giving a result expressed in [%]. One advantage of using SPC value to express rate of change is that it is symmetrical (i.e. not more dependent on values of one of the two scans). SPC expresses the rate at which GWC changes in each vertex between two scans relative to mean GWC.
Where age1 is participant’s age at 12-15 months scan and age2 is their age at 18-24 months scan.
Cortical thickness
Cortical thickness (CT) alterations have been found to influence GWC values (44). CT is defined by the distance in mm between WM outer surface and pial surface and is automatically computed by the standard FreeSurfer processing pipeline (42). To control for this possible confound, we sampled CT at each vertex v. We then computed the individual average CTv across the two scans and the CTv rate of change (ΔCTv) with the same formulas we described for longitudinal GWC parameters. Spatial overlap between significant effects on GWC and CT were explored.
Statistical analysis
Sample characteristics analyses
Our primary outcomes are symptom severity (ADOS CSS) at 18 and 36 months of age and discrete diagnosis at age 3 (HR-TD, HR-ASD, HR-non-TD). The mean of the age at scan acquisitions was 16.5 months (see Table 1). We thus considered the evaluations performed at 18 months as the clinical correlate that was the closest in time to the neuroimaging data and will be referred to as the phenotype at the time of scan acquisitions. The data coming from the assessments at 36 months of age were used to test for associations between early neuroimaging parameters and later clinical outcome.
We further subdivided the HR-ASD into late and early symptom onset (LOA and EOA). Associations between clinical outcome and individual characteristics that could represent potential confounding factors in GWC analysis were explored. These parameters included gender, age at scanning (which was calculated as mean age between two scans) and time interval between scans. According to the nature of the tested variables (i.e. discrete or continuous), we used either Pearson correlation or one-way ANOVA or Student T-test (or Mann-Whitney U test when non-parametric distribution of variables was found) or chi-square test.
For descriptive purposes, we tested for differences in behavioral scores (DQ, ADOS CSS at 18 and 24 months) between diagnosis groups (HR-TD, HR-ASD and HR-non-TD) using one-way ANOVA. We also tested for potential differences in behavioral scores between the two HR-ASD subgroups (EOA and LOA) using either Student T-test or Mann-Whitney U test. Statistics described in this section were performed with Prism® v.8.3.0 software with significance threshold set at alpha = 0.05.
Surface-based analyses
We used the general linear model (GLM) command implemented in FreeSurfer to perform vertex-wise whole-brain surface-based analysis of GWC.
First, to determine the effect of time on GWC in typical development between the age of 12 and 24 months, we performed vertex-wise parametric comparison of ΔGWC values versus zero in our HR-TD group. We then extracted vertex-wise ΔGWC values from all significant clusters and computed an average ΔGWC value for each hemisphere. This mean hemispheric ΔGWC value was compared between right and left to test for any asymmetry in the effect of age on GWC.
Then, we fit a GLM to test whether GWC and ΔGWC at age 12-24 months are associated with discrete diagnostic outcome at age 3 (HR-ASD-or HR-TD):
We further tested if symptom severity at 18 and at 36 months of age were associated with GWC and ΔGWC. The following GLM was conducted:
Given that GWC before the age of 24 months is influenced by age (28), age at scanning (calculated for each participant as the mean age between two scans) was regressed out in all GLM analyses. The p-value for each voxel was calculated using two-tailed testing with significance threshold set at alpha = 0.05. Cluster-wise analyses were corrected for multiple comparisons using Monte-Carlo simulation (MCS) with a significance threshold for cluster-wise p-value (CWP) of alpha = 0.05. We used cluster-wise and MCS analysis pipelines implemented in FreeSurfer (45).
We then wanted to determine if potential alterations of GWC and ΔGWC found with GLM analysis were associated with the age at which the first ASD-related symptoms emerged. As a post-hoc exploratory analysis, for each cluster exhibiting significant GWC or ΔGWC alterations, we computed the average of all vertex-wise GWC or ΔGWC values respectively across all vertices in the cluster for each participant. These individual cluster-averaged GWC values were then compared between HR-TD participants and each of the HR-ASD subgroups (EOA and LOA) using Student T-test or Mann-Whitney U. Significance threshold was set at alpha = 0.05.
The same GLM methods described here were also utilized for analyses of CT and ΔCT values.