This study is a secondary analysis of pooled data from projects aimed at identifying an autism biomarker, training social skills in adolescents with autism, and developing an early autism screening instrument based on clinical diagnostic evaluations conducted at (Redacted for Review) from 2008 to 2017. Participants were recruited via patient referrals from psychiatric clinics and the Pediatrics and Child Rehabilitation clinic at (Redacted for Review), local clinics and daycare centers, recruitment posters on online/offline bulletin boards of public institutions, and online parenting communities. Participants in each project consented to the retrospective use of their collected data. A total of 2,177 participants (mean age [standard deviation] = 79.2 [64.0] months; age range = 9–393 months; 1,600 males; 1,480 participants with autism and 697 participants without autism) were included in this study (Module-T, n = 296; Module-1, n = 654; Module-2, n = 574; Module-3, n = 411; Module-4, n = 233). Further, participants who were diagnosed as not having autism based on the clinical best-estimate diagnosis and who scored lower than 80 on either the full-scale intelligence quotient (FSIQ) or the Korean Vineland Adaptive Behavior Scales, Second Edition (K-VABS; ) were categorized as OD. Diagnostic procedures are presented in the Procedures section below. Detailed characteristics of all participants and of participants by module are included in Tables 1 and 2. Information on participant characteristics for each developmental cell of the Toddler Module, Module 1, and Module 2 is available in Supplementary Table S1. Detailed characteristics of the OD participants are available in Supplementary Table S2.
[INSERT TABLE 1 and 2 HERE]
Participants and their parents completed a battery of tests during their one-time visit, including the K-ADOS or K-ADOS-2, ADI-R, the Korean version of the Childhood Autism Rating Scale (K-CARS), K-SMS, and cognitive tests measuring FSIQ. Questionnaires, such as the Social Responsiveness Scale-2 (SRS-2), Social Communication Questionnaire (SCQ), K-CARS, and K-VABS, were mailed and filled out prior to the visit. The K-ADOS or K-ADOS-2 and ADI-R were administered by research-reliable professionals or by research assistants who worked alongside them in the same lab on a daily basis and were trained prior to the actual administration. Research assistants administered the scales only after reaching an adequate level of inter-rater reliability with the research-reliable professionals (>80%). All administrations of the K-ADOS or K-ADOS-2 and ADI-R were videotaped and double-checked by these professionals to confirm quality and reliability.
Subsequently, two board-certified psychiatrists made the best-estimate clinical diagnosis of autism (including pervasive developmental disorder and Asperger’s syndrome) or non-autism based on DSM-IV TM and DSM-V. The clinical best-estimate diagnosis was made according to the information gathered collectively from all tests administered, including the K-ADOS/K-ADOS-2, ADI-R, SCQ, SRS-2, K-CARS, K-SMS, K-VABS, and IQ assessments, as well as observed clinical impressions. The study was approved by the Institutional Review Board (IRB) of (Redacted for Review) (IRB no. B-2110-716-102).
Autism Diagnostic Observation Schedule and Autism Diagnostic Observation Schedule-2 (ADOS and ADOS-2; [4, 5])
This study used the Korean translated versions of the ADOS/ADOS-2, approved by its publisher, Western Psychological Services. Assessments conducted prior to July 2017, when the ADOS-2 was published in Korea, used the original K-ADOS. The results from the K-ADOS were rescored based on the K-ADOS-2 algorithm for this study. All modules provide two cut-off points in the diagnostic algorithms. For Modules 1 through 4, there is a higher cut-off for stringent classification (i.e., autism) and a lower cut-off for more inclusive classification (i.e., autism spectrum disorder; ASD). For Module 4, we applied the revised algorithm and cut-off points from Hus and Lord.
Similarly, the Toddler Module has a higher cut-off for stringent classification (moderate-severe concern) and a lower cut-off for more inclusive classification (mild-moderate concern). Participants who met the ASD cut-off in Modules 1 through 4 or the mild-moderate concern cut-off in the Toddler Module were categorized as having autism in the K-ADOS-2 diagnosis.
Autism Diagnostic Interview-Revised (ADI-R; )
The ADI-R is a semi-structured caregiver interview used to diagnose or evaluate the core symptoms of autism. Each item is scored and converted to a scale of 0, 1, and 2, with higher scores indicating more severe autism-related symptoms. The ADI-R includes 93 items covering four diagnostic domains: social interaction, communication, RRBs, and abnormality of development evident at or before 36 months. Each domain has its own cut-off score, and individuals must exceed all four cut-off scores to be classified as having autism. While the majority of the algorithm score consists of parents’ descriptions of a child’s behaviors between the ages of 4 and 5 years, some items ask whether the behavior has ever been present during the child’s lifetime. For children under 4 years of age, ratings on current behaviors are used. The Korean translation of the ADI-R, approved by its publisher, Western Psychological Services, was used in this study.
Social Communication Questionnaire 
The SCQ is a caregiver-report screening instrument for autism designed to evaluate an individual’s behavior in three domains: social interaction, language and communication, and RRBs. The SCQ includes 40 items, each rated as either “yes” or “no.” It consists of two forms: the Lifetime Form, which focuses on an individual’s developmental history, and the Current Form, which examines an individual’s behaviors over the past three months. The total score on the Lifetime Form is used to determine whether an individual is likely to have autism and whether a more extended diagnostic evaluation needs to be undertaken. In this study, we used a cut-off score of 10 for children under 47 months of age and 12 for children over 48 months, based on a standardization study conducted in Korea.
Social Responsiveness Scale-2 (SRS-2; )
The SRS-2 is a 65-item parent-report questionnaire that assesses the severity of autism-related symptoms on a 4-point scale, with higher total scores reflecting more severe autism symptomatology. It consists of five subscales: social awareness, social cognition, social communication, social motivation, and autistic mannerisms. The SRS-2 has been used extensively in the autism literature as a diagnostic measure and is reported to have good internal consistency as well as concurrent and discriminant validity. Chun et al. demonstrated adequate levels of sensitivity and specificity for the Korean translated version of the SRS-2. A cut-off T-score of 65 was applied regardless of gender in the preschool form of the SRS-2, and cut-off T-scores of 70 and 63 were used for female and male participants, respectively, in the school-age and adult forms, because these values are widely used across clinical settings in South Korea.
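The cut-off rule described above can be written as a small lookup. This is only an illustrative sketch: the function names and the form labels are hypothetical, and it assumes that a T-score equal to the cut-off counts as screen-positive, which the text does not state.

```python
def srs2_cutoff(form, sex=None):
    """Return the SRS-2 T-score cut-off described in the text.

    `form` is one of "preschool", "school-age", or "adult" (labels are
    hypothetical); `sex` is "female" or "male" and is ignored for the
    preschool form.
    """
    if form == "preschool":
        return 65  # same cut-off regardless of gender
    if form in ("school-age", "adult"):
        if sex == "female":
            return 70
        if sex == "male":
            return 63
        raise ValueError("sex must be 'female' or 'male' for this form")
    raise ValueError(f"unknown SRS-2 form: {form!r}")


def srs2_positive(t_score, form, sex=None):
    # Assumption: a score at the cut-off is treated as meeting it.
    return t_score >= srs2_cutoff(form, sex)
```

For example, `srs2_positive(66, "school-age", "male")` is true under this sketch, while the same score for a female participant is not.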
Korean version of the Childhood Autism Rating Scale (K-CARS; )
The CARS is a clinician-rated scale developed to screen for autism. Consisting of 15 items rating the presence and severity of symptoms associated with autism, the CARS is scored from 1 (no impairment observed or reported) to 4 (severe impairment). There is no consensus on the cut-off score of the K-CARS; Shin and Kim suggested a cut-off score of 28, while others recommend 24. Therefore, we utilized both cut-off scores in this study.
Full Scale Intelligence Quotients (FSIQ)
The following instruments were used to calculate FSIQ in this study: the Wechsler Preschool and Primary Scale of Intelligence (WPPSI) for children aged 2 years and 6 months to 6 years, the Wechsler Intelligence Scale for Children (WISC) for children aged 6–16 years, and the Wechsler Adult Intelligence Scale (WAIS) for individuals over 16 years of age. These instruments yield age-standardized scores with a mean of 100 and a standard deviation of 15.
Korean version of the Vineland Adaptive Behavior Scale, second edition (K-VABS; [31, 47])
The VABS is a parent- or caregiver-rated measure of a person’s adaptive functioning and social self-sufficiency from birth to adulthood. The VABS consists of five domains: communication, daily living skills, socialization, motor skills, and maladaptive behavior. Items are scored on a 0–2 rating scale, with a higher score representing skills used more frequently. The five domains together yield a total adaptive behavior composite score. The normative mean of the composite score is 100, with a standard deviation of 15.
Korean Vineland Social Maturity Scale (K-SMS; )
The K-SMS is a clinician-rated instrument that assesses social and adaptive maturity. Originally developed from Doll’s Vineland Social Maturity Scale, the K-SMS includes 89 items grouped by behavioral milestones that are expected at each age. It consists of eight subdomains (communication, general self-help, locomotion, occupation, self-direction, self-help eating, self-help dressing, and socialization skills) and provides a global social age and social quotient.
Initially, the K-ADOS-2, ADI-R, K-CARS, SCQ, and SRS-2 scores, age, and FSIQ of participants with autism and those without autism were compared using a set of independent-samples t-tests. Calibrated severity scores were used to compare the K-ADOS-2 scores.
To address the first aim, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and Cohen’s kappa (k) were calculated to assess agreement between the best-estimate clinical diagnosis and the K-ADOS-2 diagnosis, defined by the ASD cut-off in Modules 1–4 and the mild-moderate concern cut-off in the Toddler Module. This analysis was conducted for all modules combined, for each module individually (Module-T, 1, 2, 3, and 4), and for each developmental cell (12-20/NV21-30 and 21-30SW in the Toddler Module, NW and SW in Module 1, and under and over 5 years of age in Module 2). We also computed the area under the receiver operating characteristic (ROC) curve (AUC) for each item, by developmental cell, to explore whether every item included in the algorithm has sufficient diagnostic accuracy.
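For readers who want to see how these agreement statistics relate to a 2×2 classification table, the following is a minimal sketch. It is not the procedure run in Excel or SPSS for this study; the function names are hypothetical, the counts in the usage line are invented for illustration and are not study data, and the AUC is computed with the pairwise (Mann-Whitney) formulation, which is one of several equivalent ways to obtain it.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Agreement statistics from a 2x2 table.

    tp/fn: participants with a best-estimate autism diagnosis who do / do not
    meet the instrument cut-off; fp/tn: participants without autism who
    do / do not meet the cut-off.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # raw agreement
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


def auc(scores_autism, scores_non_autism):
    """AUC as the probability that a randomly chosen participant with autism
    scores higher than one without autism (ties count as one half)."""
    pairs = len(scores_autism) * len(scores_non_autism)
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in scores_autism
        for b in scores_non_autism
    )
    return wins / pairs


# Hypothetical counts, for illustration only.
result = diagnostic_accuracy(tp=90, fp=10, fn=10, tn=90)
```

With these invented counts, sensitivity, specificity, PPV, and NPV are all 0.90 and kappa is 0.80, which shows how kappa discounts the 50% agreement expected by chance in a balanced sample.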
To investigate the second aim, we computed Pearson’s r correlation coefficients between the total scores of the K-ADOS-2 and those of existing autism diagnostic instruments (i.e., ADI-R, K-CARS, SCQ, and SRS-2) for all modules combined, each module individually, and each developmental cell. Additionally, k values were calculated between the diagnosis based on the K-ADOS-2 ASD cut-off and the diagnosis based on the existing autism diagnostic instruments. The k values were interpreted based on McHugh’s criteria (0–.20, none; .21–.39, minimal; .40–.59, weak; .60–.79, moderate; .80–.90, strong; above .90, almost perfect). For the third aim, Cronbach’s α values for the algorithm items, along with the values obtained after each item was removed, were computed to examine the internal consistency of each developmental cell and module.
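As an illustration of the two computations just described, the sketch below maps a k value onto the interpretation bands quoted above and computes Cronbach’s α together with the values obtained when each item is removed. The function names are hypothetical, the data layout (one list of item scores per participant, no missing values) is an assumption, and values falling between the quoted band edges (for example, between .20 and .21) are assigned to the higher band by choice.

```python
from statistics import variance


def interpret_kappa(k):
    """Label a kappa value using the bands quoted in the text."""
    if k > 0.90:
        return "almost perfect"
    if k >= 0.80:
        return "strong"
    if k >= 0.60:
        return "moderate"
    if k >= 0.40:
        return "weak"
    if k > 0.20:
        return "minimal"
    return "none"


def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`, a list of per-participant item-score lists."""
    k = len(rows[0])
    # Sample variance of each item across participants.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item left out in turn."""
    k = len(rows[0])
    return [
        cronbach_alpha([list(row[:j]) + list(row[j + 1:]) for row in rows])
        for j in range(k)
    ]
```

An item whose removal raises α noticeably above the full-scale value is a candidate for closer inspection, which is the usual reason for reporting both figures.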
Finally, we calculated the sensitivity, specificity, PPV, NPV, and k values to examine how accurately the K-ADOS-2 ASD cut-off can distinguish autism from OD for all modules combined, each module individually, and each developmental cell. We did not compare the diagnostic validity between OD and the remaining participants without autism (i.e., participants who were not diagnosed with autism and did not have FSIQ or VABS scores lower than 80) because this sample included a few participants for whom we did not have all FSIQ and VABS scores and who therefore might have been categorized as OD if all relevant information had been available.
All analyses except for the calculation of Cronbach’s α values were repeated using the Autism cut-off in Modules 1–4 and moderate-severe concern in the Toddler Module, but we primarily relied on the results from the ASD cut-off when drawing conclusions about validity. All statistical analyses were performed using Excel and SPSS Statistics (version 23.0; IBM Corp., Armonk, NY, USA).
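The grouping rule behind this analysis, and the missing-data issue it raises, can be sketched as follows. The function name, the group labels, and the use of `None` for an unavailable score are all hypothetical choices made for illustration; the threshold of 80 and the "either score" rule come from the text.

```python
def categorize(best_estimate_autism, fsiq=None, vabs=None, threshold=80):
    """Assign a participant to one of three groups.

    `fsiq` and `vabs` may be None when the score was not collected.
    """
    if best_estimate_autism:
        return "autism"
    # Only scores that were actually collected can trigger the OD rule.
    available = [score for score in (fsiq, vabs) if score is not None]
    if any(score < threshold for score in available):
        return "OD"
    # Participants with a missing score land here even though the missing
    # score could have been below the threshold.
    return "non-autism, not OD"
```

Under this sketch, `categorize(False, fsiq=95)` returns "non-autism, not OD" although the unobserved adaptive behavior score might have placed the participant in the OD group, which is the reason given above for not analyzing that remaining group separately.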