Human study design
Procedural standards for participant recruitment
Subjects diagnosed with ASD according to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), as well as TD peers, were continuously enrolled through ARK Autism & Rehabilitation Institute, Shanxi, China. The severity of ASD was evaluated with the Childhood Autism Rating Scale (CARS) by two experienced pediatricians. Information on all participants, including age, gender, dietary habits, mode of delivery, use of drugs or healthcare products, and common pathological conditions (neurologic, psychiatric and gastrointestinal diseases), was gathered via questionnaire. To control for biomedical and environmental confounders that might affect the subsequent analyses, the following exclusion criteria were applied: 1) unbalanced diet (completely refusing or strongly favoring specific staple foods); 2) diagnosis of severe ASD; 3) other neurologic or psychiatric disorders, including epilepsy, schizophrenia, depression and attention-deficit/hyperactivity disorder (ADHD); 4) intestinal infectious diseases; 5) antibiotic or probiotic administration within one month before sampling. Finally, two cohorts, namely a main cohort (ASD vs. TD = 50 vs. 55) and an independent cohort (6 vs. 12), were recruited for autistic GM profiling and validation.
Faecal sampling principles and specimen consistency
To ensure sample quality, faecal sampling was conducted by the guardians of the research participants, who were trained in basic aseptic technique, recognition of stool consistency, and the sampling procedure. First, assisted by their guardians, all participants were required to defecate on a prepared cellulose-core diaper to separate faeces from urine. The consistency of each stool was then evaluated according to the Bristol Stool Chart, and only corn-on-the-cob-like (type 3) or sausage-like (type 4) faeces were sampled as normal stool. To avoid possible air-borne contamination, the surface of the faeces was not sampled; instead, a sterile sampling scoop was used to withdraw the inner core of the stool. Finally, about 2 g of specimen was collected in a sterile collecting tube. Samples were immediately refrigerated for 30 min and then handed over to one of our staff for snap freezing and shipping on dry ice. For biological replicates, faecal sampling from each participant was performed two to three times at intervals of at least 48 hours. Each faecal sample was aliquoted and stored at -80 °C before being subjected to Ultra Performance Liquid Chromatography with Tandem Mass Spectrometry (UPLC-MS/MS), Gas Chromatography-Mass Spectrometry (GC-MS), 16S rRNA gene sequencing, metagenomics or quantitative PCR.
Sample grouping for multi-omics investigation
Each round of sampling from the selected participants yielded a collection of specimens referred to as a sample set. To avoid analytic bias resulting from random events during sampling, such as emotional fluctuation, subtle changes in daily diet, and weather variations, different sample sets from the main cohort were used for the multi-omics studies: the first sample set was used to analyze metabolic and structural features of the GM and to screen for distinct metabolites or taxa; the second and third sample sets were used for functional analysis of changes in the autistic GM. The samples from the independent cohort were used to validate key findings.
Targeted metabolomics
Neurotransmitters, BAs, and GABA metabolism-specific metabolites were measured by UPLC-MS/MS, whereas SCFAs were measured by GC-MS.
Sample preparation & extraction
To investigate the absolute abundance of different types of metabolites, faecal samples were prepared differently. For neurotransmitters, samples were prepared in pre-cooled acetonitrile with 1% (v/v) formic acid (FA); for BAs, samples were added to pre-cooled methanol; for specific metabolites in GABA metabolism, samples were added to a pre-cooled solution of acetonitrile and methanol in water (2:2:1, v/v/v). After vortexing, the sample homogenate was incubated for 20 min at -20 °C to precipitate proteins and then centrifuged at 14,000 g for 15 min at 4 °C. The supernatants were collected and dried under vacuum. Next, the dried extracts were redissolved, respectively, in acetonitrile in water (1:1, v/v), methanol in water (1:1, v/v), and acetonitrile and methanol in water (2:2:1, v/v/v), and centrifuged as described above. The supernatants were collected and subjected to UPLC-MS/MS. Faecal samples for assessing SCFAs were prepared in 15% (v/v) phosphoric acid and subjected to GC-MS.
UPLC-MS/MS procedure
For neurotransmitter measurement, an Agilent 1290 Infinity UPLC system (Agilent, USA) equipped with an Acquity UPLC BEH C18 column (1.7 µm × 2.1 mm × 100 mm, Waters, Canada) was used to analyze the samples. The samples were placed in an auto-sampler (column temperature, 45 °C; flow rate, 300 μL/min). The mobile phase consisted of 0.1% (v/v) ammonium formate (liquid A) and acetonitrile with 0.1% (v/v) FA (liquid B). The gradient-elution program was set as follows: starting from 90% B at 0 min, eluent B was decreased linearly to 40% over 18 min; eluent B was then returned to 90% within 1 s and maintained for 5 min. GABA and glutamate (Glu) were used as the standards for chromatographic retention time correction. Subsequently, an electrospray ionization (ESI) 5500 triple quadrupole-linear ion trap (QTRAP) mass spectrometer (AB SCIEX, USA) was used for mass spectrometry (MS) analysis in positive ion mode (ESI+). The ESI+ source conditions were as follows: ion spray voltage floating (ISVF), 5000 V; ion source gas1 (Gas1), 60; ion source gas2 (Gas2), 60; curtain gas (CUR), 30; source temperature, 450 °C.
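The gradient program described above for neurotransmitters can be read as a piecewise-linear function of time. The following minimal sketch interpolates the percentage of eluent B between the stated breakpoints. The helper name `percent_b` and the breakpoint table are illustrative assumptions and are not taken from the instrument method file.

```python
from bisect import bisect_right

# Breakpoints (time in min, % eluent B) transcribed from the neurotransmitter
# gradient in the text; the 1 s return step is written as 1/60 min.
NEUROTRANSMITTER_GRADIENT = [
    (0.0, 90.0),
    (18.0, 40.0),
    (18.0 + 1.0 / 60.0, 90.0),
    (18.0 + 1.0 / 60.0 + 5.0, 90.0),
]


def percent_b(t_min, program=NEUROTRANSMITTER_GRADIENT):
    """Return % eluent B at time t_min by linear interpolation (hypothetical helper)."""
    times = [t for t, _ in program]
    # Clamp to the first and last breakpoints outside the programmed window.
    if t_min <= times[0]:
        return program[0][1]
    if t_min >= times[-1]:
        return program[-1][1]
    i = bisect_right(times, t_min)
    (t0, b0), (t1, b1) = program[i - 1], program[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

For example, `percent_b(9.0)` gives the composition halfway through the 18-min linear segment. The same function could be applied to the other gradients by passing a different breakpoint list.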
For bile acid measurement, a Waters Acquity UPLC I-Class system (Waters, USA) equipped with the Acquity UPLC BEH C18 column was used to analyze the samples. Deuterated bile acids were added to the samples as internal standards (100 ppm; Thermo Fisher Scientific, USA) for chromatographic retention time correction, and the samples were placed in an auto-sampler (column temperature, 45 °C; flow rate, 300 μL/min). The mobile phases were pure water with 0.1% (v/v) FA (liquid A) and methanol (liquid B). The gradient-elution program was set as follows: starting from 60% B at 0 min, eluent B was increased linearly to 65% over 6 min and further to 80% within 5 s; then, eluent B was returned to 90% within 1 s and maintained for 9 min. Subsequently, the 5500 QTRAP mass spectrometer was used for MS analysis in negative ion mode (ESI-). The ESI- source conditions were as follows: ISVF, -4500 V; Gas1, 55; Gas2, 55; CUR, 40; source temperature, 550 °C.
To measure GABA metabolism-specific metabolites, the Waters UPLC system was used with the same auto-sampler and column conditions as described above for bile acid measurement. A mixture of relevant compounds was added to the samples as internal standards (100 ppm; Shanghai Applied Protein Technology, China) for chromatographic retention time correction. The mobile phases were pure water with 1.2% (v/v) ammonium (liquid A) and methanol with 0.2% (v/v) FA (liquid B). The gradient-elution program was set as follows: starting from 75% B at 0 min, eluent B was first decreased linearly to 62% over 10 min and further to 40% over the next 5 min; then, eluent B was returned to 75% within 30 s and held for 17 min. Subsequently, the 5500 QTRAP mass spectrometer was used for MS analysis in both ESI+ and ESI- modes as described above.
Multiple reaction monitoring (MRM) was used for acquisition, detection and quantification of the metabolites in this study. MultiQuant software (version 3.0.2) was used to extract and correct the peak area and retention time of each chromatogram, and the relative content of the corresponding metabolite was represented as the area of each peak.
GC-MS procedure
For SCFAs, an Agilent 6890N/5975B GC-MS spectrometer (Agilent, USA) was used. 4-Methylvaleric acid was added to the samples as the internal standard (83 ppm; Thermo Fisher Scientific), and the samples were placed in an automatic sampler (carrier gas, helium; flow rate, 1.0 mL/min; injection port temperature, 250 °C; split injection, split ratio 10:1; solvent delay, 2.2 min). An HP-INNOWAX capillary GC column (30 m × 0.25 mm × 0.25 µm, Agilent) was used to separate the samples. The temperature program was as follows: the initial temperature of the column oven was set at 90 °C and then increased to 120 °C at 10 °C/min, to 150 °C at 5 °C/min, and finally to 250 °C at 25 °C/min, where it was held for 2 min. The MS conditions were set as follows: electron bombardment ionization source; ion source temperature, 230 °C; quadrupole temperature, 150 °C; electron energy, 70 eV. Selected ion monitoring (SIM) mode was used to detect SCFAs. The MSD ChemStation software (version 2.0) was used to extract and correct the peak area and retention time of each chromatogram, representing the relative content and identification of SCFAs, respectively.
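As a quick consistency check on the oven temperature program described above, the sketch below adds up the ramp durations and the final hold. It assumes no hold at the initial temperature, since none is stated in the text; the function and constant names are illustrative only.

```python
# Oven program transcribed from the text: each ramp is
# (target temperature in °C, ramp rate in °C/min).
INITIAL_TEMP_C = 90.0
RAMPS = [(120.0, 10.0), (150.0, 5.0), (250.0, 25.0)]
FINAL_HOLD_MIN = 2.0  # hold at the final temperature


def oven_run_time(initial=INITIAL_TEMP_C, ramps=RAMPS, final_hold=FINAL_HOLD_MIN):
    """Return the total oven program time in minutes (assumes no initial hold)."""
    total = 0.0
    temp = initial
    for target, rate in ramps:
        # Duration of one ramp = temperature difference / ramp rate.
        total += (target - temp) / rate
        temp = target
    return total + final_hold
```

Under these assumptions the three ramps take 3, 6 and 4 min, so the program lasts 15 min including the 2-min hold.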
Genomic analysis
Extraction of Genomic DNA
Bacterial DNA was isolated from faecal samples using a QIAamp DNA Stool Mini Kit (QIAGEN, Germany) following the manufacturer's instructions. Genomic DNA was checked by 1% (w/v) agarose gel electrophoresis.
Bacterial 16S rRNA gene sequencing and annotation
The barcoded primers used to amplify the V3-V4 region of the 16S rRNA gene were 319F/806R [27]. The sequencing library was generated using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA) following the manufacturer's instructions. The library was sequenced on an Illumina HiSeq2500 platform (Illumina, USA) and paired-end reads were generated. Paired-end reads were merged into consensus fragments using FLASH (v1.2.11) [28] and then assigned to each sample according to the unique barcodes. Clustering of operational taxonomic units (OTUs) was performed in Usearch (version 10.0) using the UPARSE-OTU and UPARSE-OTU ref algorithms [29]. Sequences with ≥97% similarity were assigned to the same OTU. The Ribosomal Database Project (RDP) classifier was used for taxonomic annotation of OTUs [30].
Metagenome sequencing and annotation
The sequencing library was generated using the TruSeq™ DNA Sample Prep Kit (Illumina) following the manufacturer's instructions. Paired-end sequencing was performed on the Illumina HiSeq2500 platform. Host sequences were filtered out using BWA (http://bio-bwa.sourceforge.net) and the remaining reads were assembled using Megahit (https://github.com/voutcn/megahit). Annotations of taxonomy, functional pathways and bacterial virulence factors were performed with BLASTP (version 2.2.31+) against the Non-Redundant Protein Sequence Database (https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/), the Kyoto Encyclopedia of Genes and Genomes (KEGG) (http://www.genome.jp/kegg/) and the Virulence Factor Database (VFDB) (http://www.mgc.ac.cn/), respectively.
Bacterial profiling
QIIME (v1.9.1) was used to analyze microbial α-diversity and β-diversity. For α-diversity, the Chao index and the Shannon index were used to present microbial richness and diversity, respectively, in each sample. For β-diversity, principal coordinates analysis (PCoA) was used to visualize dissimilarities in GM composition among samples. Linear discriminant analysis effect size (LEfSe) was used to identify differential taxa between groups [31].
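To make the two α-diversity indices named above concrete, the following minimal sketch computes them from a vector of OTU counts for one sample. It is not QIIME's code: the bias-corrected form of Chao1 and the base-2 logarithm for the Shannon index are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def chao1_bias_corrected(counts):
    """Richness estimate: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs observed exactly once and twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def shannon(counts, base=2.0):
    """Shannon diversity H = -sum(p_i * log(p_i)) over OTUs with non-zero counts."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p, base)
    return h
```

Chao1 grows with the number of rare OTUs and therefore reflects richness, whereas the Shannon index also depends on how evenly reads are distributed across OTUs.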
Quantification of glutamate decarboxylase in genera Escherichia and Bacteroides
Primers were designed against Escherichia- or Bacteroides-specific glutamate decarboxylase (GAD) DNA sequences from the National Center for Biotechnology Information (NCBI). Two orthologs of gad in Escherichia were detected with Escherichia-gadA-qF/qR and Escherichia-gadB-qF/qR; Bacteroides-specific gad was detected with the primer pair Bacteroides-gad-qF/qR (Table S6). The reaction mixture for amplification consisted of 2 µl of Roche Fast Start LightCycler Mastermix, forward and reverse primers (0.5 mM each), 3.2 mM MgCl2 and nuclease-free water to a final volume of 15 µl. Each amplification cycle consisted of incubation at 95 °C for 30 s, 57 °C for 30 s, and 72 °C for 30 s. The cycle threshold was measured and the target gene concentration was analysed. Generation of a standard curve, determination of the qPCR efficiency, and calculation of the copy number were carried out with 7500 Fast System SDS v1.4 software (Thermo Fisher Scientific).
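The standard-curve calculation mentioned above is commonly done by regressing the cycle threshold (Ct) on log10 of the known copy number. The sketch below illustrates that general approach; it is not the algorithm of the SDS software, and all function names are hypothetical.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency E = 10^(-1/slope) - 1; E = 1.0 means doubling every cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the copy number of an unknown sample."""
    return 10 ** ((ct - intercept) / slope)
```

A slope near -3.32 corresponds to an efficiency of about 100%. Unknown samples are quantified only within the Ct range covered by the standards.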
Animal study design
Male C57BL/6J mice were housed in a pathogen-free facility under a 12-h light/12-h dark cycle. Mice were randomly divided into an experimental group and a control group and then housed separately. At postnatal day (P) 21, the experimental group was challenged with gut commensal E. coli (CICC20658, China Center of Industrial Culture Collection, China) by daily gavage at a dose of 10^8 per mouse for 5 consecutive days. Meanwhile, phosphate buffered saline (PBS) was used as placebo in the control group. Subsequently, mice were housed for another 7 days to allow E. coli colonization. At P32, faecal samples were collected from the mice and all mice were subjected to behavioral tests.
The principles for mouse faecal sampling and storage were similar to those for the human study. Each mouse was placed in an individual sterilized box for defecation. During sampling, any urine was wiped off immediately with an antiseptic swab. Stool consistency was evaluated as previously described [32]. Accordingly, hard-formed stool from each mouse was considered normal and was transferred into a tube with sterilized tweezers. Soft-formed, unformed or urine-mixed samples were discarded.
Quantification of E. coli and Escherichia-specific glutamate decarboxylase
Primers against the 16S rDNA sequence of gut commensal E. coli were designed as previously described [33]. The primers for Escherichia-specific gadA and gadB from the human study were reused in the animal study. The same qPCR procedure was performed as described above.
Behavioral tests on mice
Three-chamber test Social preference and social recognition of mice were assessed in a three-chamber apparatus (60 × 40 × 20 cm, L × W × H) as previously described [34, 35]. Briefly, the apparatus was divided into three interconnected chambers; the left and right chambers each contained one small cage, while the middle chamber was empty. For habituation, the test mouse was first placed in the apparatus for 10 min. To evaluate social preference, during the second 10 min the test mouse could either interact with an age- and sex-matched stranger mouse (Stranger 1) placed in the cage of the left chamber or stay near the empty cage in the right chamber. To evaluate social recognition, a second stranger mouse (Stranger 2) was introduced into the previously empty cage of the right chamber during the third 10 min. The cumulative time that the test mouse spent interacting with the empty cage, Stranger 1 or Stranger 2 was recorded by EthoVision XT software (Noldus Information Technology, Leesburg, USA).
Marble-burying test Repetitive behavior of mice was assessed in a mouse cage (42 × 24 × 12 cm, L × W × H) lined with 5 cm-thick corncob bedding as previously described [36]. Briefly, 20 glass beads (15 mm in diameter) were placed in the cage and arranged evenly in five rows (4 beads in each row). Then, the test mouse was placed in the cage for 30 min. The number of buried glass beads (more than 50% of the volume buried) was counted.
Open field test Grooming, voluntary movement and anxiety behavior of mice were assessed in an open box (40 × 40 × 40 cm, L × W × H) as previously described [35, 37]. Briefly, the open box was divided equally into 16 smaller grids and the central 4 grids were set as the central area (20 × 20 cm). Then, the test mouse was placed in the box for 10 min. The grooming time of each mouse was recorded manually. The speed, travelling distance and time that each mouse spent in the central area were calculated by EthoVision XT software.
Novel object recognition Recognition memory of mice was assessed in a box (40 × 40 × 40 cm, L × W × H) as previously described [35, 38]. After 10-min habituation in the box, the test mouse was exposed to two identical objects for another 10 min. Then, one object was replaced with a novel object and the mouse was subsequently allowed to explore the objects for 10 min. The time spent sniffing within 2 cm of each object or directly touching it was recorded by EthoVision XT software.
Elevated plus maze The anxiety behavior of mice was assessed on a platform elevated 1 m above the floor and consisting of 4 arms (two open arms and two closed arms crossed together) as previously described [35]. The test mouse was initially placed in the central area and its movement was recorded for 5 min. The cumulative time that the mouse spent in the open arms and closed arms was calculated by Noldus EthoVision XT10 software.
Reciprocal Social Interaction As previously reported [39], the test mouse was placed in a new cage and exposed to an age- and sex-matched stranger mouse for 10 min. The time of social interactions between the two mice (e.g. close following, touching, nose-to-nose sniffing, nose-to-anus sniffing, and crawling over/under each other) was calculated.
Statistical analyses
Statistical analyses were performed using R statistical software (v3.6.3) (www.R-project.org/) and SPSS (version 22). Metabolite levels (including metabolite ratios), gene abundance, and pathway abundance were normalized as "value/SD" before further statistical analyses. The Wilcoxon rank-sum test and the chi-square test were used to compare continuous variables and the frequencies of nominal variables, respectively, between ASD and TD, with p < 0.05 as significant. The p-values obtained for pathways and genes were adjusted by the Benjamini-Hochberg procedure, with false discovery rate (FDR) < 0.1 as significant. Partial least-squares discriminant analysis (PLS-DA) was used to identify potential biomarkers for ASD among metabolites, with a variable importance for the projection (VIP) value > 1 as significant. Permutational multivariate analysis of variance (PERMANOVA) was used to test for group differences in the PLS-DA and PCoA plots, with p < 0.05 as significant. Receiver operating characteristic (ROC) analysis was used to test the performance of potential markers for ASD. Pearson correlation analysis and linear regression were performed to evaluate the correlations among GM abundance, metabolite levels, pathway abundance, and ASD rating score, with the coefficient r as the indicator of relationship and p < 0.05 as significant.
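Two of the steps above, the "value/SD" normalization and the Benjamini-Hochberg adjustment, can be sketched in a few lines. The analyses themselves were run in R and SPSS, so this Python version is only illustrative; the use of the sample standard deviation and the function names are assumptions.

```python
from statistics import stdev


def scale_by_sd(values):
    """Divide each value by the sample standard deviation of the variable."""
    sd = stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [v / sd for v in values]


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With the threshold stated in the text, a pathway or gene would be called significant when its adjusted value is below 0.1.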