Young adults aged 18 to 24 years, both ages included, were recruited from the Department of Unipolar Depression at Aalborg University Hospital, Aalborg and from the Psychiatric Department, Horsens Regional Hospital, Horsens, Denmark between December 22nd 2017 – March 13th 2020. Patients were screened and diagnosed according to the 10th revision of the International Classification of Diseases (ICD-10) [47] criteria for depression (henceforth referred to as the MDD group) by a psychiatrist. Exclusion criteria were as follows: previous or current medical antidepressant treatment; psychiatric disorders other than MDD; neurological disorders; gastrointestinal disorders; endocrine, nutritional or metabolic diseases; infectious or parasitic diseases within a month prior to inclusion; use of antibiotics, probiotics, prebiotics or synbiotics within three months prior to inclusion; pregnancy within a year prior to inclusion; substance or alcohol abuse; specific dietary habits, such as vegetarian or paleolithic diets (with a complete list of exclusion criteria in Supplementary Material 1).
Individuals without depression (henceforth referred to as the nonMDD group) aged 18 to 30 years, both ages included, were recruited through social media, and on the internal personnel webpage of the North Denmark Regional Hospital. Exclusion criteria were identical to those for patients with MDD, in addition to no current or previous history of psychiatric disorders or antidepressant use. Confirmation of the medical history of both MDD and nonMDD participants was performed retrospectively through data retrieved from the Danish Electronic Patient Journal at least one year after the last patient with MDD was recruited for the study.
Questionnaires and data acquisition
Major Depressive Inventory
At each sampling point, all participants were instructed to use the self-screening 10-item MDI questionnaire to evaluate severity of their depressive symptoms [48]. Based on total scores, participants were grouped into four categories: no depression (0-20), mild depression (21-25), moderate depression (26-30) and severe depression (31-50).
Bristol stool scale
Participants were asked to rate their stool sample consistency using the Bristol Stool Scale (BSS), which was used as a proxy for bowel transit time [49]. It consists of seven points, with one being stool with long transit time (constipation) and seven being very short transit time (diarrhea). Participants were instructed to collect a fecal sample only if the BSS score of the sample was between 3-5, as severely decreased or increased transit time would affect the gut microbiota composition [50].
Demographic and clinical data
At inclusion, participants gave information on their age, height, weight and smoking habits. Participants were asked to fill out a baseline questionnaire, which focused on appetite, dietary habits, bowel movements, and gastrointestinal symptoms. Questions included the number of daily meals, estimation of appetite and self-evaluation of their caloric intake in accordance with Danish guidelines. Additionally, they were asked about toilet habits. In the questionnaire participants received during the four- and twelve-weeks follow-ups, the questions focused on changes experienced in the parameters compared to the baseline measurements.
Microbiota characterization by 16S rRNA gene sequencing
Fecal samples were collected in the home of the participants and stored in a domestic freezer (-20 °C) for a maximum of 72 hours until delivery in a cooling bag to the laboratory. Following reception at the laboratory, fecal samples were stored at -80 °C. Total DNA was isolated from 250±25 mg fecal samples, using the QIAamp Powerfecal DNA kit (QIAGEN) with automation on a QIAcube® (QIAGEN) as previously described [51]. 16S rRNA gene sequencing was performed by DNAsense ApS Denmark as previously described [52]. In brief, sequencing libraries were constructed using a standardized primer set (515FB and 806RB [53, 54]) targeting the 16S rRNA V4 region, followed by a PCR where unique barcoded primers were added to all sequencing libraries. The resulting DNA was sequenced (2x300 bp) on a MiSeq platform using MiSeq Reagent kit V3 (Illumina) with an additional 10 % PhiX control library (Illumina) to estimate error rate during sequencing.
Measurement of fecal calprotectin, LBP and blood plasma immune markers. Blood was collected from participants within 24 hours from delivery of the fecal sample. Plasma was isolated by centrifugation and stored at –80°C until further analysis. Lipopolysaccharide-binding protein (LBP) was used as a proxy for the presence and concentration of LPS [55]. LBP was measured in peripheral blood plasma samples in duplicates with the RayBio® Human LBP ELISA Kit (RayBiotech, USA) according to manufacturer’s instructions. All incubation steps were performed at room temperature using an orbital shaker (Thermo Fisher Scientific) set to 150 rpm. Signal intensity was measured at 450 nm for LBP on a Fluostar Omega Plate Reader (BMG Labtech, Germany). Concentrations were calculated using the four-parameter logistic regression method, as per manufacturer’s recommendation.
A total of 53 immune markers were analyzed in blood plasma, using electrochemiluminescent immunoassays Meso Scale Diagnostics technology to evaluate the degree of systemic inflammation in the participants. A combination of seven panels were used, namely the V-PLEX Angiogenesis Panel 1 Human, V-PLEX Chemokine Panel 1 Human, V-PLEX Cytokine Panel 1 Human, V-PLEX Cytokine Panel 2 Human, V-PLEX Proinflammatory Panel 1 Human, and V-PLEX Vascular Injury Panel 2 Human. Analyses were performed at the Department of Clinical Immunology, Copenhagen University Hospital, Denmark according to manufacturer’s protocol. Positions on the plates were randomized and samples analyzed in triplicates. Results were log2 transformed and displayed as fluorescent intensity. To adjust for variations across plates, signal intensities were median-normalized across the individual plates. For both ELISA and Mesoscale measures, assay diluent was used as negative control. Signals below that of blank controls were included into the analyses.
Bioinformatics
PhiX sequences were removed and read pairs were demultiplexed by the USEARCH v.11 pipeline [56]. The resulting demultiplexed sequences were imported into the QIIME2 2020.8 bioinformatics platform [57]. To build amplicon sequence variants (ASVs), primers were filtered from forward reads, followed by truncation to 250 base pairs and denoising with DADA2 using standard parameters. Negative controls were defined as samples with <40.000 reads and removed from downstream analysis. All ASVs were aligned with MAFFT [58] using q2-alignment, and a phylogenic tree was constructed hereof using fasttree2 [59] implemented in q2-phylogeny. Taxonomy was assigned using the Naïve Bayesian classifier, implemented in q2-feature-classifier, trained against the SILVA 138 SSU reference database [60]. R version 4.0.3 was used for subsequent analyses through the Rstudio IDE (http:///www.rstudio.com). The generated amplicon data was investigated using the packages phyloseq v1.32.0 and ampvis2 v2.6.6. α-diversity metrics were analyzed using ASV richness, Faith’s phylogenic diversity, and Shannon diversity index. β-diversity indices included principal coordinate analysis (PCoA) of Bray-Curtis dissimilarity, as well as weighted and unweighted UniFrac [61]. Bacterial β-diversity was analyzed using permutational multivariate analysis of variance (PERMANOVA) with 999 permutations, as implemented in ADONIS and tested for variability using Betadisper. The linear discriminant analysis effect size (LEfSe) method [62] was used to determine the most differentially abundant taxa between the different diagnostic groups, or time points.
Statistical analyses
For continuous data such as age, BMI, LBP and cytokine concentrations, distribution and variance were determined using Shapiro-Wilks test and Bartlett’s test, respectively. Depending on the normality of distribution and evenness of variance, data was tested using either Student’s t-test followed by Tukey’s post hoc test, or Mann-Whitney U test followed by Dunn’s post hoc test with correction for false-discovery rate using Benjamini-Hochbergs procedure. For paired data, such as the MDI score and longitudinal measurements of LBP and cytokines, a mixed effect model was used to account for missing data. For correlation analysis between inflammatory markers, LBP and β-diversity measures, Spearman’s rank correlation coefficient was used and the monotony of the correlation was assessed visually using scatterplots. For univariate statistics, the null hypothesis was rejected if p<0.05, whereas for multivariate statistics, a Benjamini-Hochberg adjusted p-value (q)<0.05 was needed.
Ethics
The study was conducted in accordance with the Declaration of Helsinki and approved by the North Denmark Regional Ethical Committee (reference: N-20170056). The categories of the person data collected in the project are registered in the processing activities of research in the North Region of Denmark in compliance with EU GDPR article 30. Written and oral informed consent was obtained from all participants.