Data
We used the same 5047 individuals from the TwinsUK cohort as in the original paper to create and validate the dietary indexes. TwinsUK is the UK’s largest cohort of monozygotic and dizygotic twins. All participants with a completed Food Frequency Questionnaire (FFQ) as of 2016 were included in the analysis.
Dietary index creation
The original Healthy Eating Index (HEI) methodology created by Guenther and colleagues [13] was modified to accommodate dietary data derived from an FFQ designed for UK populations; we used the same variable as in the previous paper.
Dietary Inflammatory Index (DII) creation
The DII is based on literature published through 2010 linking diet to inflammation. The DII is described in detail elsewhere [3]. Briefly, 1,943 eligible peer-reviewed primary research articles published through 2010 on the effect of dietary factors on six inflammatory markers (IL-1β, IL-4, IL-6, IL-10, tumour necrosis factor-α (TNF-α), and CRP) were identified and scored to derive component-specific inflammatory effect scores for 45 dietary factors (i.e., the components of the DII), which comprised macronutrients, micronutrients and some foods or bioactive components such as spices and tea. To avoid the arbitrariness that results from simply using raw intake amounts, each individual's self-reported intake of each DII component was standardized to a world database consisting of dietary intake from 11 populations living in different countries across the world. The standardized dietary intake was then multiplied by the literature-derived inflammatory effect score for each DII component, and summed across all components to obtain the overall DII score. Higher DII scores represent more pro-inflammatory diets, while lower (i.e., more negative) DII scores indicate more anti-inflammatory diets [3]. Here we present results for the energy-adjusted DII (E-DII), calculated using the density approach, in which all nutrients are converted to intake per 1000 kcal; as a result, energy itself is not a component of the E-DII. The HEI uses a similar approach, so we considered the E-DII the better comparator.
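The scoring steps described above (energy density, standardization against a reference database, weighting by the effect score, summation) can be sketched as follows. This is a minimal illustration only: the component names, reference means, standard deviations and effect scores are hypothetical placeholders, not the published values, and standardization is assumed here to be a simple z-score; the exact procedure is given in [3].

```python
# Hypothetical reference table, NOT the published DII values:
# component -> (reference mean per 1000 kcal, reference SD, inflammatory effect score)
REFERENCE = {
    "fibre_g": (10.0, 3.0, -0.6),
    "saturated_fat_g": (12.0, 4.0, 0.4),
}


def energy_density(intake, energy_kcal):
    """Convert absolute daily intakes to intake per 1000 kcal (density approach)."""
    return {name: amount * 1000.0 / energy_kcal for name, amount in intake.items()}


def e_dii(intake, energy_kcal, reference=REFERENCE):
    """Sum of standardized, effect-weighted intakes over all components.

    Standardization is assumed to be a z-score against the reference table.
    """
    density = energy_density(intake, energy_kcal)
    score = 0.0
    for component, (mean, sd, effect) in reference.items():
        z = (density[component] - mean) / sd  # standardized intake
        score += z * effect                   # weight by inflammatory effect
    return score


# A diet exactly at the reference means scores 0; more fibre lowers the score.
average_diet = {"fibre_g": 20.0, "saturated_fat_g": 24.0}
high_fibre_diet = {"fibre_g": 26.0, "saturated_fat_g": 24.0}
print(e_dii(average_diet, 2000.0))
print(e_dii(high_fibre_diet, 2000.0))
```

With these placeholder values, a participant whose energy-adjusted intake equals the reference mean for every component scores 0, and intake above the mean for a component with a negative effect score moves the total in the anti-inflammatory (negative) direction.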
Validation of the DII in our cohort
As before, we validated the DII in our study using Wilcoxon rank sum tests to assess the extent to which the DII distinguished smokers from non-smokers (n=3226, due to longitudinal differences in sample questionnaires), those over and under 50 years of age, and men and women. Each index was assessed as the primary explanatory variable against two health measures: body mass index (BMI; weight (kg)/height (m)²) (n=4428) and a frailty index calculated using the Rockwood method [14] (n=4553). Coverage of the frailty index has increased since the last submission because more questionnaire data have been received; all missing data are thought to be missing at random and attributable to differences in longitudinal sampling. Linear models including BMI and frailty were adjusted for age, twin relatedness and sex.
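For readers unfamiliar with the rank sum test used in this validation, the sketch below shows how the statistic is formed from pooled ranks. It is an illustration under stated assumptions, not the analysis code: the function names are hypothetical, the p-value uses a normal approximation without tie or continuity correction, and in practice a statistics package would be used.

```python
from math import sqrt
from statistics import NormalDist


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])              # sum of pooled ranks in group x
    u = w - n1 * (n1 + 1) / 2.0      # Mann-Whitney U for group x
    mean_u = n1 * n2 / 2.0
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction
    z = (u - mean_u) / sd_u
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u, p


# Hypothetical DII scores for two groups (e.g. non-smokers and smokers).
non_smokers = [-1.2, -0.8, -0.3, 0.1]
smokers = [0.4, 0.9, 1.5, 2.0]
print(rank_sum_test(non_smokers, smokers))
```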
Microbiota analysis
In contrast to the original paper, here we present results from microbiota data in which the 16S sequences have been re-analysed to produce amplicon sequence variants (ASVs). ASVs were created following the DADA2 pipeline [7]. Briefly, DNA sequences were demultiplexed, and separate forward- and reverse-read files were generated for each sample. The DADA2 pipeline was applied to each sequencing run separately, until the final merge step. Sequence quality was assessed and read ends were trimmed to remove poor-quality bases; error rates were estimated within-sample for forward and reverse reads, and the ASV algorithm was then applied. Forward and reverse ASVs were joined, and the runs were merged into a single dataset. Chimeras were removed. Taxonomy was assigned using SILVA 1.3.2.
Because this analysis uses an ASV pipeline rather than the OTU pipeline of the previous study, more samples failed to meet quality control thresholds. We wanted to use only samples considered in the previous study (i.e., rather than rematching individuals to different samples); as a result, 1853 samples were used for analysis rather than the 2070 used in the original study. In the previous study we considered four indexes of species diversity. Here, Chao1 was not considered, as it is an inappropriate index to use with ASVs (due to the lack of singletons); a richness measure is retained as ‘Observed ASVs’, alongside Shannon diversity and Simpson’s Index. All indexes were calculated using the ‘phyloseq’ package in RStudio v1.1.423 [15,16] on untrimmed, untransformed ASV tables, following the suggestion of McMurdie and colleagues [8]. Mixed effects models were constructed in RStudio using the ‘lme4’ package [17], with each alpha diversity measure as the response variable and the HEI or the E-DII as the primary explanatory variable; models were adjusted for age at microbiome sample, BMI at microbiome sample and sequencing depth. Only the technician who extracted the sample was included as a random factor; the other random factors considered in the initial analysis explained so little of the variance associated with the indexes that they prevented appropriate model fit, and were thus not included. All variables were scaled prior to model inclusion and standardised coefficients are reported.
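As an illustration of the three alpha diversity measures retained here, the sketch below computes them from one sample's vector of raw ASV counts. It is not the phyloseq code used in the analysis; it assumes the common conventions of a natural-log Shannon index and Simpson's Index expressed as one minus the sum of squared proportions, and the function name is hypothetical.

```python
from math import log


def alpha_diversity(counts):
    """Observed richness, Shannon and Simpson indexes for one sample.

    `counts` is a list of raw (untransformed) ASV counts; zero counts are
    treated as absent ASVs.
    """
    present = [c for c in counts if c > 0]
    total = sum(present)
    proportions = [c / total for c in present]
    return {
        "observed": len(present),                            # number of ASVs seen
        "shannon": -sum(p * log(p) for p in proportions),    # natural log
        "simpson": 1.0 - sum(p * p for p in proportions),    # 1 - sum(p^2)
    }


# Four equally abundant ASVs and one absent ASV.
print(alpha_diversity([10, 10, 0, 10, 10]))
```

For four equally abundant ASVs the Shannon index equals ln(4) and Simpson's Index equals 0.75, which makes a convenient sanity check for the conventions assumed above.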
Differences in abundance at the ASV, genus and phylum levels (i.e., ASVs collapsed to genus and phylum level) were assessed using DESeq2 [18], adjusted for library size. As in the previous analysis, the first 10 principal coordinates (PCoA axes) of the weighted UniFrac distance (calculated on variance-transformed ASVs, with negative abundances set to 0 for the ordination) were extracted and used as response variables in mixed effects models. These models were adjusted as in the ASV diversity analysis, with sequencing run as the only random effect (again, because inclusion of the random effects used in the first analysis prevented adequate model fit). Finally, twin pairs whose dietary scores (HEI and E-DII) differed by more than one standard deviation and fell in different quintiles were identified; differences in ASV abundances within these pairs were compared using paired Wilcoxon signed-rank tests, and p-values were adjusted for the false discovery rate using the Benjamini-Hochberg method.
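The Benjamini-Hochberg adjustment applied to the per-ASV p-values can be sketched as below. This is a generic illustration of the step-up procedure with a hypothetical function name and made-up p-values, not the code used in the analysis.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Visit p-values from largest to smallest so a running minimum can
    # enforce monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical p-values for four ASVs.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each raw p-value is multiplied by the number of tests divided by its ascending rank, and the running minimum ensures that an ASV with a smaller raw p-value never receives a larger adjusted value than one with a larger raw p-value.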