Ethics
The study was conducted in accordance with the Declaration of Helsinki and adheres to CONSORT guidelines (http://www.consort-statement.org/). Ethical approval was granted by the ethics committee at Kiel University (D534/18). All volunteers provided written informed consent.
Study design and cohort sampling
For the presented randomized controlled intervention trial (Trial registration: DRKS, DRKS00015873. Registered 12 December 2018 - Retrospectively registered, https://www.drks.de/drks_web/navigate.do?navigationId=trial.HTML&TRIAL_ID=DRKS00015873), 42 healthy, physically inactive German male and female volunteers aged 20 to 45 years with a body mass index (BMI) between 20 and 35 kg/m² were recruited in October 2018 for participation in a strength, endurance or control group (Fig. 1A). Exclusion criteria comprised regular medication intake, intake of antibiotics within six weeks prior to the study, known chronic diseases and regular exercise during the six months prior to the study. The overall study duration was ten weeks (Fig. 1B). During the first week, participants came to the study centre for a comprehensive baseline assessment comprising measurements of height, weight, blood pressure, pulse, waist-to-hip ratio, hip circumference and body composition. In addition, blood and urine samples were taken, and wrist-band fitness trackers with dedicated smartphones were provided to the study participants. Subsequently, endurance levels were measured by a physical working capacity 170 (PWC) test. During the study period, participants were asked to collect six stool samples at home and send them to the laboratory immediately (for detailed sampling see Fig. 1B), and to fill in a questionnaire for each stool sample regarding their health and diet in the week before the sample was taken. Participants' longitudinal data were synchronized so that the intervention started on day 20.
To study the effects of different types of exercise, participants were randomly divided into three groups (control, endurance and strength), matched for gender, age and body mass index (BMI) by MW. Participants of the control group were asked to maintain their general physical inactivity. As an incentive, a guided exercise program was offered to the control group after the end of the study. During the six-week intervention, participants of the endurance group were required to run three times per week for at least 30 minutes. One session per week was supervised by a professional trainer. The intensity of the run was regulated using the Borg rating of perceived exertion (RPE) and corresponded to a “basic” endurance training. Participants of the strength group performed a whole-body hypertrophy strength training in the gym three times a week, of which two sessions per week were supervised. In brief, the participants had a five-minute warm-up on the treadmill, ergometer or rowing machine before they started their training of approximately 30 minutes. One session consisted of six different exercises, two each for the legs, the chest and the back, in order to train the large and main muscle groups. The participants performed one warm-up set (intended to be 50% of the load set weight) and one load set for every exercise. For the load set, the weight was chosen so that eight repetitions were possible. If more than eight repetitions were possible on two consecutive training days, the participants were required to raise the weight in the following session. Both groups were asked to keep a detailed training journal during the unsupervised training sessions to ensure training intensity.
During the last week of the study (week 9), the assessments of height, weight, blood pressure, pulse, waist-to-hip ratio, hip circumference and body composition were repeated for all participants. Blood and urine samples were taken and, finally, the endurance level was measured by a physical working capacity 170 (PWC) test (Fig. 1B).
In addition to the physically inactive volunteers of the intervention study, and to generate a data set from an “extreme exercise” group for comparison, 13 elite athletes (mainly cyclists and triathletes) were each asked to provide fecal samples. Table 1 summarizes the demographic and anthropometric characteristics of the study participants at the beginning of the study as well as the respective information on the elite athletes.
Table 1 Baseline demographics and anthropometric data of the study participants.
Group | n | Sex (%m/%f) | Age | Weight (kg) | BMI | PWCᵃ (km/h)
Control | 11 | 36/64 | 33.4±7.9 | 78.7±18.2 | 26.9±5.6 | 8.5±1.7
Endurance | 13 | 23/77 | 31.4±8.3 | 68±10 | 23.1±3.2 | 9±1.7
Strength | 12 | 50/50 | 29.9±7.9 | 85.6±25.7 | 26.3±6.6 | 9.4±1.6
Elite | 13 | 38/62 | 30±9.9 | - | - | 166.8±159.9
Average (Ø) values are shown; the standard deviation is given as ±.
a Physical working capacity 170 test.
Measurement of blood analytes
Blood samples taken from each participant at the beginning and the end of the study period were analysed for complete blood counts at the Institute of Clinical Chemistry, UKSH Kiel, Kiel, Germany. Additionally, serum samples were used to measure blood concentrations of brain-derived neurotrophic factor (BDNF) with a commercially available Human BDNF ELISA Kit (Sigma-Aldrich, St. Louis, Missouri, US) at the end of the study. Serum samples were diluted 200-fold and the ELISA was performed according to the manufacturer's protocol. The full set of blood measurements is displayed in Suppl. Fig. 3.
Processing of data collected by the fitness trackers
Generally, raw data collected by current consumer-grade fitness trackers comprise movement intensities, step counts, heart rate, heart rate variation and, if implemented, additional data such as activity types or sleep-related information. These data sets are forwarded by a corresponding application on the smartphone to servers operated by the vendor, where data aggregation and the derivation of further health-related values such as stress scores or sleep phase detection take place. The processed data are used by the applications to illustrate a health and fitness status to the users.
In preparation for this study, a number of consumer-grade fitness trackers were tested for data accuracy and accessibility, especially regarding step counts and heart rate readings. The accuracy of the step counts was tested with repeated walking tests with defined step counts, with bicycling rounds and by reproducing various everyday situations, in order to test for correct and false step counts, respectively. Heart rate detection was verified with medical heart rate monitors. After a suitable tracker had been selected, the infrastructure needed to receive the data was developed and tested. Data validation included the analysis of raw data on the tracker to ensure data integrity and statistical adequacy. In this study the GARMIN® Vívosport fitness tracker was used. The data were collected and uploaded by the GARMIN® Connect application on the provided smartphone, which was equipped with a pseudonymized user account. To transfer the data into our research data warehouse, endpoints for the GARMIN® Health API were developed (https://developer.garmin.com/health-api/overview/). To verify data integrity, the raw data were extracted from the device and confirmed to appear unaltered in the data provided by the Health API. During and after the study, the complete dataset was checked for possible errors and artifacts. Only values proven to be reliable in the data validation (steps, heart rate and sleep length) were used in this study. Sleep length data from participants with fewer than 10 days of reliable and complete sleep length information were disregarded. Age-predicted maximum heart rate (APHRM) was calculated using the formula by Tanaka et al. [40]. Finally, the data were moved into a relational database schema for processing and to make them available both for interactive browser applications and for statistical analysis.
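The two data-handling rules at the end of this paragraph can be illustrated with a minimal Python sketch. It assumes that reference [40] is the widely cited regression HRmax = 208 − 0.7 × age and that a missing or unreliable night is stored as None. The function names and data layout are hypothetical and do not reproduce the study's actual database code.

```python
from typing import Optional, Sequence


def tanaka_max_heart_rate(age_years: float) -> float:
    """Age-predicted maximum heart rate in beats per minute.

    Assumption: reference [40] is the commonly cited Tanaka regression,
    HRmax = 208 - 0.7 * age.
    """
    return 208.0 - 0.7 * age_years


def has_enough_sleep_data(nightly_hours: Sequence[Optional[float]],
                          min_days: int = 10) -> bool:
    """Return True if a participant has at least `min_days` usable nights.

    Nights without reliable, complete sleep length are assumed to be None.
    """
    usable = [h for h in nightly_hours if h is not None]
    return len(usable) >= min_days


if __name__ == "__main__":
    print(tanaka_max_heart_rate(30))                    # age 30 years
    print(has_enough_sleep_data([7.5, None, 6.0] * 3))  # only 6 usable nights
```

Under this assumption, a participant with fewer than 10 usable nights would be excluded from the sleep length analysis only, while step and heart rate data would remain available.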
Interactive Web Portal
Owing to the longitudinal nature of the data, an interactive web portal was developed using the Spring Boot and Highcharts libraries to visualize them. The main goal of this portal was to support monitoring of data compliance for each individual participant. Study managers could log in to view pseudonymized participants, their group and the accompanying data. Suppl. Fig. 6 presents such data as a timeline, which allows study managers to quickly spot intervals with missing data. The radial graph in Suppl. Fig. 7 visualizes the observed sleeping behaviour. In addition to several other visualizations of the GARMIN® data, microbiome taxonomy data were imported for all participants and visualized to observe the microbiome changes of an individual over six timepoints (Suppl. Fig. 8). In the long run, the idea is to extend this portal so that study managers or researchers can visualize all longitudinal data of a single study (activity tracker data, demographics, stool, questionnaires) in one place.
Acquisition of dietary data
Participants were asked to fill in a comprehensive questionnaire regarding their usual diet before the start of the study as well as regular questionnaires regarding the diet within the week before a stool sample was taken.
Stool sample processing and sequencing
DNA was extracted from the samples using the QIAamp Fast DNA Stool Mini Kit automated on the QIAcube (Qiagen, Hilden, Germany). For this purpose, material was transferred to 0.70 mm Garnet Bead tubes (Dianova, Hamburg, Germany) filled with 1.1 ml InhibitEx lysis buffer. Bead beating was performed using the SpeedMill PLUS (Analytik Jena, Jena, Germany) for 45 s at 50 Hz. Samples were then heated to 95 °C for 5 min, after which the manufacturer’s protocol was continued. Extracted DNA was stored at −20 °C prior to PCR amplification. Blank extraction controls were included during extraction of the samples.
For sequencing, the variable regions V1 and V2 of the 16S rRNA gene were amplified from the DNA samples using the primer pair 27F-338R in a dual-barcoding approach according to Caporaso et al. [41]. Stool DNA was diluted 1:10 prior to PCR, and 3 µl of this dilution were used for amplification. PCR products were verified by agarose gel electrophoresis, normalized using the SequalPrep Normalization Plate Kit (Thermo Fisher Scientific, Waltham, MA, USA), pooled in equimolar amounts and sequenced on the Illumina MiSeq using v3 chemistry for 2 × 300 bp paired-end reads (Illumina Inc., San Diego, CA, USA). Demultiplexing after sequencing was based on 0 mismatches in the barcode sequences.
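Demultiplexing with zero allowed mismatches amounts to an exact lookup of the barcode pair. The following Python sketch shows this for a dual-barcoding design. The barcode sequences, sample names and function name are invented for illustration, and the study's actual demultiplexing software is not specified here.

```python
from typing import Dict, Iterable, List, Tuple

BarcodePair = Tuple[str, str]


def demultiplex_exact(reads: Iterable[Tuple[str, str, str]],
                      barcode_to_sample: Dict[BarcodePair, str]):
    """Assign reads to samples by exact match of both barcodes.

    reads: (forward_barcode, reverse_barcode, sequence) tuples.
    A read whose barcode pair is not in the sheet (one mismatch is
    already enough) stays unassigned.
    """
    assigned: Dict[str, List[str]] = {}
    unassigned = 0
    for forward_bc, reverse_bc, sequence in reads:
        sample = barcode_to_sample.get((forward_bc, reverse_bc))
        if sample is None:
            unassigned += 1
        else:
            assigned.setdefault(sample, []).append(sequence)
    return assigned, unassigned


if __name__ == "__main__":
    # Hypothetical sample sheet and reads.
    sheet = {("ACGTACGT", "TTGGCCAA"): "sample_01",
             ("ACGTACGT", "GGAATTCC"): "sample_02"}
    reads = [("ACGTACGT", "TTGGCCAA", "AGAGTTTGATC"),
             ("ACGTACGA", "TTGGCCAA", "AGAGTTTGATC"),  # one mismatch
             ("ACGTACGT", "GGAATTCC", "AGAGTTTGATT")]
    print(demultiplex_exact(reads, sheet))
```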
Quantitative real-time PCR of Veillonella atypica
For samples of elite athletes (n=12) and of all participants before the start of the intervention, additional quantitative real-time PCR (qPCR) analysis was performed to confirm the findings described by Scheiman et al. 2019. The primers used for qPCR analysis have been described elsewhere [24]. Amplification was carried out in a LightCycler® 480 instrument (Roche Deutschland Holding GmbH, Grenzach-Wyhlen, Germany) using the SYBR® Green I Mastermix (Roche Deutschland Holding GmbH), prepared according to the manufacturer's protocol. Absolute quantification of strain abundance was calculated with the LightCycler® 480 Software, version 1.5, using internal standard curves generated with DNA of the Veillonella atypica type strain purchased from the DSMZ (DSM 20739). The following amplification protocol was used: initial denaturation for 10 min at 95 °C, followed by 50 cycles of 30 s at 95 °C and 60 s at 60 °C, and a final melting curve analysis from 65 to 95 °C (1 °C/s).
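The principle of absolute quantification from a standard curve can be sketched in Python as follows. This is a generic least-squares illustration and not the algorithm implemented in the LightCycler® 480 Software; the quantities, Cq values and function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_standard_curve(quantities: Sequence[float],
                       cq_values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(cq: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the starting quantity."""
    return 10 ** ((cq - intercept) / slope)


def amplification_efficiency(slope: float) -> float:
    """Efficiency implied by the slope (1.0 means doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Hypothetical dilution series of the standard DNA (copies per reaction).
    standards = [1e2, 1e3, 1e4, 1e5]
    cqs = [30.0, 26.5, 23.0, 19.5]
    slope, intercept = fit_standard_curve(standards, cqs)
    print(slope, intercept, quantify(24.75, slope, intercept))
```

A slope close to −3.32 corresponds to an efficiency close to 1, which is a common plausibility check for such a dilution series.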
Sequence data processing
Data processing was performed using the DADA2 version 1.10 [42] workflow for big datasets (https://benjjneb.github.io/dada2/bigdata.html), resulting in abundance tables of amplicon sequence variants (ASVs). Briefly, all sequencing runs were handled separately (the workflow adjusted for the V1-V2 region is available at https://github.com/mruehlemann/ikmb_amplicon_processing/blob/master/dada2_16S_workflow.R) and finally collected in a single abundance table per dataset, which underwent chimera filtering. ASVs were taxonomically annotated using the Bayesian classifier provided in DADA2 and the Ribosomal Database Project (RDP) version 16 release. Samples with fewer than 10,000 sequences (n=1) were not considered for further analysis.
Statistical analyses of steps, sleep length and diet
Statistical tests were applied to answer two questions about within-group variation: (i) “Is there a group-wise change from the start of the study until the end of the physical activity intervention?” and (ii) “Is there a group-wise change towards the end of the intervention period and after it?”. For both questions, the same model design was applied, “y ~ participant_id + day”, where y represents the dependent variable in question. Participant identifiers were included in the formula to control for the fact that repeated samples originate from the same participant. Models addressing question i were applied to data points taken from the start of the study until the last day of the physical activity intervention. Models addressing question ii were applied to data points taken within an interval of 15 days before and 15 days after the last day of the intervention. Tests were carried out within each group.
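The two data windows and the model design can be illustrated with a short Python sketch. In an ordinary least-squares fit of “y ~ participant_id + day”, the coefficient of day equals the slope obtained after centring y and day within each participant, which is what the function below computes. Significance testing by sequential ANOVA is not reproduced, and the day numbers, values and function names are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

Record = Tuple[str, int, float]  # (participant_id, day, y)


def window_question_i(records: Iterable[Record], last_day: int) -> List[Record]:
    """Data points from the start of the study until the last intervention day."""
    return [r for r in records if r[1] <= last_day]


def window_question_ii(records: Iterable[Record], last_day: int,
                       half_width: int = 15) -> List[Record]:
    """Data points within 15 days before and after the last intervention day."""
    return [r for r in records if abs(r[1] - last_day) <= half_width]


def day_slope_with_participant_effects(records: Iterable[Record]) -> float:
    """Coefficient of `day` in y ~ participant_id + day (within-participant slope)."""
    by_participant = defaultdict(list)
    for pid, day, y in records:
        by_participant[pid].append((day, y))
    sxy = sxx = 0.0
    for observations in by_participant.values():
        mean_day = sum(d for d, _ in observations) / len(observations)
        mean_y = sum(v for _, v in observations) / len(observations)
        for d, v in observations:
            sxy += (d - mean_day) * (v - mean_y)
            sxx += (d - mean_day) ** 2
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical daily step counts; the last intervention day is assumed.
    data = [("p1", 10, 6000.0), ("p1", 40, 7500.0), ("p1", 70, 7000.0),
            ("p2", 10, 9000.0), ("p2", 40, 10400.0), ("p2", 70, 9900.0)]
    last_intervention_day = 61
    print(day_slope_with_participant_effects(
        window_question_i(data, last_intervention_day)))
```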
Within-group variation in mean daily steps and sleep length was analysed using linear models. Patterns of dietary components were summarized by principal component analysis (PCA) using the R package factoextra version 1.0.5 [43]. Changes in the overall dietary pattern were analysed using the first principal component as the dependent variable in linear models. Changes in individual dietary components were analysed using linear models on logarithm-transformed data. For the latter, false discovery rate (FDR) correction according to Benjamini and Hochberg was applied to the P values of the variable “day” within each test set. P values were calculated using sequential ANOVA. For P values and adjusted P values, a significance threshold of 5% was applied. Statistical designs were used as described above.
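For illustration, the Benjamini-Hochberg adjustment within one test set can be written in a few lines of Python. The sketch follows the standard step-up definition, as also implemented by p.adjust(method = "BH") in R; the function name and the example P values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value so that the
    # adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical P values of the term "day" for four dietary components.
    raw = [0.01, 0.04, 0.03, 0.20]
    significant = [q <= 0.05 for q in benjamini_hochberg(raw)]
    print(benjamini_hochberg(raw), significant)
```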
Statistical analyses of microbiome data
Statistical tests were applied to answer the two questions stated in the previous section (Statistical analyses of steps, sleep length and diet). For microbiome data from participants of the control, endurance and strength groups, the following model design was used: “y ~ participant_id + PC1 + day”. Here, the first principal component (PC1) derived from the diet questionnaires was included to control for a possible effect of diet on the microbiome. Differences between the single-timepoint microbiome data of the elite athletes and those of the physically inactive participants were tested using the model design “y ~ PC1 + group”. Samples from physically inactive participants were selected from the first faecal collection and matched for age and sex.
The alpha diversity measures richness and diversity were estimated using Chao1 and the inverse Simpson index (InvSimp), respectively. These measures were calculated on the ASV table rarefied to 10,000 sequences per sample. For each group, the effect of the exercise intervention on log-transformed alpha diversity was tested using linear models. P values were calculated using sequential ANOVA, with an alpha level of 5%. Beta diversity was calculated on the rarefied ASV table as Bray-Curtis dissimilarity. Nonmetric multidimensional scaling (NMDS) was applied to visualize the level of dissimilarity between samples. Differences in community structure were tested using adonis2, with significance assessed sequentially (by = “term”) using 999 permutations and an alpha level of 5%. Rarefaction, calculation of diversity indices and NMDS were performed with the R package vegan version 2.5-5 [44].
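The quantities named above can be written out explicitly. The Python sketch below implements rarefaction by subsampling without replacement, the bias-corrected Chao1 estimator S_obs + F1(F1 − 1) / (2(F2 + 1)), the inverse Simpson index 1 / Σp² and the Bray-Curtis dissimilarity Σ|a − b| / Σ(a + b). It is meant as an illustration of the formulas and not as a reimplementation of vegan; the choice of the bias-corrected Chao1 form is an assumption, and the counts are invented.

```python
import random
from typing import List, Sequence


def rarefy(counts: Sequence[int], depth: int, seed: int = 0) -> List[int]:
    """Subsample `depth` sequences without replacement from one sample."""
    pool = [asv for asv, count in enumerate(counts) for _ in range(count)]
    if len(pool) < depth:
        raise ValueError("sample has fewer sequences than the rarefaction depth")
    rarefied = [0] * len(counts)
    for asv in random.Random(seed).sample(pool, depth):
        rarefied[asv] += 1
    return rarefied


def chao1(counts: Sequence[int]) -> float:
    """Bias-corrected Chao1 richness estimate (assumed form, see lead-in)."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def inverse_simpson(counts: Sequence[int]) -> float:
    """Inverse Simpson diversity, 1 / sum(p_i^2)."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def bray_curtis(a: Sequence[int], b: Sequence[int]) -> float:
    """Bray-Curtis dissimilarity between two samples of equal depth or not."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


if __name__ == "__main__":
    sample_a = rarefy([500, 300, 150, 40, 8, 2], depth=100)
    sample_b = rarefy([100, 600, 200, 90, 9, 1], depth=100)
    print(chao1(sample_a), inverse_simpson(sample_a),
          bray_curtis(sample_a, sample_b))
```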
Differences in average ASV abundance were tested using DESeq2 version 1.24.0 with default parameters [45]. Prior to testing, ASVs were pre-filtered to those present in at least 10% of the samples and with a mean of at least 10 sequences per sample in the rarefied ASV table. This filtering was applied to each set of samples being tested. Library size factors were estimated using the method “poscounts”. For testing the variation in ASV occurrence, the same filtering criteria were used. Occurrence was modelled using binomial linear models. P values for the term “day” were calculated based on the Wald test as implemented in the R package survey version 3.1-11 [46]. False discovery rate multiple test correction was applied within each set of tests for both abundance and occurrence models. The significance cut-off for adjusted P values was 0.05.
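The pre-filtering step can be sketched in Python as follows. The sketch assumes that the abundance criterion means a mean of at least 10 sequences per sample and that both criteria must hold at the same time; the table layout, ASV names and function name are hypothetical.

```python
from typing import Dict, List, Sequence


def prefilter_asvs(table: Dict[str, Sequence[int]],
                   min_prevalence: float = 0.10,
                   min_mean_count: float = 10.0) -> List[str]:
    """Return the ASVs that pass both the prevalence and the abundance filter.

    table: ASV identifier -> rarefied counts, one entry per sample.
    """
    kept = []
    for asv, counts in table.items():
        n_samples = len(counts)
        prevalence = sum(1 for c in counts if c > 0) / n_samples
        mean_count = sum(counts) / n_samples
        if prevalence >= min_prevalence and mean_count >= min_mean_count:
            kept.append(asv)
    return kept


if __name__ == "__main__":
    # Hypothetical rarefied counts for 20 samples.
    example = {
        "ASV_1": [120] * 20,           # common and abundant: kept
        "ASV_2": [5] * 20,             # common but low abundance: dropped
        "ASV_3": [1000] + [0] * 19,    # abundant in a single sample: dropped
    }
    print(prefilter_asvs(example))
```

Because the filter is applied to each set of samples being tested, the list of retained ASVs can differ between the within-group models and the comparison with the elite athletes.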
Differences in the qPCR-based DNA concentration of Veillonella between physically inactive participants and elite athletes (n = 12) were tested using the Wilcoxon signed-rank test. All tests and data manipulation were conducted in R version 3.6.2.
Statistical analyses of biometrics and blood profile
Biometrics and blood profiles were tested in separate batches. Within-group differences between measures taken before and after the intervention were tested using the paired Wilcoxon signed-rank test. Within each group, false discovery rate P value correction was applied. The significance cut-off for adjusted P values was 0.05. Rank-biserial correlation was calculated as described in [47]. Participants’ average daily steps, accumulated hours at age-predicted maximum heart rate (APHRM) and average sleeping hours were compared among groups using Wilcoxon tests and visualized with the R package ggpubr version 0.2 [48].
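As an illustration of the effect size, the Python sketch below computes a matched-pairs rank-biserial correlation as (W+ − W−) / (W+ + W−), where W+ and W− are the sums of the ranks of positive and negative paired differences. Whether this is exactly the formulation of reference [47] is an assumption. Zero differences are dropped and ties receive average ranks; the function name and example values are hypothetical.

```python
from typing import List, Sequence


def matched_pairs_rank_biserial(before: Sequence[float],
                                after: Sequence[float]) -> float:
    """Rank-biserial correlation for paired data (after minus before)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    if not diffs:
        raise ValueError("all paired differences are zero")
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks: List[float] = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while absolute differences are tied.
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[start]])):
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return (w_plus - w_minus) / (w_plus + w_minus)


if __name__ == "__main__":
    # Hypothetical body weight (kg) before and after the intervention.
    print(matched_pairs_rank_biserial([80.0, 72.5, 91.0, 65.0],
                                      [79.0, 73.0, 88.5, 65.0]))
```

Values range from −1 (all non-zero differences negative) to 1 (all non-zero differences positive).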