Study population and sample collection
TMCS is a Japanese cohort study initiated in April 2012 in Tsuruoka City, Yamagata Prefecture, Japan, involving 11,002 participants aged 35 to 74 years [10-14]. The participants were attendees of the annual municipal or workplace health checkup programs held at four sites in the city during the baseline period (April 2012–March 2015). The study design is illustrated in Fig. 1.
TMCS was designed specifically to discover metabolomic biomarkers for common diseases and disorders related to environmental and genetic factors. All participants completed a comprehensive questionnaire on lifestyle, dietary habits, and medical history. At recruitment, biological samples including serum, plasma, urine, and deoxyribonucleic acid (DNA) were collected, together with the medical examination data recorded during the health checkup programs. Information on alcohol consumption, smoking habits, dietary pattern, stress, and level of physical activity was collected through a standardized self-administered questionnaire, and these data were verified in person. The recruitment procedures of the TMCS have been described in detail in previous studies [10-14]. To avoid variation due to fasting state and circadian rhythm, urine samples were collected from each participant in the morning between 8:30 am and 10:30 am after an overnight fast.
QC samples were prepared by mixing approximately 10 randomly selected participant samples and were analyzed every 10 runs in each batch to evaluate analytical performance. In total, the QC samples were measured 752 times in 69 batches for cation analysis and 768 times in 71 batches for anion analysis. Baseline participant samples were analyzed to evaluate the variance of metabolite concentrations in the population. A total of 6,720 samples from participants enrolled in the first and second years of the TMCS baseline survey (April 2012–March 2014) were analyzed. Follow-up participants were not included in these samples.
To compare the metabolite concentrations in the spot urine samples with those in 24-hour urine collections as a reference, a sub-cohort of 32 TMCS participants was established in 2013. These participants also answered a questionnaire on lifestyle, dietary habits, and medical history. In the sub-cohort analysis, ascorbic acid was added to the samples during pretreatment.
Sample preprocessing, instruments and analytical conditions
The samples were initially vortexed for 30 s, followed by centrifugation at 2,300 × g for 5 min at 4 °C. They were then diluted according to the creatinine concentration (Table S1) with Milli-Q water and Milli-Q water containing internal standards (2 mM each of methionine sulfone and camphor-10-sulfonic acid). We found that a creatinine concentration of <10 mg/dL does not cause ion saturation in the mass spectrometer. Hence, we used this value as the upper limit when diluting the urine samples.
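The dilution rule can be illustrated with a short Python sketch. Table S1 is not reproduced here, so the ladder of allowed dilution factors below is purely hypothetical, as is the function name `choose_dilution_factor`; only the 10 mg/dL ceiling comes from the text. The sketch picks the smallest factor that brings the diluted creatinine concentration strictly below the ceiling.

```python
def choose_dilution_factor(creatinine_mg_dl, limit_mg_dl=10.0,
                           ladder=(1, 2, 5, 10, 20, 50)):
    """Return the smallest dilution factor that brings creatinine below the limit.

    `ladder` is a hypothetical set of allowed factors (Table S1 is not
    available here); `limit_mg_dl` is the 10 mg/dL ceiling stated in the text.
    """
    for factor in sorted(ladder):
        # Diluting by `factor` divides the creatinine concentration by `factor`.
        if creatinine_mg_dl / factor < limit_mg_dl:
            return factor
    raise ValueError("no factor in the ladder brings creatinine below the limit")


# Example: a sample at 85 mg/dL would need the 10-fold step of this ladder.
factor = choose_dilution_factor(85.0)
```

With a different ladder (for instance, the steps actually listed in Table S1), only the `ladder` argument would need to change.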
Mass spectrometry-based metabolomic profiling was performed with fasting urine samples using capillary electrophoresis time-of-flight mass spectrometry (CE-TOFMS). CE-TOFMS analysis of cationic and anionic metabolites was performed as described previously [20,21,22]. Briefly, cationic metabolites were separated on a fused-silica capillary column (50 μm i.d. × 100 cm total length) filled with 1 M formic acid as the electrolyte, and methanol/water (50%, v/v) containing 0.01 μM hexakis(2,2-difluoroethoxy)phosphazene (Hexakis) was delivered as the sheath liquid at a rate of 10 μL/min. The capillary temperature was maintained at 20 °C. The sample solution was injected at 5 kPa for 5 s, and a positive voltage of 30 kV was applied. ESI-TOFMS was conducted in the positive ion mode, and the capillary, fragmentor, skimmer, and Oct RF voltages were set at 4,000, 75, 50, and 500 V, respectively. The nebulizer gas pressure was set at 7 psig, and heated nitrogen gas (300 °C) was supplied at a rate of 10 L/min. Anionic metabolites were separated using a commercially available COSMO(+) capillary (50 μm i.d. × 105 cm, Nacalai Tesque, Kyoto, Japan) filled with 50 mM ammonium acetate (pH 8.5) as the electrolyte, and ammonium acetate (5 mM) in 50% (v/v) methanol/water containing 0.01 μM Hexakis was delivered as the sheath liquid at a rate of 10 μL/min. The sample solution was injected at 5 kPa for 30 s, and a negative voltage of 30 kV was applied. ESI-TOFMS was conducted in the negative ion mode, and the capillary, fragmentor, skimmer, and Oct RF voltages were set at 3,500, 100, 50, and 500 V, respectively. Other conditions were identical to those used for the cationic metabolite analysis. In both modes, the automatic recalibration function was used to correct the analytical variation of exact masses for each run as described previously [22].
Mass spectra were acquired at a rate of 1.5 cycles/s over a 50-1,000 m/z range.
Statistical analysis
Because missing values arose from concentrations below the measurement limit, metabolites that were not detected were imputed with half of the lowest detected value [23]. As in our previous study [13], inter- and intra-batch variance for each metabolite concentration in the QC samples was calculated to evaluate the reproducibility of the data. To control for batch effects, a linear mixed model was formulated, as shown in equation (1).
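Equation (1) is not reproduced in this text. A form consistent with the symbols defined in the next paragraph would be the following. It is a reconstruction, not the original typeset equation: the batch index i, the sample index j, and the normality assumptions are added notation, and μ is read here as the fixed overall level of each metabolite.

```latex
\begin{equation}
Y_{ij} = \mu + B_i + \varepsilon_{ij}, \qquad
B_i \sim \mathcal{N}(0, \sigma_B^2), \quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
\tag{1}
\end{equation}
```

Under this reading, the estimated standard deviations of B and ε would correspond to the inter- and intra-batch components, respectively.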
The observed metabolite concentration (Y), inter- and intra-batch variance for each metabolite (μ), random effects common to each batch (B), and residual error (ε) are defined in the formula. We calculated the coefficient of variation (CV) by dividing the standard deviation estimated from this model by the mean. Pearson correlation coefficients between the inter- and intra-batch CVs were then calculated. These analyses were also conducted with participant samples to assess inter- and intra-batch variance. The intraclass correlation coefficient (ICC) was calculated to assess the reliability of the metabolite concentrations [24,25]. This value was calculated from the variance of the measurement errors and the total variance,
where σ_E² is the variance of the measurement errors and σ_T² is the total variance, as shown in equation (2). Although we could not calculate the ICC for participant samples as there were no replicates, we calculated technical errors from a large number of replicates for QC samples considered to be representative of the population samples. We made an approximate calculation of ICC, substituting the CV of QC samples for error variance and CV of participant samples for the total variance, as shown in equation (3).
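The steps from imputation to the approximate ICC can be sketched in Python (the original analysis was run in R). This is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, undetected values are assumed to be coded as NaN, a method-of-moments one-way random-effects decomposition stands in for the fitted linear mixed model, and the ICC is read as (σ_T² − σ_E²)/σ_T² with squared CVs substituted for the two variances. Whether the original equation (3) squares the CVs is an assumption.

```python
import numpy as np


def impute_half_min(x):
    """Replace NaN (assumed to mean 'not detected') with half the lowest detected value."""
    x = np.asarray(x, dtype=float)
    detected = x[~np.isnan(x)]
    if detected.size == 0:
        return x.copy()
    return np.where(np.isnan(x), detected.min() / 2.0, x)


def batch_cv(values, batches):
    """Inter- and intra-batch CV from a one-way random-effects decomposition.

    Uses the ANOVA (method-of-moments) estimator as a stand-in for a fitted
    mixed model; results can differ from REML or ML estimates.
    """
    values = np.asarray(values, dtype=float)
    batches = np.asarray(batches)
    labels = np.unique(batches)
    n_total, k = values.size, labels.size
    grand_mean = values.mean()
    sizes = np.array([np.sum(batches == b) for b in labels])
    means = np.array([values[batches == b].mean() for b in labels])
    ss_between = np.sum(sizes * (means - grand_mean) ** 2)
    ss_within = sum(np.sum((values[batches == b] - m) ** 2)
                    for b, m in zip(labels, means))
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Effective batch size; equals the common size when batches are balanced.
    n0 = (n_total - np.sum(sizes ** 2) / n_total) / (k - 1)
    var_between = max((ms_between - ms_within) / n0, 0.0)
    return {"inter_cv": np.sqrt(var_between) / grand_mean,
            "intra_cv": np.sqrt(ms_within) / grand_mean}


def approximate_icc(cv_qc, cv_participant):
    """Approximate ICC, assuming squared CVs stand in for the two variances."""
    return 1.0 - (cv_qc ** 2) / (cv_participant ** 2)


# Toy example: two batches of two QC measurements each.
cvs = batch_cv([10.0, 12.0, 20.0, 22.0], [1, 1, 2, 2])
icc = approximate_icc(cv_qc=0.1, cv_participant=0.5)
```

A metabolite whose QC CV approaches its participant CV would give an approximate ICC near zero under this reading, indicating that most of the observed spread could be technical.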
Creatinine correction has two known limitations. First, creatinine-corrected concentrations tend to be lower in concentrated urine samples than in dilute urine samples. Second, some diseases and medications can cause fluctuations in urine creatinine levels. Therefore, a sensitivity analysis was conducted excluding participant samples with creatinine values >3.0 g/L or <0.3 g/L [26].
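The exclusion rule for the sensitivity analysis can be written as a simple filter. The sketch below is illustrative only: the record layout, the sample values, and the function name `within_creatinine_range` are hypothetical, and creatinine is assumed to be expressed in g/L.

```python
def within_creatinine_range(creatinine_g_per_l, low=0.3, high=3.0):
    """True when the sample is kept, i.e. creatinine is not <0.3 or >3.0 g/L."""
    return low <= creatinine_g_per_l <= high


# Hypothetical records; values chosen only to exercise both cutoffs.
samples = [
    {"id": "A", "creatinine": 0.25},  # excluded: below 0.3 g/L
    {"id": "B", "creatinine": 1.10},  # kept
    {"id": "C", "creatinine": 3.00},  # kept: only values above 3.0 are excluded
    {"id": "D", "creatinine": 3.40},  # excluded: above 3.0 g/L
]
kept = [s for s in samples if within_creatinine_range(s["creatinine"])]
kept_ids = [s["id"] for s in kept]
```

The reliability statistics would then be recomputed on the retained samples and compared with the full-sample results.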
For sub-cohort analysis, metabolites in 24-hour urine samples and spot urine samples were compared among individuals, and Pearson's correlation coefficient was calculated for each individual. To account for major factors that may affect urinary metabolite concentrations, a regression analysis was performed, and the slope was compared between the spot and 24-hour urine samples. The explanatory variables included age, sex, alcohol consumption, and smoking, all of which are known to affect metabolite concentrations [10,27,28].
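The per-individual comparison can be sketched as follows. This is an assumption-laden illustration rather than the authors' procedure: the data are taken to be two arrays with one row per individual and one column per metabolite, no transformation is applied, and `per_individual_pearson` is a hypothetical name.

```python
import numpy as np


def per_individual_pearson(spot, full_day):
    """Pearson r between spot and 24-hour profiles, one value per individual.

    Rows are individuals and columns are metabolites (assumed layout).
    """
    spot = np.asarray(spot, dtype=float)
    full_day = np.asarray(full_day, dtype=float)
    if spot.shape != full_day.shape:
        raise ValueError("spot and 24-hour arrays must have the same shape")
    # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
    return np.array([np.corrcoef(s, d)[0, 1] for s, d in zip(spot, full_day)])


# Toy data: individual 0 agrees perfectly, individual 1 is perfectly reversed.
r = per_individual_pearson([[1, 2, 3], [1, 2, 3]],
                           [[2, 4, 6], [3, 2, 1]])
```

Concentrations spanning several orders of magnitude are often log-transformed before correlation; whether that was done here is not stated in the text.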
Statistical analyses were performed using R version 3.5.2 (2018-12-20; R Core Team, 2018; The R Foundation for Statistical Computing, Vienna, Austria).
Ethical approval
This study was approved by the Medical Ethics Committee of the School of Medicine, Keio University, Tokyo, Japan (Approval Nos. 20110264 and 20130207 for the entire cohort study and the sub-cohort study, respectively). Written informed consent was obtained from all participants included in the studies. All research was performed in accordance with the relevant guidelines and regulations.