We applied the TRIPOD guidelines for the development and validation of prediction models[27,30] (for the detailed protocol, see Supplementary information: TRIPOD Checklist for Prediction Model Validation). Results from CPETs collected between 2013 and 2021 were analyzed retrospectively. Maximal-effort examinations consisted of treadmill (TE) or cycle ergometry (CE) tests paired with body composition (BC) analysis and took place in a medical clinic (www.sportslab.pl, Warsaw, Poland). Tests were performed on individual request as part of regular endurance assessment or training monitoring.
Cardiopulmonary exercise testing protocol
Cardiopulmonary exercise tests (CPET) were preceded by body mass (BM) and fat mass (FM) analysis using the 5 kHz/50 kHz/250 kHz electrical bioimpedance method on a body composition (BC) monitor (Tanita, MC 718, Japan). Conditions during BC and CPET were: a 40 m2 indoor, air-conditioned area, 40–60% humidity, temperature 20–22°C, altitude 100 m MSL. Endurance athletes (EA) were instructed via e-mail on how to prepare: avoid any demanding exercise in the 24 hours before CPET, consume a high-carbohydrate meal and hydrate with isotonic beverages 2–3 hours beforehand, and avoid any stimulants or caffeine on the day of the procedure.
The cycle ergometry (CE) examination was performed on a Cyclus-2 cycle ergometer (RBM elektronik-automation GmbH, Leipzig, Germany), and the treadmill (TE) examination was conducted on a mechanical treadmill (h/p/Cosmos quasar, Germany). CPET variables were recorded breath by breath in 10-s intervals using a Hans Rudolph V2 Mask (Hans Rudolph, Inc, Shawnee, KS, USA), a Cosmed Quark CPET gas exchange analyzer (Rome, Italy), and the manufacturer’s dedicated software (from PFT Suite to Omnia 10.0E). HR was measured via ANT with a torso strap included in the Cosmed Quark set (product accuracy comparable to ECG; ±1 bpm). The CPET device was calibrated with reference gas (16% O2; 5% CO2) and turbine flow for each person separately, according to the manufacturer’s recommendations. Equipment software was regularly updated between 2013 and 2021. Three gas analyzers were used, and each was replaced after 36–48 months. Every part of the CPET equipment was periodically verified by the manufacturer’s employees to keep its mechanical certificates valid. Blood lactate (LA) was assessed with a Super GL2 analyzer (Müller Gerätebau GmbH, Freital, Germany). The instrument was prepared individually before each round of analysis and calibrated with a reference solution before each sample set.
Each test began with a 5-min warm-up (walking or pedaling with minimal resistance). The starting load was set according to each participant’s endurance capacity. The initial power for CE was 60–150 W and was increased by 20–30 W every 2 min. The initial speed for TE was 7–12 km·h-1 (described by the participant as a “conversation pace”) at 1% inclination, and the pace was raised by 1 km·h-1 every 2 min. The observer verbally encouraged athletes to sustain the effort for as long as possible so that their endurance could be assessed as accurately as possible. The test was terminated when a plateau in oxygen uptake (VO2) or heart rate (HR) was reached, or when the participant was volitionally unable to maintain the intensity. LA was measured in a 20 µL blood sample taken from a fingertip directly before exercise, after each change in resistance or pace, and 3 min after termination. Samples were obtained without interrupting the CE and TE tests. Before each sample was collected, the first drops of blood were wiped off with a swab. HR (not averaged) was recorded at its highest point during each interval and used in further analysis. Maximal oxygen uptake (VO2max) was defined as the maximum oxygen uptake averaged over the 15-s period at the end of the CPET.
A rigorous inclusion/exclusion process was applied to restrict the validation group to those EAs who achieved maximum exertion during CPET and were free of any factors that could affect VO2max (see Figure 1. Flowchart of the inclusion-exclusion and further groups classification process).
A total of 6,439 EAs underwent CPET. Participants were eligible for preliminary inclusion if they: (1) had ≥3 months of experience in regular running or cycling training, (2) were aged ≥18 years, (3) were within ±3 standard deviations (SD) of the mean for all tested variables (extreme outliers were excluded), (4) had no acute or chronic medical condition (including musculoskeletal injuries or addictions), (5) were not taking any medications, and (6) were not active smokers.
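Criterion (3) can be illustrated with a minimal Python sketch. It assumes the sample SD (n - 1 denominator) computed over the whole tested sample for a single variable; the source does not state which SD estimator was used. The function name `within_three_sd` and the example values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def within_three_sd(values, limit=3.0):
    """Return one flag per value: True if it lies within +/- limit SD of the mean."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD (n - 1 denominator)
    if s == 0:
        # All values are identical, so none can be an outlier.
        return [True] * len(values)
    return [abs(v - m) / s <= limit for v in values]


# Hypothetical example: 20 identical plausible values and one extreme value.
example = [50.0] * 20 + [500.0]
flags = within_three_sd(example)  # only the last value is flagged as an outlier
```

In practice the check would be repeated for every tested variable, and a participant would be kept only if all flags are true.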
Maximum exertion in CPET was defined as fulfilment of ≥6 of the following criteria: (1) respiratory exchange ratio (RER) ≥1.10, (2) presence of a VO2 plateau (increase in VO2 <100 mL·min-1 despite a further increase in running or cycling intensity), (3) respiratory frequency (fR) ≥45 breaths·min-1, (4) declared exertion during CPET ≥18 on the Borg scale, (5) lactate concentration (LA) ≥8 mmol·L-1, (6) increase in speed/power of ≥10% of the RCP value after exceeding the respiratory compensation point (RCP), (7) peak heart rate (HRpeak) ≥15 bpm below predicted maximal heart rate (HRmax).
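The "at least 6 of 7" rule can be sketched in Python as follows. This is an illustration only: the function and parameter names are hypothetical, and the VO2 plateau and heart-rate criteria are passed in as already-evaluated booleans so that the sketch does not add any interpretation of how they were computed.

```python
def meets_maximal_exertion(rer, vo2_plateau, resp_freq, borg, lactate,
                           load_gain_pct_of_rcp, hr_criterion_met,
                           min_criteria=6):
    """Return True if at least `min_criteria` of the seven criteria are met."""
    criteria = [
        rer >= 1.10,                   # (1) respiratory exchange ratio
        bool(vo2_plateau),             # (2) VO2 plateau present
        resp_freq >= 45,               # (3) breaths per minute
        borg >= 18,                    # (4) declared exertion, Borg scale
        lactate >= 8.0,                # (5) lactate, mmol/L
        load_gain_pct_of_rcp >= 10.0,  # (6) speed/power gain beyond RCP, %
        bool(hr_criterion_met),        # (7) heart-rate criterion
    ]
    return sum(criteria) >= min_criteria


# Hypothetical athlete who misses only the RER criterion (6 of 7 met).
accepted = meets_maximal_exertion(1.05, True, 50, 19, 9.0, 12.0, True)
```

Counting booleans keeps the threshold explicit, so the same function can be reused with a different `min_criteria` for sensitivity checks.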
Finally, 5,260 EAs met all inclusion criteria. The population was divided by sex into four age groups (18–30, 31–45, 46–60, and ≥61 years) and four endurance groups: HTEA, REA, LTEA, and “transition”. Endurance classification was based on the speed (km·h-1) or power (W·kg-1) at the RCP, calculated independently for each sex. Speed/power at the RCP was the variable of choice because it is currently described as the parameter most closely corresponding to critical endurance capacity[33,34]. Moreover, classifying participants by a variable other than VO2max made group assignment independent of the factor directly validated in the study. Participants >+1.5 SD from the mean were classified as HTEA (n=309), those between –0.5 SD and +0.5 SD as REA (n=2,033), and those <–1.5 SD as LTEA (n=339). To distinguish the endurance subgroups precisely, participants between ≥+0.5 SD and ≤+1.5 SD or between ≥–1.5 SD and ≤–0.5 SD were classified as “transition” (n=2,579). The models were validated on each of the age and endurance cohorts independently (except the “transition” group) for both TEVO2max and CEVO2max.
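The SD-based endurance classification can be sketched in Python as below. The sketch assumes that z-scores are computed within one sex and one test modality using the sample SD; the function names and the example speeds are hypothetical.

```python
from statistics import mean, stdev


def classify_z(z):
    """Map a z-score of speed/power at the RCP to an endurance group."""
    if z > 1.5:
        return "HTEA"
    if z < -1.5:
        return "LTEA"
    if -0.5 < z < 0.5:
        return "REA"
    # Covers +0.5 to +1.5 and -1.5 to -0.5, boundaries included.
    return "transition"


def classify_endurance(values):
    """Classify each value relative to the mean and SD of its own cohort."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [classify_z((v - m) / s) for v in values]


# Hypothetical speeds at the RCP (km/h) for one sex-specific cohort.
labels = classify_endurance([10.0, 11.0, 12.0, 13.0, 14.0])
```

Because the boundaries ±0.5 SD and ±1.5 SD are assigned to the “transition” group, the three analyzed groups do not touch each other.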
Selected prediction models
Candidate models were identified from previous systematic reviews of CPET (up to February 2019)[35,36] and from an additional literature search of the PubMed, MEDLINE, EMBASE, Scopus, and Web of Science databases (covering March 2019 to December 2021 and including meta-analyses) using the keywords: Cardiopulmonary exercise testing, Cardiorespiratory fitness, Exercise testing, VO2max, VO2peak.
Exclusion criteria were: (1) not reporting VO2max parameters, (2) use of ergometers other than CE or TE during CPET, (3) inclusion of parameters that could not be verified in our sample (declared physical activity level, time to exhaustion), (4) repeatedly generating implausible results (<0 or >100 mL·min-1·kg-1 VO2peak), (5) being derived from a pediatric (the oldest participant <18 years old) or geriatric population (the youngest participant ≥61 years old), (6) being derived before 01.01.2000, (7) not reporting R2 from internal/external validation, (8) being derived from a group of <1000 participants (for the general population) or <200 participants (specifically for the EA population), (9) methodological quality <7 points according to ATS/ACCP guidelines, (10) use of a testing technique other than breath by breath, (11) CPET protocol not carried out following the recommended clinical ATS/ACCP guidelines.
Moreover, the Wasserman et al. model was validated in the study due to its well-established reputation. Equations from 2 meta-analyses[38,39] were also considered because of their wide range of applications for EA. 13 equations from 8 different publications were included in the analysis. Their detailed characteristics are presented in the supplementary material (Supplementary table 1. Prediction equations included in the validation).
Baseline statistics were exported to an Excel file (Microsoft Corporation, Washington, USA) and are presented as mean (±SD and 95% CI) and median for continuous variables, or as frequency (percentage) for categorical variables. Differences between subgroups (all continuous variables) were analyzed using analysis of variance (ANOVA) and Tukey’s HSD post-hoc test. There were no missing data in the whole population; thus, the entire cohort was validated.
External validation followed the recommendations for the validation and interpretation of diagnostic prediction models[30]. In summary, we assessed the accuracy of the equations by comparing the output of the originally established formulas with data obtained directly from CPETs and BC examinations (e.g., VO2max, BMI). A linear model regressing measured VO2max on predicted VO2max (pVO2max) was generated for each equation. Performance, understood as the proximity of the observed and expected CRF, was evaluated using R2 and root mean square error (RMSE). Additionally, the calibration slope (the slope of a linear regression model that includes the model’s linear predictor as the only covariate, with 1 being ideal; C1) and calibration-in-the-large (mean observed compared with mean predicted value, with 0 being ideal; C2) were calculated.
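The four performance measures can be illustrated with a short, dependency-free Python sketch. It is not the script used in the study. It assumes that R2 is the coefficient of determination of the linear model of measured on predicted VO2max, that RMSE is computed on the raw prediction errors (measured minus predicted), and that calibration-in-the-large is the mean measured minus the mean predicted value. The function name and the example values are hypothetical.

```python
from math import sqrt


def validation_metrics(measured, predicted):
    """Return R2, RMSE, calibration slope (C1) and calibration-in-the-large (C2)."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mean_y = sum(measured) / n
    mean_p = sum(predicted) / n
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((y - mean_y) ** 2 for y in measured)
    sxy = sum((p - mean_p) * (y - mean_y) for p, y in zip(predicted, measured))
    if sxx == 0 or syy == 0:
        raise ValueError("measured and predicted values must both vary")
    return {
        "r2": sxy ** 2 / (sxx * syy),  # R2 of measured ~ predicted
        # Assumption: RMSE of the raw prediction errors.
        "rmse": sqrt(sum((y - p) ** 2 for p, y in zip(predicted, measured)) / n),
        "c1": sxy / sxx,               # calibration slope, ideal value 1
        "c2": mean_y - mean_p,         # calibration-in-the-large, ideal value 0
    }


# Hypothetical data: an equation that overpredicts every athlete by 2 units.
result = validation_metrics([40.0, 50.0, 60.0], [42.0, 52.0, 62.0])
```

In this example the slope and R2 are ideal while C2 is negative, which shows how a systematic overprediction appears in calibration-in-the-large and RMSE but not in the slope.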
The ggplot2 package in RStudio (R Core Team, Vienna, Austria; version 3.6.4), a custom-written Python script (Python Software Foundation, Delaware, USA; version 3.10.1), and STATA software (StataCorp, College Station, Texas, USA; version 15.1) were used for statistical analysis. The significance threshold was a two-sided p-value <0.05.