Case study: Health and Retirement Study data
We created a nationally representative cohort of 5,531 community-dwelling seniors enrolled in the HRS who were 70 years old or older at the time of their baseline interview in 2000. The HRS is an ongoing longitudinal survey of a representative sample of all persons in the United States over age 50 that examines changes in health and wealth (22). It is sponsored by the National Institute on Aging (grant number NIA U01AG009740) and conducted by the University of Michigan. We used the public HRS data: the Cross-Wave Tracker file (23) and the RAND HRS data file (24, 25).
The pool of predictors included 39 health-related and demographic variables measured at baseline. All the predictors were categorical variables. We used 4 clinical outcomes encompassing 15 years of follow-up: (1) time to first ADL dependence (including five ADLs: bathing, dressing, toileting, transferring, and eating), (2) time to first IADL difficulty (including two IADLs: managing money and medication), (3) time to first mobility dependence, and (4) time to death.
The Best Average BIC (baBIC) method
Our proposed method, the Best Average BIC (baBIC) method, selected the best subset of common predictors for all 4 outcomes based on the minimum averaged normalized BIC across outcomes. We compared our method with: (1) a method that selects individual subsets of predictors for each outcome (Individual Outcome method), and (2) an enhanced method that creates a best subset of common predictors based on the union of individual subsets obtained in the Individual Outcome approach (Union method).
Information criteria such as the BIC are useful for selecting the best subset of predictors because they allow fair comparisons both within sets of a fixed size and across predictor sets of varying sizes. In contrast, statistics such as the concordance (C) statistic are less useful for selection across sets with different numbers of predictors because, in general, models with more predictors tend to have higher C-statistics than models with fewer predictors. Another advantage of the BIC is that it tends to select more parsimonious models, since it penalizes larger models more heavily than, for example, the AIC.
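To make the penalty comparison concrete, the following sketch (in Python, with hypothetical log-likelihoods; the paper's analyses used SAS) contrasts the AIC's constant penalty of 2 per parameter with the BIC's ln(n) penalty, which is about 8.6 at the HRS sample size of n = 5,531:

```python
import math

def aic(log_likelihood, k):
    # AIC penalizes each parameter by a constant 2
    return -2 * log_likelihood + 2 * k

def bic(log_likelihood, k, n):
    # BIC penalizes each parameter by ln(n), which exceeds 2 once n > 7
    return -2 * log_likelihood + k * math.log(n)

# Hypothetical log-likelihoods for a smaller and a larger model (n as in the HRS cohort)
n = 5531
small = {"ll": -1000.0, "k": 10}
large = {"ll": -985.0, "k": 20}  # modest fit gain from 10 extra predictors

# AIC prefers the larger model here, while BIC's ln(5531) ~ 8.6 penalty
# per parameter prefers the smaller, more parsimonious one
print(aic(small["ll"], small["k"]), aic(large["ll"], large["k"]))
print(bic(small["ll"], small["k"], n), bic(large["ll"], large["k"], n))
```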
The BICs were obtained from survival models. For time to death, we fitted Cox proportional hazards regression models (26). For times to first ADL dependence, IADL difficulty, and mobility dependence, we fitted Fine and Gray competing-risk regression models to appropriately account for the competing risk of death (27). In the baBIC method, we used a normalized BIC (nBIC) to ensure that a given change in BIC from a more complex to a simpler model meant roughly the same thing across the multiple outcomes. The nBIC was computed by dividing the BIC of the fitted model for each outcome by the difference between the BIC of the full model and the BIC of the best individual model. That is:
See Formula 1 in Supplemental Files
Where:
See Formula 2 in Supplemental Files
L: the maximized value of the likelihood function of the fitted model
k: number of parameters estimated by the fitted model
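A minimal sketch of the nBIC computation described above, with hypothetical BIC values (in the study, the actual BICs come from the fitted Cox and Fine-Gray models):

```python
import math

def bic(log_likelihood, k, n):
    # BIC from the maximized likelihood L, number of parameters k, and sample size n
    return -2 * log_likelihood + k * math.log(n)

def normalized_bic(bic_model, bic_full, bic_best_individual):
    # nBIC = BIC(model) / (BIC(full model) - BIC(best individual model)),
    # so a change in nBIC means roughly the same across outcomes
    return bic_model / (bic_full - bic_best_individual)

def average_nbic(bics_by_outcome):
    # bics_by_outcome: one (bic_model, bic_full, bic_best_individual) tuple per outcome
    vals = [normalized_bic(*b) for b in bics_by_outcome]
    return sum(vals) / len(vals)

# Hypothetical BICs for one outcome: model under evaluation, full model, best individual model
print(normalized_bic(2100.0, 2200.0, 2150.0))  # 2100 / (2200 - 2150) = 42.0
```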
The baBIC method started with all 39 (p) predictors and selected the subset of 38 (p-1) predictors with the minimum average nBIC across the 4 outcomes; the lower the nBIC, the better the model. To select this subset, we fitted, for each outcome, all possible subsets of predictors obtained by removing 1 predictor at a time. We then computed the average of the nBICs across the 4 outcomes for each subset and selected the subset of 38 (p-1) predictors with the minimum average nBIC (Fig 1). In the next step of the backward elimination, the method started with these 38 (p-1) predictors and selected the subset of 37 (p-2) predictors that again yielded the minimum average nBIC across the 4 outcomes. The same process continued until only 2 variables were left (i.e. “Male” and “Age decile groups”), which were forced in. Lastly, the method selected as the final subset of predictors the one with the minimum average nBIC across all subset sizes from p-1 to 2 (Fig 2).
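The backward-elimination loop can be sketched as follows; `avg_nbic` is a placeholder for the expensive step of fitting the 4 survival models and averaging their nBICs, and the predictor names are illustrative (the study's implementation was in SAS):

```python
def babic_backward_elimination(predictors, forced, avg_nbic):
    """Backward elimination on the average normalized BIC.

    predictors: full candidate set; forced: variables never removed
    avg_nbic(subset) -> average nBIC across the outcomes for that subset
    (in practice obtained by fitting the survival models; here an
    arbitrary callable supplied by the caller).
    """
    current = list(predictors)
    best_subset, best_score = list(current), avg_nbic(current)
    # Remove one predictor at a time until only the forced variables remain
    while len(current) > len(forced):
        candidates = [
            [p for p in current if p != drop]
            for drop in current if drop not in forced
        ]
        # Keep the (size - 1) subset with minimum average nBIC
        scored = [(avg_nbic(s), s) for s in candidates]
        score, current = min(scored, key=lambda t: t[0])
        if score < best_score:
            best_score, best_subset = score, current
    # Final model: the subset with minimum average nBIC across all sizes
    return best_subset
```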
For the comparative Individual Outcome and Union methods, we followed a similar approach to that described above. The only difference was that the backward elimination was based on the minimum BIC for each individual outcome rather than on the minimum average nBIC across the 4 outcomes. The Union model then contained all the predictors that appeared in at least 1 of the 4 best subsets from the Individual Outcome models.
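A sketch of the Union step, assuming the 4 best Individual Outcome subsets are available as lists of predictor names (illustrative here):

```python
def union_model(individual_subsets):
    # Union model: every predictor that appears in at least one of the
    # best Individual Outcome subsets
    union = set()
    for subset in individual_subsets:
        union |= set(subset)
    return sorted(union)

# Hypothetical best subsets for the 4 outcomes
subsets = [["age", "male", "adl"],
           ["age", "male", "cancer"],
           ["age", "male"],
           ["age", "male", "smoking"]]
print(union_model(subsets))  # ['adl', 'age', 'cancer', 'male', 'smoking']
```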
For all final models, we computed the number of variables and measured predictive accuracy using Harrell’s C-statistic (28). For times to first ADL dependence, IADL difficulty, and mobility dependence, we used the adaptation of Harrell’s C-statistic to the competing-risks setting proposed by Wolbers et al. (29), in which death is recoded as censoring and the time-to-event is set to the longest possible follow-up time of any respondent (i.e. 15 years).
Simulation study
We performed a simulation study to assess the performance of the proposed baBIC method with highly correlated and with uncorrelated data. We created 4 sets of 500 simulations with the same sample size (N=5,531) and the same number of initial predictors (p=39), with the same distributions as in the HRS data. The 4 outcomes in the HRS data were highly correlated based on the pairwise Pearson correlations (range: 0.80-0.91). Thus, to test whether the correlation among the outcomes affected the selection methods, we generated simulated data with both high and low correlation among the outcomes.
Due to the computational cost of fitting competing-risk regression models in the simulation study for times to first ADL dependence, IADL difficulty, and mobility dependence, we used a modified version of the HRS data in which those who died were treated as censored at the longest possible follow-up time of any respondent (i.e. 15 years) (29). Of note, we obtained the same final subset of predictors with and without this simplification in the HRS data.
The times-to-event for correlated outcomes were obtained as follows. First, we simulated 4-variate normal random variables with means of zero, standard deviations of 1, and the correlation structure from the HRS data. Next, we converted the random values to probabilities. For uncorrelated outcomes, we used probabilities simulated from the uniform distribution. These probabilities were then used as look-up values in the observed time-to-event distributions for each of the outcome variables in the HRS data.
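The copula-style simulation of correlated times-to-event can be sketched as follows for two outcomes (the study used 4); the construction of correlated normals from a single correlation `rho` and the empirical-quantile look-up are illustrative simplifications of the multivariate setup:

```python
import math
import random

def normal_cdf(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def simulate_correlated_times(observed_times_a, observed_times_b, rho, n, seed=0):
    """Draw correlated standard normals, convert them to probabilities with
    the normal CDF, and use the probabilities as look-up values in the
    observed (empirical) time-to-event distributions."""
    rng = random.Random(seed)
    sa, sb = sorted(observed_times_a), sorted(observed_times_b)
    out = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Second normal correlated with the first at level rho
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        u1, u2 = normal_cdf(z1), normal_cdf(z2)
        # Empirical-quantile look-up: pick the observed time at quantile u
        ta = sa[min(int(u1 * len(sa)), len(sa) - 1)]
        tb = sb[min(int(u2 * len(sb)), len(sb) - 1)]
        out.append((ta, tb))
    return out
```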
For each simulated dataset, we obtained the final baBIC, Individual Outcome, and Union models. The averages and corresponding 95% confidence intervals (CIs) of Harrell’s C-statistic and of the number of predictors were computed over the 500 simulations for each model. Additionally, we calculated the percentage of simulations in which each variable in the final baBIC model of the HRS data appeared in the simulated baBIC models (percentage of correct inclusion), and the percentage of simulations in which each variable absent from the baBIC model of the HRS data did not appear in the simulated baBIC models (percentage of correct exclusion).
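The percentages of correct inclusion and exclusion can be sketched as follows, with illustrative predictor names:

```python
def inclusion_exclusion_rates(reference_model, all_predictors, simulated_models):
    """Percentage of correct inclusion/exclusion across simulations.

    reference_model: predictors in the final baBIC model of the original data
    simulated_models: list of predictor subsets, one per simulated dataset
    """
    ref = set(reference_model)
    absent = set(all_predictors) - ref
    n = len(simulated_models)
    # Correct inclusion: how often each reference predictor was selected
    inclusion = {p: 100.0 * sum(p in m for m in simulated_models) / n for p in ref}
    # Correct exclusion: how often each non-reference predictor was left out
    exclusion = {p: 100.0 * sum(p not in m for m in simulated_models) / n for p in absent}
    return inclusion, exclusion
```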
All analyses were performed with the SAS® 9.4 (TS1M5) statistical package (Copyright © 2016 by SAS Institute Inc., Cary, NC, USA).