Developing cardiometabolic risk classifiers for youth using handgrip strength, anthropometrics, and demographics: a machine learning approach leveraging National Health and Nutrition Examination Survey Data

Background

Muscular fitness (i.e., strength, endurance and power) is the capacity to actuate work against one’s body weight or other external resistance for a relatively sustained duration [1]. Muscular fitness is commonly evaluated using surrogates, including handgrip strength, push-up, and curl-up. Skeletal muscle is integral to protein synthesis throughout the body. As such, altered muscle metabolism often underlies many common pathologies and chronic diseases like diabetes [2]. Muscular fitness is linked with cardiorespiratory fitness, weight status, and cardiovascular disease risk [3-5]. Particularly, normalized handgrip strength (a surrogate for muscular fitness) inversely associates with clustered cardiometabolic (CMB) risk, including central adiposity and hypertension in youth [6-8]. However, despite the important role CMB risk clustering monitoring could play in identifying, preventing and controlling disease onset, current youth health-related physical fitness (HRPF) assessments like the so called FITNESSGRAM have yet to leverage features such as handgrip strength and machine learning techniques to streamline HRPF testing and consolidate chronic disease risk screenings in schools. Consequently, these assessments do not explicitly predict CMB risk and remain highly time consuming and burdensome to physical education (PE) teachers.

Optimal muscular fitness and muscle metabolic functioning are critical to preventing chronic diseases. Texas passed Senate Bill (SB) 530 in 2007 requiring annual HRPF assessments of public school students across third through twelfth grades (using FITNESSGRAM® protocols) [9]. However, there is evidence some parents and children have reservations about the test [10]. Specifically, 26% of surveyed PE teachers reported negative impressions of physical fitness testing in Texas schools, citing anxiety, teasing, and taunting by peers (both for poor performance and trying hard) when students performed test items in front of others. Some students missed school to avoid testing, and parents called and sent notes to request their children be excluded from testing owing to teasing. Some children cried owing to what they perceived as poor performance on HRPF tests (e.g., inability to complete a push-up). Therefore, compounded by persistent issues such as low recognition of PE as a course among school personnel and teachers withholding students from PE as punishment or for additional instruction time, it is unclear whether programs like FITNESSGRAM® testing actually inform policies to promote physical activity and school health services [11, 12].

Hispanic/Latino children are less active at home and during recess at school (compared to their White peers) [13-16]. In fact, they averaged only 35 minutes of moderate-to-vigorous physical activity (MVPA), which is considerably lower than the recommended 60 or more daily minutes of MVPA for children and youth [17, 18]. Hispanic/Latino children are disproportionately affected by obesity and chronic diseases such as diabetes [19-21]. Further, there was pervasive co-prevalence of muscular fitness deficits (42.3%) and overweight and obesity (40%) among Hispanic/Latino youth in Corpus Christi, Texas[22]. Hispanic/Latino youth are more likely to have high abdominal adiposity, elevated triglycerides, and increased chronic disease risk [19-21, 23]. Ironically, Hispanic/Latino youth were provided fewer school-based health services, including identification of chronic health conditions, student tracking over time, and referral to community-based care [24]. Because cardiovascular events and related morbidity are relatively rare in children, CMB indicators are commonly used to assess risk [25]. Studies have used clustering of a range of CMB markers, including systolic blood pressure, waist circumference, total cholesterol, insulin resistance, and triglycerides to assess CMB risk in children and youth [25, 26].

Like many states, Texas schools screen for Type 2 diabetes by evaluating students for acanthosis nigricans, a dermatologic hyperpigmentation manifestation that sometimes results from hyperinsulinemia and insulin resistance [27]. However, this screening likely misses nascent (i.e., no physical manifestations) metabolic disease risk factors like dyslipidemia and hypertension. Considering the issues around current HRPF testing, it is imperative to examine additional surrogates that are efficient and tractable in schools, and develop classifiers that explicitly predict CMB risk, especially in medically underserved communities where many children may be uninsured/underinsured and not routinely seen by a pediatrician. Considering the need for early prediabetes detection [28], it is critical to examine any potential contributions of race and other demographics on predicting and classifying CMB risk.

Machine learning approaches have been commonly applied to classification problems involving predicting disease risk owing to their capacity to leverage several different methods and identify multivariate interactions and patterns that are optimally predictive of specified endpoints [29]. Supervised learning has been used to classify fundamental locomotor skills (e.g., hopping, running, etc.) [30] and activity type (e.g., walking, standing stationary, etc.), to predict physical activity patterns in older adults [31], and to classify obesity among youth [32]. Although studies have approached prediction problems using predetermined classification methods, inherent peculiarities around shared contexts (e.g., ecological and sociocultural) suggest it may be important to explore several models and identify the best performing ones for the specific problem and dataset. Further, there are no theoretical methods to determine the sample size required to effectively train machine learning models [33]. These dataset attributes underlie variations in performance between different classification algorithms and methods. For example, prior research found that Decision Tree and Support Vector Machine (SVM) outperformed Bayesian and Neural Networks at classifying childhood obesity using features that included push-up test, partial curl-up, and step-up in 12-year-old Malaysian children [32]. Similarly, Decision Tree outperformed Bayesian methods at classifying obesity in children after age two years [34]. Relatedly, it is advantageous to leverage different feature selection techniques [35] (e.g., filter, wrapper, and embedded methods), because doing so broadens the range of important feature combinations considered and decreases model complexity [36].

The problem of developing classifiers from datasets with imbalanced classes has gained some attention in the literature. Broadly, a dataset is described as imbalanced if the discrete categories to be classified are not roughly equally represented in the dataset. Significant underrepresentation of the minority class can skew model learning and result in poor accuracy for predicting the minority class [37]. While a few different methods have been recommended to address class imbalance, under-sampling invariably implies loss of data, which does not seem optimal, especially when the dataset is low-dimensional and has relatively few data points to begin with. Synthetic Minority Oversampling Technique (SMOTE) has been applied to data imbalance problems [29, 38]. Depending on their inherent rules, some classification algorithms (e.g., Naïve Bayes) may tolerate data imbalance, whereas SMOTE-augmented data, which decreases variability between observations, may decrease minority class prediction performance metrics [37, 39].

It is widely recognized that it is critical to promote muscular fitness in children through moderate-to-vigorous physical activity [40] and deliberate integrative neuromuscular training, which has been shown to improve muscular fitness by increasing neuromuscular capacity and physical/motor competence in youth [41]. However, there are issues around HRPF testing, including the apparent emphasis on performance, which can foment peer shaming and dissuade student participation.

The purpose of this study was to examine the feasibility of developing highly accurate models to predict and classify CMB risk in youth using features across demographic, anthropometric, and handgrip strength (i.e., muscular strength) data in a nationally representative youth sample.

Methods

Participants

This study leveraged cross-sectional data from the 2011-2014 National Health and Nutrition Examination Survey (NHANES). Data was collected around the United States through electronic surveys and at mobile examination centers. National Center for Health Statistics personnel collected data in periodic cycles across May 1 through October 31 and November 1 through April 30 from 2011-2014. A total of 19,346 original participant records were screened and delimited by age. Of this sample, 8,322 were between ages 0-19 years. However, blood was only drawn from participants aged 12 years and older and tested during a morning session. Therefore, only 402 records of participants aged 8-18 years had associated demographics data, CMB markers, and muscular strength data. Texas A&M University-Corpus Christi Institutional Review Board approved this study (TAMU-CC-IRB-2020-02-026).

Procedures

Anthropometrics

Standing height and body weight were measured to the nearest 0.1 cm and 0.1 kg, respectively. Standardized BMI z-scores were then calculated to determine respective percentiles for age and sex according to the Centers for Disease Control and Prevention (CDC) BMI-for-age growth charts [42]. Underweight, healthy weight, overweight, and obesity were defined as BMI < 5th percentile, 5th ≤ BMI < 85th percentile, 85th ≤ BMI < 95th percentile, and BMI ≥ 95th percentile, respectively [42, 43]. Waist circumference was measured as the distance around the waist (using a pre-marked reference point that coincides with the iliac crest) to the nearest 0.1 cm at the end of normal expiration during standing using a retractable steel measuring tape. Sagittal abdominal diameter was measured as the anteroposterior height of the abdomen (at a pre-marked reference point that coincides with the iliac crest) to the nearest 0.1 cm at the end of normal expiration while participants lay supine using a Holtain-Kahn caliper. Additional details of procedures for anthropometrics data collection are provided in the NHANES Anthropometry Procedures Manual [44].
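The percentile cut-offs above translate directly into a small lookup. The following is a minimal Python sketch, assuming the BMI-for-age percentile has already been obtained from the CDC growth charts; the function name `weight_status` is hypothetical and not part of the study's pipeline.

```python
def weight_status(bmi_percentile):
    """Map a BMI-for-age percentile (0-100) to a weight-status category.

    Cut-offs follow the text: <5th, 5th to <85th, 85th to <95th, >=95th.
    The function name and string labels are illustrative assumptions.
    """
    if not 0 <= bmi_percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    if bmi_percentile < 5:
        return "underweight"
    if bmi_percentile < 85:
        return "healthy weight"
    if bmi_percentile < 95:
        return "overweight"
    return "obesity"
```

Boundary values fall into the higher category (e.g., exactly the 85th percentile is classed as overweight), which matches the inequalities stated above.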

Cardiometabolic Measures

Blood was collected by a trained phlebotomist in a minimum 9-hour fasted state. Blood specimens were initially processed and stored at -30 °C and subsequently sent to University of Minnesota, Minneapolis, MN for analysis. Details of laboratory quality assurance and monitoring have been outlined previously [45]. Blood lipids, fasting blood glucose, and insulin were measured. Additional details of procedures for CMB measures are provided in the NHANES Anthropometry Procedures Manual [44]. Homeostatic model assessment of insulin resistance (HOMA-IR) was calculated using the HOMA2 Calculator (Oxford, England) [46].

CMB risk was delineated as having a cluster of three risk factors among the following: mean systolic blood pressure, mean diastolic blood pressure, HDL-cholesterol (mg/dL), LDL-cholesterol (mg/dL), total cholesterol (mg/dL), insulin (mg/dL), triglycerides (mg/dL), and fasting glucose (mg/dL). Systolic blood pressure less than 120 is normal, 120 to 139 is prehypertension, and greater than 139 is hypertension [47]. Similarly, diastolic blood pressure less than 80 is normal, 80 to 89 is prehypertension, and greater than 89 is hypertension [47]. Total cholesterol less than 200 mg/dL is normal, 200 to 239 mg/dL is borderline high, and greater than or equal to 240 mg/dL is considered high [47]. HDL greater than 45 mg/dL is normal, 40 to 45 mg/dL is borderline low, and less than 40 mg/dL is low [48, 49]. LDL less than 110 mg/dL is normal, 110 to 130 mg/dL is borderline high, and greater than 130 mg/dL is high. Triglycerides less than 90 mg/dL is normal, 90 to 129 mg/dL is borderline high, and 130 mg/dL or greater is high [49]. Glucose 3.0 to 25.0 mmol/L and insulin 20 to 400 pmol/L are considered normal. Because HOMA-IR does not have a universally agreed cut-off, especially among youth, a score equal to or greater than the 90th percentile (i.e., 27) of the current sample was considered high. Because objective scans of body fat content were not available in the original NHANES dataset, obesity (determined using CDC Growth Charts) was deemed an additional CMB risk factor [28], such that observations with two individual risk factors across the lipid, blood pressure, glucose, and insulin profiles were deemed to have CMB risk if they were obese. This increased the percentage of the sample with CMB risk from the initial 12% to 28%.
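The clustering rule described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's actual code: it assumes a marker counts as a risk factor whenever it falls outside the "normal" range quoted above (the text does not say whether borderline values were counted), it omits the glucose and insulin range checks for brevity, and the dictionary keys and function names are hypothetical.

```python
def count_risk_factors(markers, homa_ir_cutoff=27.0):
    """Count markers outside the 'normal' ranges quoted in the text.

    ASSUMPTION: borderline values count as risk factors; the study may
    have counted only the 'high'/'low' categories instead.
    """
    flags = [
        markers["systolic"] >= 120,        # mmHg, normal is < 120
        markers["diastolic"] >= 80,        # mmHg, normal is < 80
        markers["total_chol"] >= 200,      # mg/dL, normal is < 200
        markers["hdl"] <= 45,              # mg/dL, normal is > 45
        markers["ldl"] >= 110,             # mg/dL, normal is < 110
        markers["triglycerides"] >= 90,    # mg/dL, normal is < 90
        markers["homa_ir"] >= homa_ir_cutoff,  # sample 90th percentile
    ]
    return sum(flags)


def has_cmb_risk(markers, obese, homa_ir_cutoff=27.0):
    """Three clustered risk factors, or two plus obesity, imply CMB risk."""
    n = count_risk_factors(markers, homa_ir_cutoff)
    return n >= 3 or (obese and n >= 2)


# Hypothetical participant: elevated systolic pressure and borderline-low HDL.
example = {"systolic": 125, "diastolic": 70, "total_chol": 180,
           "hdl": 42, "ldl": 100, "triglycerides": 80, "homa_ir": 10}
```

With the hypothetical record above, two markers are flagged, so the participant would be labelled "At Risk" only when also obese, mirroring the obesity adjustment described in the text.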

Handgrip Strength

Muscle strength was examined using the NHANES handgrip test, which was developed in collaboration with the National Cancer Institute and designed to provide nationally representative data on muscle strength, so that associations between muscle strength and risk factors such as obesity and CMB risk can be studied. The isometric grip strength test was administered using a Takei digital grip strength dynamometer (T.K.K. 5401 Grip-D; Takei, Niigata, Japan). After calibrating the handgrip dynamometer and adjusting the device for grip size, participants were asked to squeeze the dynamometer as hard as possible with each hand in a standing or seated position. For the handgrip test, participants were instructed to grasp the dynamometer between the fingers and palm at the base of the thumb, stand upright with the feet shoulder width apart, and maintain a neutral wrist with the device pointing downwards (at the level of the thigh) without touching the body. Participants were instructed to look straight ahead, inhale prior to squeezing, squeeze with the palm facing the thigh, and exhale while squeezing. To ensure maximal effort, participants were instructed to squeeze as hard as they could until they could not squeeze any harder. Each hand was tested three times, and the hands were alternated, thereby resulting in 1 minute of rest on each hand. Efforts were adjudged to be maximal if squeezing was observably accompanied by slight shaking. Although all participants aged 6 years and older were tested, only participants aged 12-18 years without prior hand or wrist surgery who stood unassisted for the duration of the test were included in this study. Further, participants were excluded if they indicated any hand pain or sat during the muscle strength testing. Participants were also excluded if they were unable to flex the second interphalangeal joint on their index finger (on the hand being tested) to 90°.

Data Analysis

The 2011-2014 NHANES transport files were accessed in February 2020 by downloading the SAS Universal Viewer (SAS, Cary, NC) and saving the associated data as a CSV file. Further data reduction and processing were done in Excel (Microsoft Corporation, Redmond, WA) and MATLAB R2019b (MathWorks, Natick, MA). There were 16 initial features, namely gender, age (in years), race, number of people in the household, number of people in the family, number of children 5 years or younger, number of children 6-17 years, annual household income, annual family income, ratio of family income to poverty, body weight (kg), height (cm), BMI (kg/m²), waist circumference, average sagittal abdominal diameter, and combined handgrip strength. Previously, absolute handgrip strength did not associate with metabolic syndrome in male and female adults, whereas handgrip strength normalized by body weight and by BMI both did [50]. Therefore, combined handgrip strength was normalized to body weight and BMI in this study, thereby resulting in 18 total features. Missing data points were imputed using the median score of the respective weight class, age, and gender. Categorical variables, namely gender and race, were maintained as discretized in the original dataset (e.g., male = 1; female = 2). There were 402 eligible records (298 negative and 104 positive cases). Twenty percent of the dataset (i.e., 40 positive and 40 negative records) was separated as the test set (i.e., for further internal validation). All 18 predictors (i.e., features) were recursively combined and their capacity to separate the classes visually examined using scatter plots.

In this study, “0” represented the negative class (i.e., “Not At Risk” for CMB disease) and “1” represented the positive class (i.e., “At Risk” for CMB disease). Approximately 72% of the total original observations did not have CMB risk. Such imbalance in the distribution of target classes can adversely impact the performance of classification models [38]. Also, considering that the cost of misclassifying observations with CMB risk as “Not At Risk” far exceeds that of the reverse error, it was important to oversample the minority class to mitigate any potential effects of data imbalance on model training with the original dataset. Therefore, the Synthetic Minority Over-Sampling Technique (SMOTE) was implemented [29, 38, 39]. SMOTE generates a new data point by taking the difference vector between a minority-class data point and one of its nearest minority-class neighbors, multiplying it by a random number between 0 and 1, and adding the resulting vector to the original (i.e., non-synthetic) data point [39]. Considering the class distribution ratio of 4:1 (i.e., 258 negative to 64 positive class records) in the training set, a SMOTE package was implemented in Python 3.7 (Python Software Foundation, Wilmington, DE) to resolve the imbalance. As such, the 64 positive cases (minority class) were oversampled by 400%. This resulted in 257 minority class observations and a total of 514 balanced records.
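The interpolation step that SMOTE performs can be illustrated with a short, standard-library Python sketch. This is not the package used in the study (which is not named in the text); the function name `smote_sketch`, the default `k`, and the random choice of base points are simplifying assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points from a list of minority-class rows.

    Each synthetic point lies on the line segment between a randomly
    chosen minority point and one of its k nearest minority neighbours.
    ASSUMPTION: base points are drawn at random; the original algorithm
    instead loops over every minority point a fixed number of times.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # k nearest minority-class neighbours by Euclidean distance
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # random number in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority class (values are illustrative only).
minority_rows = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_rows = smote_sketch(minority_rows, n_new=10, k=2)
```

Because every synthetic row is a convex combination of two real minority rows, the new points never leave the region spanned by the minority class, which is also why SMOTE tends to reduce variability between observations.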

Features were narrowed down to the five most salient using three different feature selection methods, i.e., filter (SelectKBest), wrapper (Recursive Feature Elimination), and embedded (Random Forest) [36] (Table 2). The respective feature selection packages were implemented in Python. Subsequently, domain knowledge around correlates of obesity (a strong risk factor for CMB diseases) and school health-related fitness testing practicalities was leveraged to select the features best suited to the classification problem at hand. Classifiers were then developed in the MATLAB Classification Learner application, first using the balanced dataset. Several models were fit using a variety of algorithms, including Decision Tree, Support Vector Machine (SVM), Naïve Bayes, and Ensemble. Five-fold cross-validation was employed to limit overfitting in the training phase.
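For readers unfamiliar with k-fold cross-validation, the partitioning it relies on can be sketched as below. The MATLAB Classification Learner application handles this internally, so the Python function here (`k_fold_indices`, a hypothetical name) only illustrates the idea; it is an unstratified split, which is an assumption and may differ from the tool's behavior.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds.

    Records are shuffled once, dealt into k folds, and each fold serves
    as the validation set exactly once while the rest form the training set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


# 514 is the size of the balanced training set reported in the text.
splits = list(k_fold_indices(514, k=5))
```

Each record is used for validation once and for training four times, so the averaged validation performance reflects data the model did not see during fitting in that fold.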

Resulting models were evaluated using Receiver Operating Characteristic (ROC) curve analyses. Accuracy, the associated Area Under the Curve (AUC) (where AUC ≥ 0.8 indicates good discrimination), the True Positive Rate (TPR) (i.e., sensitivity or recall), and the False Positive Rate (FPR) (i.e., 1 - Specificity) indicated model performance. Overall, model saliency was adjudged considering the recall, precision, and F-Measure magnitudes, and performance when deployed to classify the test data. Precision refers to the capacity to identify only the relevant cases, while recall is the capacity to identify all cases of interest within a dataset. Maximizing precision decreases the incidence of false positives, while maximal recall reduces the instances of false negatives. F-Measure (the harmonic mean of precision and recall) was also adopted, because it penalizes extreme values of precision and recall.
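These metrics all derive from the four cells of a confusion matrix, as the following minimal Python sketch shows. The function name and the example counts are hypothetical; the counts merely mimic a test set of 40 positive and 40 negative records and are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall (TPR), FPR, and F-Measure."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F-Measure is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "fpr": fpr, "f_measure": f_measure}


# Hypothetical counts: 32 of 40 positives and 36 of 40 negatives correct.
metrics = classification_metrics(tp=32, fp=4, tn=36, fn=8)
```

For these illustrative counts, recall is 0.80, the FPR is 0.10, and accuracy is 0.85; the F-Measure sits between precision and recall but closer to the lower of the two, which is the penalizing behavior noted above.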

Statistical Analysis

Spearman and Pearson’s bivariate correlations were calculated and examined (Table 1 in the supplement). Correlations up to 0.899 were tolerated, such that two or more related features with a correlation equal to or greater than 0.9 were considered collinear. Correlation coefficients were considered significant at the 0.05 level (2-tailed), i.e., P<.05.
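A collinearity screen of this kind can be sketched in a few lines of Python. The example below uses Pearson's coefficient only and flags pairs by absolute value, which is an assumption (the text does not state how negative correlations were handled); the function names and feature values are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def collinear_pairs(features, threshold=0.9):
    """Return feature-name pairs whose |r| meets or exceeds the threshold."""
    return [(a, b) for a, b in combinations(features, 2)
            if abs(pearson(features[a], features[b])) >= threshold]


# Illustrative values only; not drawn from the NHANES records.
toy_features = {
    "weight": [40, 50, 60, 70],
    "bmi": [18, 21, 24, 27.5],
    "household_size": [3, 5, 2, 4],
}
flagged = collinear_pairs(toy_features)
```

In this toy example only the weight and BMI columns are flagged, so one of the two would be a candidate for exclusion before model fitting.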

Discussion

This study examined the feasibility of developing highly accurate models to classify CMB risk in youth using features across demographic, anthropometric, and handgrip strength (i.e., muscular strength) data in a nationally representative youth sample. The top five features selected using Recursive Feature Elimination, a wrapper method, produced the best performing CMB risk predicting models. Features, namely number of people in household, number of children five years or younger, annual household income, combined handgrip strength, height, and waist circumference were leveraged (Table 2). The most salient corresponding models were ones fit using Discriminant, Logistic, and SVM algorithms (Table 3B). When deployed, the Quadratic Discriminant model accurately classified 83% and 93% of the positive and negative classes within the test data, respectively, while the Logistic Regression model accurately classified 80% and 88% of the positive and negative classes, respectively (Table 3B). The Linear SVM model accurately classified 80% and 85% of the positive and negative classes, respectively (Table 3B). Other salient models with similar features and their respective performance metrics are listed in Table 3B.

Consistent with previous reports of varying performances across different classification algorithms, specific algorithms were more salient in this study owing to their superior performance involving specific clusters of features. For example, previous work found that Decision Tree and Support Vector Machine (SVM) outperformed Bayesian and Neural Networks at classifying childhood obesity using features that included push-up test, partial curl-up, and step-up in 12-year-old Malaysian children [32]. Similarly, Decision Tree outperformed Bayesian methods at classifying obesity in children after age two years [34]. In the current study, model performance varied across predictive algorithms, feature selection methods, and the resulting cluster of salient features. Specifically, the SelectKBest method selected ratio of family income to poverty, combined handgrip strength, combined handgrip strength normalized to body weight, height, and waist circumference as the top five predictive features. When features were selected using SelectKBest, the related Decision Tree, Discriminant, and KNN algorithms outperformed Logistic Regression, SVM, and Naïve Bayes models when evaluated using the test data. A limitation of filter methods is that they fail to account for the dependency between features and may consequently not select the most important features [35]. Although the correlation between combined handgrip strength normalized and un-normalized to body weight did not meet the threshold for exclusion, it is interesting that only SelectKBest, a filter method, selected both as salient features. This appears consistent with its reported tendency to ignore interdependencies when selecting features [35]. In contrast, wrapper methods such as Recursive Feature Elimination account for dependencies between features and outperform filter methods at selecting the most important features [35]. Recursive Feature Elimination selected number of people in household, annual household income, number of children 5 years or younger, combined handgrip strength, and waist circumference, but excluded combined handgrip strength normalized to body weight. Notably, related Discriminant, Logistic Regression, and SVM models outperformed Decision Tree, Naïve Bayes, and KNN algorithms when evaluated using the test data. Lastly, Random Forest, an embedded method, selected annual household income, ratio of family income to poverty, combined handgrip strength, height, and waist circumference as the five most salient features. Similar to Recursive Feature Elimination, Random Forest also excluded combined handgrip strength normalized to body weight. Related models, namely Decision Tree, SVM, and KNN, outperformed Discriminant, Logistic Regression, and Naïve Bayes models when evaluated using the test data.

Although multiple performance metrics were evaluated (Tables 3A, 3B, and 3C), because the practical cost of misclassifying observations as “At Risk” (i.e., False Positive) is much less consequential than misclassifying observations as “Not At Risk” (i.e., False Negative), the balance between model sensitivity and precision (i.e., highest F-Measure score) was ultimately interpreted as having the greatest contextual significance in this study. Specifically, parents of a child with CMB risk who is predicted as not having CMB risk may not see the need to modify or adopt lifestyle factors such as increased physical activity, decreased sedentary time, and reduced sugar-sweetened beverage consumption, which will likely help control the risk. Even when families may contemplate some of these changes presumably resulting from encounters with other health promotion campaigns/exposures, knowing their child may be at risk for CMB disease will likely infuse a level of urgency that may not otherwise inform their decisions around lifestyle factors and seeking professional help. Therefore, the consequences of a false negative classification could be grave, especially in medically underserved communities, where children may not be routinely seen by a pediatrician. On the upside, these models are examined as potential tools to predict CMB risk, not diagnose CMB disease. Therefore, they could be highly valuable at identifying children who are at risk and serve as a basis to alert and connect parents or primary caregivers with resources such as affordable and free community-based medical and lifestyle factor services.

From the perspective of holding health as a shared value (i.e., where parents and school administrators jointly value and prioritize student health), having scalable validated predictive models for CMB risk may be more efficient at ensuring all children are regularly screened and those who are most at risk for CMB disease are identified and provided support to help control risk. The current practice in the state of Texas is that school health services personnel evaluate students for acanthosis nigricans (a manifestation of insulin resistance) as the sole mechanism to screen for Type 2 diabetes. Deploying accurate predictive models could potentially identify risk while physical manifestations such as acanthosis nigricans that precipitate CMB disease are nascent or altogether absent. A screening protocol that includes handgrip strength and waist circumference could consolidate muscular fitness and health screenings and increase collaboration between PE teachers and school nurses, thereby

Strengths and Weaknesses

This study has several strengths. The implementation of SMOTE allowed several models to be trained on a balanced dataset. This increased confidence to examine other models besides Naïve Bayes, which was previously shown to be tolerant of data imbalance compared to other classification algorithms [39]. As such, any concerns around potential bias that may skew model performance towards the majority class owing to an imbalanced training set were mitigated. The use of a balanced dataset yielded relatively highly accurate models that are potentially scalable in settings like schools where the first line of screening for metabolic disease involves evaluating students for an observable manifestation of hyperinsulinemia (i.e., acanthosis nigricans). This is the first study to demonstrate the feasibility of developing models that could considerably improve surveillance by alerting school health services personnel to children who might be at risk for CMB disease precipitators even in their nascent stages. The salience of handgrip strength as an important feature opens up the prospect of consolidating and deploying the same surveillance models to predict CMB risk, evaluate muscular fitness, and optimally inform primary care referrals and preventive services. Lastly, this study leveraged existing NHANES data related to chronic disease surveillance, thereby precluding the need for new data collection and associated resources.

This study has several limitations, including a relatively small dataset. Further, although the data was from a nationally representative sample, only 28% of the original dataset had greater than three individual risk factors (e.g., hypertension, high glucose, high triglycerides, etc.) and was therefore categorized as having CMB risk. While the dataset was synthetically balanced, a documented disadvantage of SMOTE is that, while synthetically generating data points, it fails to consider that neighboring data points can be from other classes. This failure can result in increased overlap between classes, thereby introducing additional noise to the dataset. However, the imbalanced dataset used was preprocessed and grouped such that neighboring data points belonged to the same class. It does not appear such noise introduction adversely impacted the models developed following SMOTE implementation in this study, as evidenced by their superior performance metrics over the models trained on the imbalanced dataset. Also, the data is cross-sectional in nature. As such, current models do not establish any causal longitudinal relationships between salient features and CMB risk. Notably, even the best performing models resulted in false negatives and/or false positives in the order of low, albeit double-digit, percentages. Additionally, these models have yet to be externally validated using data from an unrelated sample. Lastly, body weight (kg) rather than body mass (kg) was used in this study in order to maintain consistency with language in the widely known NHANES dataset. The current models are not intended to diagnose chronic disease; rather, they predict cross-sectional risk of chronic disease related to CMB risk clustering.

References

1. Stodden D, Sacko R, Nesbitt D: A Review of the Promotion of Fitness Measures and Health Outcomes in Youth. Am J Lifestyle Med 2017, 11(3):232-242.
2. Wolfe RR: The underappreciated role of muscle in health and disease. Am J Clin Nutr 2006, 84(3):475-482.
3. Smith JJ, Eather N, Morgan PJ, Plotnikoff RC, Faigenbaum AD, Lubans DR: The health benefits of muscular fitness for children and adolescents: a systematic review and meta-analysis. Sports Med 2014, 44(9):1209-1223.
4. Burns RD, Brusseau TA: Muscular strength and endurance and cardio-metabolic health in disadvantaged Hispanic children from the U.S. Prev Med Rep 2017, 5:21-26.
5. Ortega FB, Ruiz JR, Castillo MJ, Sjostrom M: Physical fitness in childhood and adolescence: a powerful marker of health. Int J Obes (Lond) 2008, 32(1):1-11.
6. Zhang R, Li C, Liu T, Zheng L, Li S: Handgrip Strength and Blood Pressure in Children and Adolescents: Evidence From NHANES 2011 to 2014. Am J Hypertens 2018, 31(7):792-796.
7. Peterson MD, Saltarelli WA, Visich PS, Gordon PM: Strength capacity and cardiometabolic risk clustering in adolescents. Pediatrics 2014, 133(4):e896-903.
8. Garcia-Hermoso A, Vegas-Heredia ED, Fernandez-Vergara O, Ceballos-Ceballos R, Andrade-Schnettler R, Arellano-Ruiz P, Ramirez-Velez R: Independent and combined effects of handgrip strength and adherence to a Mediterranean diet on blood pressure in Chilean children. Nutrition 2019, 60:170-174.
9. Morrow JR, Jr., Martin SB, Jackson AW: Reliability and validity of the FITNESSGRAM: quality of teacher-collected health-related fitness surveillance data. Res Q Exerc Sport 2010, 81(3 Suppl):S24-30.
10. Zhu W, Welk GJ, Meredith MD, Boiarskaia EA: A survey of physical education programs and policies in Texas schools. Res Q Exerc Sport 2010, 81(3 Suppl):S42-52.
11. von Haaren-Mack B, Schaefer A, Pels F, Kleinert J: Stress in Physical Education Teachers: A Systematic Review of Sources, Consequences, and Moderators of Stress. Res Q Exerc Sport 2019:1-19.
12. Suminski RR, Blair RI, Lessard L, Peterson M, Killingsworth R: Physical education teachers' and principals' perspectives on the use of FitnessGram. SAGE Open Med 2019, 7:2050312119831515.
13. McKenzie TL, Sallis JF, Nader PR, Broyles SL, Nelson JA: Anglo- and Mexican-American preschoolers at home and at recess: activity patterns and environmental influences. J Dev Behav Pediatr 1992, 13(3):173-180.
14. McKenzie TL, Sallis JF, Elder JP, Berry CC, Hoy PL, Nader PR, Zive MM, Broyles SL: Physical activity levels and prompts in young children at recess: a two-year study of a bi-ethnic sample. Res Q Exerc Sport 1997, 68(3):195-202.
15. O’Connor TM, Cerin E, Hughes SO, Robles J, Thompson D, Baranowski T, Lee RE, Nicklas T, Shewchuk RM: What Hispanic parents do to encourage and discourage 3-5 year old children to be active: a qualitative study using nominal group technique. Int J Behav Nutr Phys Act 2013, 10:93.
16. Ruiz R, Gesell SB, Buchowski MS, Lambert W, Barkin SL: The Relationship Between Hispanic Parents and Their Preschool-Aged Children's Physical Activity. Pediatrics 2011, 127(5):888-895.
17. Armstrong B, Lim CS, Janicke DM: Park Density Impacts Weight Change in a Behavioral Intervention for Overweight Rural Youth. Behav Med 2015, 41(3):123-130.
18. Evenson KR, Arredondo EM, Carnethon MR, Delamater AM, Gallo LC, Isasi CR, Perreira KM, Foti SA, van Horn L, Vidot DC et al: Physical Activity and Sedentary Behavior among US Hispanic/Latino Youth: The SOL Youth Study. Med Sci Sports Exerc 2019, 51(5):891-899.
19. Ogden CL, Carroll MD, Kit BK, Flegal KM: Prevalence of obesity and trends in body mass index among US children and adolescents, 1999-2010. JAMA 2012, 307(5):483-490.
20. Lawrence JM, Mayer-Davis EJ, Reynolds K, Beyer J, Pettitt DJ, D'Agostino RB, Jr., Marcovina SM, Imperatore G, Hamman RF, SEARCH for Diabetes in Youth Study Group: Diabetes in Hispanic American youth: prevalence, incidence, demographics, and clinical characteristics: the SEARCH for Diabetes in Youth Study. Diabetes Care 2009, 32 Suppl 2:S123-132.
Chukwueke I, Cordero-Macintyre Z: Overview of type 2 diabetes in Hispanic Americans. Int J Body Compos Res 2010, 8(Supp):77-81.
Ajisafe T, Garcia T, Fanchiang H: Musculoskeletal fitness measures are not created equal: an assessment of school children in Corpus Christi, Texas. Front Public Health - Child Health and Human Development 2018, 6(142):1-11.
Studies to Treat or Prevent Pediatric Type 2 Diabetes Prevention Study Group: Prevalence of the metabolic syndrome among a racially/ethnically diverse group of U.S. eighth-grade adolescents and associations with fasting insulin and homeostasis model assessment of insulin resistance levels. Diabetes Care 2008, 31(10):2020-2025.
Tiu GF, Leroy ZC, Lee SM, Maughan ED, Brener ND: Characteristics Associated With School Health Services for the Management of Chronic Health Conditions. J Sch Nurs 2019:1059840519884626.
Adegboye AR, Anderssen SA, Froberg K, Sardinha LB, Heitmann BL, Steene-Johannessen J, Kolle E, Andersen LB: Recommended aerobic fitness level for metabolic health in children and adolescents: a study of diagnostic accuracy. Br J Sports Med 2011, 45(9):722-728.
Lobelo F, Pate RR, Dowda M, Liese AD, Ruiz JR: Validity of cardiorespiratory fitness criterion-referenced standards for adolescents. Med Sci Sports Exerc 2009, 41(6):1222-1229.
Kong AS, Williams RL, Smith M, Sussman AL, Skipper B, Hsi AC, Rhyne RL, Clinicians RN: Acanthosis nigricans and diabetes risk factors: prevalence in young persons seen in southwestern US primary care practices. Ann Fam Med 2007, 5(3):202-208.
Spurr S, Bally J, Hill P, Gray K, Newman P, Hutton A: Exploring the Prevalence of Undiagnosed Prediabetes, Type 2 Diabetes Mellitus, and Risk Factors in Adolescents: A Systematic Review. J Pediatr Nurs 2020, 50:94-104.
Selya AS, Anshutz D: Machine Learning for the Classification of Obesity from Dietary and Physical Activity Patterns. In: Giabbanelli P, Mago V, Papageorgiou E (eds) Advanced Data Analytics in Health. Smart Innovation, Systems and Technologies, vol 93. Springer, Cham 2018.
Ajisafe T, Um D: Exploring the feasibility of classifying fundamental locomotor skills using an instrumented insole and machine learning techniques. In: Duffy VG (Ed) Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management - Human Body and Motion Lecture Notes in Computer Science, vol 11581 Springer, Cham 2019.
Zheng Y, Xie J, Vo TVT, Lee B, Ajisafe T: Predicting Daily Physical Activity Level for Older Adults using Wearable Activity Trackers. In: Zhou, J, Salvendy, G (Eds) Human Aspects of IT for the Aged Population - Social Media, Games, and Assistive Environments Lecture Notes in Computer Science, vol 11593 Springer, Cham 2019.
Abdullah FS, Abd Manan NS, Ahmad A, Wafa SW, Shahril MR, Zulaily N, Amin RM, Ahmed A: Data Mining Techniques for Classification of Childhood Obesity Among Year 6 School Children. Adv Intell Syst 2017, 549:465-474.
Kurisu K, Yoshiuchi K, Ogino K, Oda T: Machine learning analysis to identify the association between risk factors and onset of nosocomial diarrhea: a retrospective cohort study. PeerJ 2019, 7:e7969.
Dugan TM, Mukhopadhyay S, Carroll A, Downs S: Machine Learning Techniques for Prediction of Early Childhood Obesity. Appl Clin Inform 2015, 6(3):506-520.
Ang JC, Mirzal A, Haron H, Hamed H: Supervised, unsupervised and semisupervised feature selection: a review on gene selection. IEEE/ACM Trans Comput Biol Bioinform 2015, 13(5):971 - 989.
Jain D, Singh V: Feature selection and classification systems for chronic disease prediction: A review. Egyptian Informatics Journal 2018, 19:179-189.
Blagus R, Lusa L: SMOTE for high-dimensional class-imbalanced data. BMC Bioinformatics 2013, 14:106.
Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP: SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research 2002, 16(2002):321–357.
Ramezankhani A, Pournik O, Shahrabi J, Azizi F, Hadaegh F, Khalili D: The Impact of Oversampling with SMOTE on the Performance of 3 Classifiers in Prediction of Type 2 Diabetes. Med Decis Making 2016, 36(1):137-144.
Sacheck JM, Amin SA: Cardiorespiratory Fitness in Children and Youth: A Call for Surveillance, But Now How Do We Do It? Exerc Sport Sci Rev 2018, 46(2):65.
Faigenbaum AD, Myer GD, Farrell A, Radler T, Fabiano M, Kang J, Ratamess N, Khoury J, Hewett TE: Integrative neuromuscular training and sex-specific fitness performance in 7-year-old children: an exploratory investigation. J Athl Train 2014, 49(2):145-153.
Kuczmarski RJ, Ogden CL, Guo SS, Grummer-Strawn LM, Flegal KM, Mei Z, Wei R, Curtin LR, Roche AF, Johnson CL: 2000 CDC Growth Charts for the United States: methods and development. Vital Health Stat 11 2002(246):1-190.
Racette SB, Yu L, DuPont NC, Clark BR: BMI-for-age graphs with severe obesity percentile curves: tools for plotting cross-sectional and longitudinal youth BMI data. BMC Pediatr 2017, 17(1):130.
National Health and Nutrition Examination Survey (NHANES) Anthropometry Procedures Manual [online]. 2013. [accessed March 20, 2019]. URL: https://wwwn.cdc.gov/nchs/data/nhanes/2013-2014/manuals/2013_Anthropometry.pdf
Westgard JO, Barry PL, Hunt MR, Groth T: A multi-rule Shewhart chart for quality control in clinical chemistry. Clin Chem 1981, 27(3):493-501.
Wallace TM, Levy JC, Matthews DR: Use and abuse of HOMA modeling. Diabetes Care 2004, 27(6):1487-1495.
Saydah S, Bullard KM, Imperatore G, Geiss L, Gregg EW: Cardiometabolic Risk Factors Among US Adolescents and Young Adults and Risk of Early Mortality. Pediatrics 2013, 131(3):e679–e686. doi:10.1542/peds.2012-2583.
Fox CK, Kaizer AM, Ryder JR, Rudser KD, Kelly AS, Kumar S, Gross AC, Group PW: Cardiometabolic risk factors in treatment-seeking youth versus population youth with obesity. Obes Sci Pract 2018, 4(3):207-215.
Cook S, Weitzman M, Auinger P, Nguyen M, Dietz WH: Prevalence of a metabolic syndrome phenotype in adolescents: findings from the third National Health and Nutrition Examination Survey, 1988-1994. Arch Pediatr Adolesc Med 2003, 157(8):821-827.
Chun SW, Kim W, Choi KH: Comparison between grip strength and grip strength divided by body weight in their relationship with metabolic syndrome and quality of life in the elderly. PLOS ONE 2019, 14(9).

Tables

Table 1. Descriptive and anthropometric attributes (Mean ± SD) of the datasets leveraged prior to and following the train-test data split and subsequent oversampling.

Dataset Attribute	Dataset (prior to split)	Training Set (prior to oversampling)	Training Set (after oversampling)
Number of records	402	322	514
Percentage of females	49%	49%	46%
Percentage of males	51%	51%	54%
Age (years)	15.4 ± 1.8	14.6 ± 1.8	14.6 ± 1.8
Height (cm)	167.4 ± 9.1	164.9 ± 10.1	165.9 ± 9.9
Body weight (kg)	73.2 ± 21.5	63.9 ± 19.1	70.7 ± 21.3

Percent Ethnic Groups	Mexican American	Other Hispanic	Non-Hispanic White	Non-Hispanic Black	Non-Hispanic Asian	Other Race
20%	25%	29%	13%	19%	10%	25%	29%	13%	17%	10%	24%	30%	16%

Table 2. Feature Selection algorithms used and the resulting five most salient features

Algorithm	5 Most Salient Features
SelectKBest	RFIP	CHGS	CHGSW	Height	Waist Circumference
Recursive Feature Elimination	NPHH	NCFYY	AHHI	CHGS	Waist Circumference
Decision Trees	AHHI	RFIP	CHGS	Height	Waist Circumference

Abbreviations: Ratio of family Income to Poverty, RFIP; Number of People in Household, NPHH; Annual Household Income, AHHI; Number of Children Five Years or Younger, NCFYY; Combined Handgrip Strength, CHGS; Combined Handgrip Strength normalized to body weight, CHGSW.

Table 3A. Salient CMB risk classification models with associated performance metrics, including ROC Curve Analysis and Confusion Matrix outcomes, and model evaluation results. Features leveraged were selected using SelectKBest algorithm.

*Model Training*
Model	Accuracy (%)	AUC	FPR	FNR	Recall	Precision	F-Measure
Coarse Tree	79	0.81	0.24	0.18	0.82	0.77	0.80
Quadratic Discriminant	79.7	0.85	0.23	0.17	0.83	0.78	0.81
Logistic Regression	79.6	0.85	0.22	0.19	0.81	0.79	0.80
Kernel Naïve Bayes	82.3	0.84	0.21	0.14	0.86	0.80	0.83
Quadratic SVM	79.6	0.85	0.25	0.16	0.84	0.77	0.80
Weighted KNN	85.2	0.94	0.25	0.05	0.95	0.79	0.86

*Model Evaluation*
Model	Positive	Negative
Coarse Tree	85.0	77.5
Quadratic Discriminant	80.0	80.0
Logistic Regression	72.5	82.5
Kernel Naïve Bayes	80.0	72.5
Quadratic SVM	72.5	80.0
Weighted KNN	80.0	80.0

Features leveraged are ratio of family income to poverty, combined handgrip strength, combined handgrip strength normalized to body weight, height, and waist circumference. Abbreviations: Receiver Operating Characteristics, ROC; Area Under Curve, AUC; False Positive Rate, FPR; False Negative Rate, FNR; Support Vector Machine, SVM; K-Nearest Neighbor, KNN.
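The relationship among the metrics reported in Table 3A can be checked with a short sketch. It assumes equal numbers of positive and negative records, as would follow from the oversampled training set; the function name and its arguments are illustrative and are not part of the study's workflow. Recall, precision, F-measure, and accuracy are derived from the reported FPR and FNR of the Weighted KNN model.

```python
def metrics_from_rates(fpr, fnr, n_pos=1.0, n_neg=1.0):
    """Derive recall, precision, F-measure, and accuracy from error rates.

    n_pos and n_neg are the class sizes; equal values model a balanced set
    (an assumption made here for illustration).
    """
    tp = (1.0 - fnr) * n_pos  # true positives
    fn = fnr * n_pos          # false negatives
    fp = fpr * n_neg          # false positives
    tn = (1.0 - fpr) * n_neg  # true negatives

    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2.0 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (n_pos + n_neg)
    return {
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Weighted KNN row of Table 3A: FPR = 0.25, FNR = 0.05.
weighted_knn = metrics_from_rates(fpr=0.25, fnr=0.05)
for name, value in weighted_knn.items():
    print(f"{name}: {value:.2f}")
```

Under the balanced-class assumption, the derived recall (0.95), precision (0.79), and F-measure (0.86) match the tabulated values. The derived accuracy (0.85) is close to the reported 85.2%, and the small difference is plausibly due to rounding of the reported rates.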

Table 3B. Salient CMB risk classification models with associated performance metrics, including ROC Curve Analysis and Confusion Matrix outcomes, and model evaluation results. Features leveraged were selected using the Recursive Feature Elimination algorithm.

*Model Training*
Model	Accuracy (%)	AUC	FPR	FNR	Recall	Precision	F-Measure
Coarse Tree	79.8	0.84	0.24	0.16	0.84	0.78	0.81
Quadratic Discriminant	81.7	0.87	0.19	0.18	0.82	0.81	0.82
Logistic Regression	79.8	0.86	0.21	0.19	0.81	0.79	0.80
Kernel Naïve Bayes	82.9	0.85	0.21	0.14	0.86	0.80	0.83
Linear SVM	80.2	0.86	0.23	0.17	0.83	0.78	0.81
Coarse KNN	81.9	0.89	0.20	0.16	0.84	0.81	0.82

*Model Evaluation*
Model	Positive	Negative
Coarse Tree	82.5	75.0
Quadratic Discriminant	82.5	92.5
Logistic Regression	80.0	87.5
Kernel Naïve Bayes	82.5	72.5
Linear SVM	80.0	85.0
Coarse KNN	82.5	77.5

Features leveraged are number of people in household, annual household income, number of children 5 years or younger, combined handgrip strength, and waist circumference. Abbreviations: Receiver Operating Characteristics, ROC; Area Under Curve, AUC; False Positive Rate, FPR; False Negative Rate, FNR; Support Vector Machine, SVM; K-Nearest Neighbor, KNN.

Table 3C. Salient CMB risk classification models with associated performance metrics, including ROC Curve Analysis and Confusion Matrix outcomes, and model evaluation results. Features leveraged were selected using the Decision Trees algorithm.

*Model Training*
Model	Accuracy (%)	AUC	FPR	FNR	Recall	Precision	F-Measure
Medium Tree	79.4	0.83	0.23	0.18	0.82	0.78	0.80
Quadratic Discriminant	79.4	0.86	0.21	0.20	0.80	0.79	0.80
Logistic Regression	78.6	0.86	0.22	0.21	0.79	0.78	0.79
Gaussian Naïve Bayes	78.4	0.82	0.23	0.20	0.80	0.78	0.79
Quadratic SVM	80.2	0.87	0.25	0.15	0.85	0.77	0.81
Fine KNN	87.7	0.88	0.21	0.04	0.96	0.82	0.89

*Model Evaluation*
Model	Positive	Negative
Medium Tree	82.5	80.0
Quadratic Discriminant	72.5	82.5
Logistic Regression	72.5	80.0
Gaussian Naïve Bayes	77.5	72.5
Quadratic SVM	82.5	80.0
Fine KNN	80.0	82.5

Features leveraged are ratio of family income to poverty, annual household income, combined handgrip strength, height, and waist circumference. Abbreviations: Receiver Operating Characteristics, ROC; Area Under Curve, AUC; False Positive Rate, FPR; False Negative Rate, FNR; Support Vector Machine, SVM; K-Nearest Neighbor, KNN. Subspace Discriminant are Ensemble models.
