Comparison of Statistical Methods for Estimating Continuous Paediatric Reference Intervals: A Simulation Study
Background: Reference intervals (RIs), which are used as an assessment tool in laboratory medicine, change with age for most biomarkers in children. Addressing this, RIs that vary continuously with age have been developed using a range of curve-fitting approaches. The choice of statistical method may be important as different methods may produce substantially different RIs. Hence, we developed a simulation study to investigate the performance of statistical methods for estimating continuous paediatric RIs.
Methods: We compared four methods for estimating age-varying RIs. These were Cole’s LMS, the Generalised Additive Model for Location Scale and Shape (GAMLSS), Royston’s method based on fractional polynomials and exponential transformation, and a new method applying quantile regression using power variables in age selected by fractional polynomial regression for the mean. Data were generated using hypothetical true curves based on five biomarkers with varying complexity of association with age, i.e. linear or nonlinear, constant or nonconstant variation across age, and for four sample sizes (100, 200, 400 and 1000). Root mean square error (RMSE) was used as the primary performance measure for comparison.
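The power-selection step mentioned for the new method can be illustrated with a minimal sketch. It assumes a first-degree fractional polynomial chosen from the conventional power set by smallest residual sum of squares for the mean; the function names and the simulated data are hypothetical, and the quantile regression step itself is not shown. The authors' actual implementation is the Stata code in the supplementary document.

```python
import math
import random

# Conventional fractional polynomial power set; power 0 denotes ln(age).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_transform(age, power):
    """Return the first-degree fractional polynomial term (age must be > 0)."""
    return math.log(age) if power == 0 else age ** power


def rss_for_power(ages, values, power):
    """Residual sum of squares of a least-squares line in the transformed age."""
    x = [fp_transform(a, power) for a in ages]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, values))


def select_fp1_power(ages, values):
    """Pick the power whose fitted mean curve has the smallest RSS."""
    return min(FP_POWERS, key=lambda p: rss_for_power(ages, values, p))


if __name__ == "__main__":
    # Hypothetical biomarker whose mean follows the square root of age.
    rng = random.Random(42)
    ages = [rng.uniform(1, 18) for _ in range(300)]
    values = [5 + 3 * math.sqrt(a) + rng.gauss(0, 0.05) for a in ages]
    print("Selected power:", select_fp1_power(ages, values))
```

In a full analysis, the selected power term would then serve as the covariate in quantile regressions for the lower and upper reference limits.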
Results: Regression-based parametric methods performed better in most scenarios. Royston’s method and the new method performed consistently well in all scenarios for sample sizes of at least 400, while the new method had the smallest average RMSE in scenarios with nonconstant variation across age.
Conclusions: We recommend methods based on flexible parametric models for estimating continuous paediatric RIs, irrespective of the complexity of the association between biomarkers and age, provided the sample size is at least 400.
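The RMSE comparison across sample sizes can be sketched as follows. This is a minimal illustration, not the study's simulation: the true curve (mean and SD both linear in age), the normal errors, and the simple estimator (a least-squares line for the mean and a line through scaled absolute residuals for the SD) are all assumptions made for the example, and none of the four compared methods is reproduced here.

```python
import math
import random


def true_limits(age):
    """Hypothetical true 2.5th and 97.5th centiles (assumed curve)."""
    mean = 10.0 + 2.0 * age
    sd = 1.0 + 0.1 * age  # nonconstant variation across age
    return mean - 1.96 * sd, mean + 1.96 * sd


def ols(x, y):
    """Least-squares line; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_limits(ages, values):
    """Fit mean and SD lines; return a function giving estimated limits."""
    m0, m1 = ols(ages, values)
    # For normal errors E|residual| = SD * sqrt(2/pi), so rescale by sqrt(pi/2).
    scaled = [abs(v - (m0 + m1 * a)) * math.sqrt(math.pi / 2)
              for a, v in zip(ages, values)]
    s0, s1 = ols(ages, scaled)

    def limits(age):
        mean, sd = m0 + m1 * age, s0 + s1 * age
        return mean - 1.96 * sd, mean + 1.96 * sd

    return limits


def simulate_rmse(n, n_sims, eval_ages, seed=0):
    """RMSE of the lower and upper limits at each evaluation age."""
    rng = random.Random(seed)
    sq_err = {age: [0.0, 0.0] for age in eval_ages}
    for _ in range(n_sims):
        ages = [rng.uniform(0, 18) for _ in range(n)]
        values = [rng.gauss(10.0 + 2.0 * a, 1.0 + 0.1 * a) for a in ages]
        limits = fit_limits(ages, values)
        for age in eval_ages:
            lo_hat, hi_hat = limits(age)
            lo, hi = true_limits(age)
            sq_err[age][0] += (lo_hat - lo) ** 2
            sq_err[age][1] += (hi_hat - hi) ** 2
    return {age: (math.sqrt(e[0] / n_sims), math.sqrt(e[1] / n_sims))
            for age, e in sq_err.items()}


if __name__ == "__main__":
    for n in (100, 200, 400, 1000):
        print(n, simulate_rmse(n, 200, [1, 9, 17], seed=1))
```

Running the loop over the four sample sizes shows how the RMSE at each age shrinks as the sample grows, which is the kind of comparison the abstract summarises.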
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Due to technical limitations, full-text HTML conversion of this manuscript could not be completed. However, the manuscript can be downloaded and accessed as a PDF.
This is a list of supplementary files associated with this preprint.
Supplementary document 1: Stata and R code for all four statistical methods applied in estimating age-specific paediatric RIs
1 A: Stata code for Cole’s LMS method
1 B: Stata code for Royston’s method
1 C: Stata code for Hoq et al.’s method
1 D: R code for GAMLSS
Supplementary Figures 1 to 5: Average estimated and true RIs by age, sex and method for five scenarios
Figure S-1: Average estimated and true RIs by age, sex, sample size and method for scenario 1
Figure S-2: Average estimated and true RIs by age, sex, sample size and method for scenario 2
Figure S-3: Average estimated and true RIs by age, sex, sample size and method for scenario 3
Figure S-4: Average estimated and true RIs by age, sex, sample size and method for scenario 4
Figure S-5: Average estimated and true RIs by age, sex, sample size and method for scenario 5
Supplementary Figures 6 to 8: Root mean square error by age, sex and method for three scenarios
Figure S-6: Root mean square error by age, sex, sample size and method for scenario 3
Figure S-7: Root mean square error by age, sex, sample size and method for scenario 4
Figure S-8: Root mean square error by age, sex, sample size and method for scenario 5
Supplementary Tables 1 and 2
Table S-1: Average coverage across integer age and sex for Hoq et al.’s and Royston’s methods, for five scenarios and four sample sizes
Table S-2: Average 95% confidence interval width for lower and upper reference limits for Hoq et al.’s and Royston’s methods across integer age and sex, for five scenarios and four sample sizes
Posted 16 Jan, 2021