Study Population
We performed a cross-sectional study based on the electronic medical record system (EMRS) of the Children’s Hospital, Fudan University, Shanghai, China. The EMRS systematically collected information on patients’ demographics, medical history, results of physical examinations and laboratory tests, radiology images, diagnoses, and treatments at each hospital visit. The study population was extracted from the database, which included all patients who came to the hospital from January 2010 to August 2016. We included patients according to the following inclusion criteria: (1) girls with a diagnosis of precocious puberty; (2) age of 8 years or less at the first visit to the hospital; (3) hormone profile (including GnRHa stimulation test) and pelvic ultrasonography were performed in the Children’s Hospital, Fudan University; (4) pelvic ultrasonography was performed within one week of the GnRHa stimulation test. Patients with secondary precocious puberty, central nervous system lesions (congenital or acquired), or ovarian cysts were excluded because of the possible effects of these conditions on the HPGA.
This study was approved by the Ethics Committees of the Children’s Hospital, Fudan University, Shanghai, China.
Gold standard
The GnRHa stimulation test was used as the gold standard for the diagnosis of CPP [1, 7]. Patients with a stimulated peak LH ≥ 5 IU/L and a peak LH-to-FSH ratio ≥ 0.6 were diagnosed with CPP [1, 8–10, 17]. Details of the GnRHa stimulation test have been published elsewhere [18]. LH and FSH concentrations were measured using an electrochemiluminescence assay (cobas e 602, Roche, Switzerland). The limit of detection (LOD) for both LH and FSH was 0.2 IU/L. Stimulated LH, basal FSH, and stimulated FSH concentrations were above the LOD in all participants. The basal LH level was below the LOD in 1.8% (11/627) of patients [2.5% (8/314) and 1.0% (3/313) in the training and validation samples, respectively].
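The diagnostic rule above can be written as a short check. The following Python sketch is illustrative only: the function name and arguments are hypothetical, and it assumes that the peak LH-to-FSH ratio is computed as stimulated peak LH divided by stimulated peak FSH, which the text does not state explicitly.

```python
def is_cpp(peak_lh: float, peak_fsh: float,
           lh_cutoff: float = 5.0, ratio_cutoff: float = 0.6) -> bool:
    """Apply the gold-standard rule: peak LH >= 5 IU/L and peak LH/FSH >= 0.6.

    Assumption: both arguments are stimulated peak concentrations in IU/L,
    and the ratio is peak LH divided by peak FSH.
    """
    if peak_fsh <= 0:
        raise ValueError("peak FSH must be positive to compute the ratio")
    return peak_lh >= lh_cutoff and (peak_lh / peak_fsh) >= ratio_cutoff


# Hypothetical values, not taken from the study data.
example_positive = is_cpp(peak_lh=6.0, peak_fsh=8.0)   # ratio 0.75
example_negative = is_cpp(peak_lh=6.0, peak_fsh=12.0)  # ratio 0.50
```

Both conditions must hold, so a patient with a high peak LH but a low LH-to-FSH ratio is not classified as CPP under this rule.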
Pelvic Ultrasound Evaluation
Transabdominal ultrasonography was performed using a curvilinear 2–7 MHz probe. All pelvic ultrasonograms were obtained with Philips iU22 ultrasound units equipped with duplex/color-flow Doppler broad-bandwidth transducers (Philips, Netherlands). The pediatric radiologist had no information on the results of the GnRHa stimulation test. Ovarian volume for each side was calculated using the ellipsoid volume formula: 0.5233 × length × depth × breadth. Average ovarian volume was calculated as (right ovarian volume + left ovarian volume)/2. The largest and smallest ovarian volumes were defined as the larger and the smaller of the right and left ovarian volumes, respectively. Uterine volume was calculated using the same ellipsoid volume formula. The values of sonographic characteristics were stratified into categories (ovarian volume: <1 mL, 1-<2 mL, and ≥ 2 mL; uterine length: <3 cm, 3-<4 cm, and ≥ 4 cm; uterine volume: <3 mL, 3-<4 mL, and ≥ 4 mL; uterine configuration with the thickness of endometrial stripe: <0.2 cm and ≥ 0.2 cm) [16].
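As an illustration of the volume calculations and the stratification described above, a minimal Python sketch follows. The function names and example dimensions are hypothetical, and the sketch assumes that all dimensions are recorded in centimeters, so that the resulting volume is in mL.

```python
from bisect import bisect_right

ELLIPSOID_COEFF = 0.5233  # coefficient used in the text

def ellipsoid_volume(length_cm: float, depth_cm: float, breadth_cm: float) -> float:
    """Volume in mL, assuming all three dimensions are in cm."""
    return ELLIPSOID_COEFF * length_cm * depth_cm * breadth_cm

def ovarian_summary(right_ml: float, left_ml: float) -> dict:
    """Average, largest, and smallest ovarian volume for one patient."""
    return {
        "average": (right_ml + left_ml) / 2,
        "largest": max(right_ml, left_ml),
        "smallest": min(right_ml, left_ml),
    }

def categorize(value: float, cut_points: list, labels: list) -> str:
    """Assign a value to a category; each cut point is the inclusive lower
    bound of the next category (e.g. 1.0 falls into '1-<2 mL')."""
    return labels[bisect_right(cut_points, value)]

OVARIAN_LABELS = ["<1 mL", "1-<2 mL", ">=2 mL"]

# Hypothetical measurements, not taken from the study data.
right = ellipsoid_volume(2.0, 1.0, 1.0)   # 1.0466 mL
left = ellipsoid_volume(1.5, 1.0, 1.0)    # 0.78495 mL
summary = ovarian_summary(right, left)
category = categorize(summary["largest"], [1.0, 2.0], OVARIAN_LABELS)
```

The same `categorize` helper could be applied to uterine length, uterine volume, and endometrial stripe thickness by passing the corresponding cut points and labels.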
Medical history, physical examination and bone age
A complete medical history and the results of the physical examination were abstracted from the database. Breast and pubic hair development was assessed according to the Tanner staging criteria [1]. Bone age (BA) was assessed using the Greulich-Pyle (GP) method [19].
Statistical Analysis
A random sample of one half of the patients was drawn to develop a clinical prediction model (training sample), leaving the other half of the patients for validation (validation sample). We first compared the clinical characteristics and pelvic ultrasonography findings between the training and validation samples, using the t test or Wilcoxon rank sum test for quantitative variables and the χ2 test for qualitative variables, as appropriate. We then built crude logistic regression models to evaluate the association between each potential predictor and CPP. A total of 30 variables containing information on medical history, progression of pubertal manifestations, basal hormone levels, and pelvic ultrasonography were selected a priori according to previous studies (see Additional file 1 [Additional Table 1]) [1, 7, 16]. Predictors with a P value of less than 0.20 were entered into a multivariable logistic regression model. The prediction model was selected using forward stepwise analysis (P < 0.05 for entry, P > 0.10 for removal). Performance of the selected model was assessed using the C-index for discrimination and the Hosmer-Lemeshow test for calibration [20]. We performed internal validation using bootstrap resampling [21].
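To make two of these steps concrete, the following Python sketch shows a random half split and a pairwise computation of the C-index for a binary outcome. It is not the SAS code used in the study; the function names, the seed, and the example probabilities are hypothetical, and the convention that the training sample receives the extra patient when the total is odd is an assumption chosen to match the reported sample sizes (314 and 313).

```python
import random

def split_half(patient_ids, seed=2016):
    """Randomly split IDs into training and validation halves.

    Assumption: with an odd total, the training half gets the extra patient.
    The seed is arbitrary and only makes the split reproducible.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = (len(ids) + 1) // 2
    return ids[:n_train], ids[n_train:]

def c_index(predicted, observed):
    """C-index for a binary outcome: the proportion of (case, non-case)
    pairs in which the case has the higher predicted probability,
    counting ties as one half."""
    cases = [p for p, y in zip(predicted, observed) if y == 1]
    controls = [p for p, y in zip(predicted, observed) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    score = 0.0
    for pc in cases:
        for pn in controls:
            if pc > pn:
                score += 1.0
            elif pc == pn:
                score += 0.5
    return score / (len(cases) * len(controls))

training_ids, validation_ids = split_half(range(627))

# Hypothetical predicted probabilities and outcomes.
perfect = c_index([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
chance = c_index([0.9, 0.2, 0.8, 0.3], [1, 1, 0, 0])
```

For a binary outcome the C-index equals the area under the receiver operating characteristic curve, so a value of 0.5 indicates no discrimination and 1.0 indicates perfect discrimination.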
A risk score model based on the final logistic regression model was derived using the method proposed by Sullivan et al [22]. In the risk score system, the estimated risks calculated from point totals approximated the predictions of the logistic regression model. The statistical methods are described in more detail in Additional file 2. The performance of the risk score was measured using the C-index, calibration, sensitivity, specificity, positive likelihood ratio (LR+), and negative likelihood ratio (LR-) [20]. High- and low-risk cut points for the CPP risk score were determined by consensus of a team of two experienced pediatric endocrinologists, two pediatric radiologists, and an epidemiologist. Validation was performed in the other half of the patients, and the performance of the CPP risk score model in the validation sample was measured in the same way [20].
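The accuracy measures listed above follow directly from a 2 × 2 table of risk score classification against the gold standard. The Python sketch below shows the standard definitions; the function name and the counts are hypothetical and do not come from the study data.

```python
import math

def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, LR+ and LR- from a 2 x 2 table.

    tp/fn: patients with CPP classified as positive/negative by the score;
    fp/tn: patients without CPP classified as positive/negative.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and without CPP")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # LR+ = sensitivity / (1 - specificity); infinite when no false positives.
    lr_pos = sensitivity / (1 - specificity) if fp > 0 else math.inf
    # LR- = (1 - sensitivity) / specificity; infinite when no true negatives.
    lr_neg = (1 - sensitivity) / specificity if tn > 0 else math.inf
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }

# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=80, fp=10, fn=20, tn=90)
```

When separate high- and low-risk cut points are used, the same function can be applied once per cut point, each time dichotomizing the score at that threshold.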
Statistical analyses were performed using SAS statistical software version 9.2 (SAS Institute Inc., Cary, NC, USA).