Development and Validation of a Nomogram for the Early Prediction of IgA Nephropathy in Primary Glomerulonephritis


Background: To enable early prediction of immunoglobulin A nephropathy (IgAN) before renal biopsy, we developed and validated a new non-invasive nomogram for the early prediction of IgAN in primary glomerulonephritis (GN) in south China. Methods: A total of 431 patients were included in this study, and an additional 113 patients were included as an independent test cohort to validate our results. A stepwise regression model was used for feature selection. Multivariate logistic regression analysis with 5-fold cross-validation was used to validate the result of the stepwise selection. Performance of the logistic regression model was assessed with respect to its calibration, discrimination, and clinical usefulness, and was then evaluated in the independent test cohort. Results: We developed a model incorporating patient age and four clinical chemistry signatures, namely serum IgA, serum albumin (ALB), serum phosphorus (P) and 24-hour urinary protein (24hUpro), and presented it as a nomogram. The area under the receiver operating characteristic (ROC) curve (AUC) reached 0.89 (95% CI: 0.86–0.92) and 0.88 (95% CI: 0.85–0.92) in the training set and validation set, respectively. The model also performed well in the independent test cohort (AUC of 0.82, 95% CI: 0.75–0.90). Decision curve analysis (DCA) and the calibration plot of the model also showed good performance. Conclusions: The logistic regression model presented in this study incorporates patient age, IgA, ALB, P and 24hUpro and can be conveniently used to facilitate the individualized prediction of IgAN.


Background
Immunoglobulin A nephropathy (IgAN) is the most common type of primary glomerulonephritis (GN) globally [1,2], especially in the Asia-Pacific region [2,3]. IgAN is characterized by the predominant deposition of IgA in the mesangium of glomeruli [4]. The range of clinical manifestations of IgAN is broad, from asymptomatic microscopic hematuria to rapidly progressive GN [5]. The typical mode of presentation varies according to age group and biopsy practice patterns. Epidemiological investigation showed that IgAN accounts for ~45.3% of primary GN cases in China [6]. IgAN reduces life expectancy by more than 10 years and leads to kidney failure in 20-40% of patients within 20 years after the diagnostic renal biopsy [7]. Early diagnosis and intervention are therefore vital in IgAN.
To date, the diagnosis of IgAN has relied on demonstrating IgA as the dominant or codominant immunoglobulin in the glomerular mesangium by renal biopsy [8]. However, biopsy registry data underestimate disease burden, as patients with mild disease may not undergo renal biopsy, and in countries lacking screening programs disease may not be detected [5]. In addition, renal biopsy is invasive and cannot be repeated frequently to evaluate therapeutic effect. Therefore, it is necessary to develop a novel, non-invasive diagnostic model for IgAN.
As a statistical tool, a nomogram can help address these problems by combining several predictors into an individualized probability estimate. A large number of studies have shown that a nomogram can predict the prognosis of some kidney diseases [9-11]. Mao Yonghui et al. [9] proposed a predictive model for the progression of patients with primary membranous nephropathy and nephrotic syndrome based on age, sPLA2R-Ab, proteinuria and Uα1m/Cr. Another study from China developed and assessed a predictive nomogram for the progression of IgA nephropathy [12]. However, there are few reports on diagnostic models that identify IgAN among primary GN, and non-invasive diagnostic models are scarcer still. Therefore, the purpose of this study is to develop and validate an accurate, valid and simple non-invasive nomogram model to predict IgAN among primary GN and to help clinicians make clinical decisions.

Study Design and Patients
This is a retrospective monocentric analysis of patients with biopsy-proven primary GN at Peking University Shenzhen Hospital between January 2019 and May 2021. Study participants were split into two independent sets. The training cohort comprised 431 patients with primary GN enrolled at Peking University Shenzhen Hospital from January 2019 to December 2020. A further 113 patients, enrolled from January 2021 to May 2021, formed the independent test cohort used to validate our results. All study participants provided their written informed consent.
The inclusion criteria were as follows: biopsy-proven primary GN; age 18 years or older; and eGFR ≥ 30 mL/min/1.73 m² at the time of renal biopsy. We excluded patients with clinical or serologic evidence of systemic lupus erythematosus, antineutrophil cytoplasmic antibody-associated vasculitis, Henoch-Schönlein purpura, or liver cirrhosis, and those with urinary tract infection or obstruction at the time of biopsy.
Estimated glomerular filtration rate (eGFR, in mL/min/1.73 m²) was calculated by the equation put forward by the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) and was categorized according to the KDIGO 2012 Clinical Practice Guideline [13].
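The text does not state which CKD-EPI equation was applied. As a minimal sketch, the code below assumes the 2009 CKD-EPI creatinine equation with serum creatinine in mg/dL, together with the KDIGO 2012 GFR categories; the function names and argument names are illustrative and not taken from the study.

```python
def ckd_epi_2009(scr_mg_dl: float, age_years: float, female: bool,
                 black: bool = False) -> float:
    """eGFR in mL/min/1.73 m^2 from the 2009 CKD-EPI creatinine equation.

    Assumption: the study may have used a different CKD-EPI variant.
    """
    kappa = 0.7 if female else 0.9          # sex-specific creatinine knot
    alpha = -0.329 if female else -0.411    # exponent below the knot
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


def kdigo_gfr_category(egfr: float) -> str:
    """Map an eGFR value to the KDIGO 2012 GFR category."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def meets_egfr_criterion(egfr: float) -> bool:
    """Inclusion criterion from the text: eGFR >= 30 mL/min/1.73 m^2."""
    return egfr >= 30.0
```

Under this sketch, the inclusion threshold of 30 mL/min/1.73 m² corresponds to categories G1 through G3b.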
Glomerular membranous nephropathy lesions were classified into four stages (I, II, III and IV) based on Ehrenreich and Churg's criteria. IgAN histologic lesions were graded according to the MEST-C score [14].
Characteristics of patients and clinical and laboratory data are presented as frequency (percentage), mean (SD) for normally distributed variables and median (IQR) for non-normally distributed variables. Stepwise regression analysis using the Akaike information criterion (AIC) was used to identify candidates to be incorporated into the multivariable logistic regression model. The odds ratio (OR), 95% confidence interval (CI) and p value were calculated for each candidate predictor. Selected candidates were incorporated into the multivariate logistic regression model, which was presented as a nomogram to visualize the probability of IgAN diagnosis, using the Deepwise & Beckman Coulter DxAI platform in the training cohort.
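To illustrate how a logistic regression model can be drawn as a nomogram, the sketch below uses one common scaling convention: the predictor with the largest possible effect spans 0 to 100 points, and every other predictor is scaled relative to it. All coefficients, the intercept, the axis ranges and the units in the code are invented for illustration (including their signs) and are not the values fitted in this study; the scaling used by the DxAI platform may differ.

```python
import math

# HYPOTHETICAL values for illustration only: (coefficient, axis min, axis max).
PREDICTORS = {
    "age":  (-0.04, 18.0, 80.0),
    "iga":  (0.90, 0.5, 6.0),
    "alb":  (0.08, 15.0, 50.0),
    "p":    (-1.50, 0.6, 1.8),
    "upro": (-0.20, 0.0, 10.0),
}
INTERCEPT = -2.0  # hypothetical


def _reference(beta: float, lo: float, hi: float) -> float:
    """Axis end that contributes 0 points (lowest contribution to the logit)."""
    return lo if beta >= 0 else hi


def max_effect() -> float:
    """Largest possible logit contribution of any single predictor."""
    return max(abs(b) * (hi - lo) for b, lo, hi in PREDICTORS.values())


def points(name: str, value: float) -> float:
    """Nomogram points for one predictor value (0 to 100 within its axis)."""
    beta, lo, hi = PREDICTORS[name]
    return 100.0 * beta * (value - _reference(beta, lo, hi)) / max_effect()


def total_points(patient: dict) -> float:
    return sum(points(name, patient[name]) for name in PREDICTORS)


def probability_from_points(total: float) -> float:
    """Convert total points back to a predicted probability."""
    baseline = INTERCEPT + sum(
        b * _reference(b, lo, hi) for b, lo, hi in PREDICTORS.values()
    )
    logit = baseline + total * max_effect() / 100.0
    return 1.0 / (1.0 + math.exp(-logit))


def probability_direct(patient: dict) -> float:
    """Same probability computed straight from the logistic formula."""
    logit = INTERCEPT + sum(b * patient[n] for n, (b, _, _) in PREDICTORS.items())
    return 1.0 / (1.0 + math.exp(-logit))
```

Reading the total points off the nomogram and evaluating the logistic formula directly give the same probability, which is why the nomogram can serve as a paper-based calculator for the fitted model.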

Statistical Analysis
We used both internal validation and an independent test cohort to evaluate the performance of the logistic regression model. Five-fold cross-validation was used for internal validation of the model in the training set. A calibration plot was used to assess the consistency between the predicted and actual probabilities [15]. To quantify the discrimination performance of the model, the area under the ROC curve (AUC) was calculated. The performance of the model was also tested in the independent test cohort: the logistic regression formula derived in the training set was applied to patients in the independent test set to calculate the total points for each patient, and logistic regression analysis was then performed using the total points as a factor. Decision curve analysis (DCA), a method for evaluating the benefits of a diagnostic test across a range of patient preferences for accepting the risk of undertreatment and overtreatment to facilitate decisions about test selection and use [16], was used to assess clinical usefulness.
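The discrimination and internal-validation steps described above can be sketched in a few lines. The code below computes the AUC as the probability that a randomly chosen IgAN case receives a higher score than a randomly chosen non-IgAN case (ties count one half), and builds five train/validation index splits. The function names are illustrative, and the plain shuffled (unstratified) split is an assumption, since the text does not say how the folds were formed.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (Mann-Whitney U)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # case ranked above control
            elif p == n:
                wins += 0.5      # tie counts one half
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train_indices, validation_indices) pairs covering all samples."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        train = []
        for j in range(k):
            if j != i:
                train.extend(folds[j])
        splits.append((train, folds[i]))
    return splits
```

In a cross-validated workflow, the model would be refitted on each training split and `auc` evaluated on the held-out fold, so that every patient in the training cohort is scored once by a model that did not see them.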

A total of 544 patients with biopsy-proven primary GN were enrolled in this study (shown in Fig. 1). Median age at biopsy was 37 (IQR 30–47) years. The male/female ratio was 1.125:1 (288/256). The median eGFR was 78.08 (IQR 56.12–103.23) mL/min/1.73 m², and the median 24hUpro was 1.62 (IQR 0.74–3.72) g/24 h.
The study population consisted of two independent datasets, a training set (431 patients) and an independent test set (113 patients). The clinical characteristics were similar in the 2 sets (shown in Table 1). The percentage of IgAN was 56.4% (243/431) and 54.0% (61/113) in the training and independent test sets, respectively. Patient characteristics of the IgAN and non-IgAN groups in the 2 sets are shown in Table 1.
Table 2 shows the regression coefficients and risk scores for each variable. The potential features were age, IgA, ALB, P and 24hUpro. A model including these 5 features was developed and presented as a nomogram (shown in Fig. 2). Each variable in the nomogram is assigned a specific score, and the total score is used to obtain the probability of IgAN among patients with primary GN.

Apparent Performance of the model
The AUC was 0.89 (95% CI: 0.86–0.92) in the training set, 0.88 (95% CI: 0.85–0.92) in the validation set and 0.82 (95% CI: 0.75–0.90) in the independent test set (shown in Fig. 3a). The model also had a sensitivity (recall) of 0.74, a specificity of 0.79, an accuracy of 0.76 and a precision of 0.80 in the independent test. Furthermore, in the training set, validation set and independent test set, the calibration plot indicated consistency between the observed results and the predicted probability of IgAN (shown in Fig. 3b).
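The sensitivity, specificity, accuracy and precision quoted for the independent test all derive from a single 2×2 confusion matrix. The study does not report the underlying counts; the counts in the sketch below are back-calculated assumptions, chosen only because they match the reported 61 IgAN and 52 non-IgAN patients and reproduce the rounded metrics. They should not be read as the study's actual confusion matrix.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # recall: true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "precision": tp / (tp + fp),              # positive predictive value
    }


# ASSUMED counts (not reported in the study): 61 IgAN and 52 non-IgAN patients.
ASSUMED_COUNTS = {"tp": 45, "fn": 16, "tn": 41, "fp": 11}

metrics = classification_metrics(**ASSUMED_COUNTS)
rounded = {name: round(value, 2) for name, value in metrics.items()}
```

With these assumed counts, `rounded` reproduces the four reported values, which shows that the reported metrics are mutually consistent for a test set of 113 patients.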

Decision curve analysis
The result of DCA for the nomogram is presented in Fig. 4. It indicated good clinical usefulness of the nomogram across a broad range of threshold probabilities, suggesting that the model is a good assessment tool.
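For readers unfamiliar with DCA, the quantity plotted on a decision curve is the net benefit at each threshold probability, compared against the default strategies of treating every patient as positive or no patient as positive. The sketch below uses the standard net benefit formula; the function names and the toy inputs in the usage note are illustrative and not taken from the study.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of acting on predictions with probability >= threshold.

    NB = TP/n - FP/n * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit when every patient is classified as positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


def decision_curve(probs, labels, thresholds):
    """(threshold, model NB, treat-all NB) triples; treat-none NB is always 0."""
    return [
        (t, net_benefit(probs, labels, t), net_benefit_treat_all(labels, t))
        for t in thresholds
    ]
```

For example, `decision_curve(probs, labels, [0.1, 0.2, 0.3])` returns one triple per threshold; a model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero.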

Discussion
Nomograms are increasingly used by clinicians to estimate and predict disease risk for individual patients; with easy-to-use digital interfaces, they improve diagnostic and predictive efficiency by incorporating multiple independent predictors. In this study, we developed and validated a non-invasive diagnostic prediction model that integrates 5 independent features (IgA, ALB, P, 24hUpro and age) for the prediction of IgAN in primary GN. This model displayed excellent discrimination, with an AUC of 0.89 in the training set. In addition, the model was validated by 5-fold cross-validation with an AUC of 0.88. The independent test also confirmed good discrimination, with an AUC of 0.82. The calibration plot indicated good consistency between the actual and predicted diagnoses.
Currently, renal biopsy remains the gold standard for the clinical diagnosis and grading of IgA nephropathy [17]. However, renal biopsy is invasive and is poorly suited to patients with mild disease, and in countries and districts lacking screening programs disease may go undetected [5]. Therefore, the study of non-invasive clinical features related to the diagnosis of IgAN and the development of non-invasive diagnostic models have important clinical significance for the early diagnosis of IgAN. Given the lack of non-invasive methods to predict IgAN, we developed a model based on non-invasive clinical features and used a nomogram to calculate the risk for each individual patient.
As early as 1995, the joint committee of the special study group on progressive glomerular diseases, the Ministry of Health and Welfare of Japan, and the Japanese Society of Nephrology reported serum IgA of more than 350 mg/dL in adults as one of the diagnostic criteria for IgA nephropathy [18]. Since then, several researchers have reported serum IgA as a diagnostic and prognostic marker of IgAN [19,20]. Recently, studies have found that the serum IgA/C3 ratio has better diagnostic and prognostic value for IgAN [8,21,22]. At the beginning of model building, we also considered including the serum IgA/C3 ratio in the nomogram model, but it did not improve the diagnostic value of the model. Therefore, serum IgA rather than the IgA/C3 ratio was included in this nomogram.
Galactose-deficient IgA1 (Gd-IgA1) is a critical molecule in the pathogenesis of IgAN [23]. Galactose deficiency of O-linked glycans in the hinge region of IgA1 is the beginning of a sequence of events that may lead to renal injury. The formation of Gd-IgA1 is the pivotal step in the multi-hit pathogenesis of IgAN [24]. Meanwhile, Gd-IgA1 has been suggested as a potential disease-specific biomarker that predicts disease activity and prognosis [25,26]. However, owing to the limitations of testing methods, Gd-IgA1 has not been used as a routine test in clinical practice. Therefore, Gd-IgA1 was not considered for inclusion in this nomogram model.
In this study, serum IgG levels were significantly increased in IgAN patients compared with non-IgAN patients. Serum IgG concentration at baseline is a predictive marker for the prognosis of IgAN. Serum IgG level is an independent risk factor for poor outcomes in IgAN at the time of renal biopsy. Every 1 g/L decrease in serum IgG level was associated with a 1.74-fold increased risk of the incidence of composite renal outcomes [27]. Furthermore, IgG deposits in the mesangium and capillary loops predict adverse renal outcome in patients with IgAN [28]. However, serum IgG level did not distinguish IgAN well among primary GN and did not improve the efficiency of the diagnostic model.
A strength of our study is that, by including only a few features in a non-invasive model, we obtained good predictive power, thereby minimizing the tests needed for patients; the model could therefore easily be adopted in clinical practice. In addition, we used two independent sets of eligible patients, a training set and an independent test set, to evaluate the multivariate logistic regression model both internally and externally. Moreover, the predictive power of the novel model remained high (AUC of 0.82) in the independent test cohort, suggesting its high diagnostic value.
However, this study had several limitations. First, the model was developed and validated in a monocentric retrospective cohort of primary GN patients. Therefore, a multicenter retrospective cohort study is needed to verify the generalizability of the results to other primary GN patients. Second, the retrospective study design could not exclude confounding effects. Further prospective cohort studies are needed to increase generalizability and to deal with confounding effects.

Declarations
Ethics approval and consent to participate
The protocol for the in vitro use of human blood samples and patient data was approved by the Institutional Biosafety Committees at Peking University Shenzhen Hospital. All studies were performed in accordance with the Declaration of Helsinki and complied with the guidelines of the Office for Human Research Protections. All patients provided written informed consent before initiation of the study.

Consent for publication
Not applicable.

Availability of data and materials
The datasets generated and analysed during the current study are not publicly available due to privacy of patients but are available from the corresponding author on reasonable request.

Competing interests
We declare that there are no competing interests related to this study.

Funding
Not applicable.