Clinical and pathological characteristics of the population
A total of 847 patients with IgAN were enrolled in this study (Table 1). The average age was 33 years (range, 13 to 73 years), and males accounted for 43%. The average estimated glomerular filtration rate (eGFR) was 95.59 ml/min. Because the eGFR of healthy people is about 125 ml/min, the average eGFR was within the normal range; however, the variability was large (95.83, Table 1), indicating that eGFR fluctuated widely among the patients. In total, 96 patients were below 50 ml/min and 33 patients were above 150 ml/min. The average urinary protein was 1.58 g/24 h and also varied widely, from a minimum of 0.0028 g/24 h to a maximum of 16.331 g/24 h. We divided tonsil abnormalities into five grades (0, 1, 2, 3, 4), with an average score of 0.5. The average serum albumin content was 35.47 g/L (normal value, 40–55 g/L), and the average blood urea nitrogen content was 9.20 mmol/L.
Normal values of serum creatinine differ between hospitals. Generally, the normal range of serum creatinine is 44–133 µmol/L; a value above 133 µmol/L indicates that kidney damage has occurred, with renal insufficiency or renal failure. In our data, the average serum creatinine was 101.11 µmol/L, with a minimum of 3.35 µmol/L and a maximum of 1333.0 µmol/L. Therefore, the average renal function of the patients was normal, but some patients had conditions such as uremia. The mean and variance of the other factors are shown in Table 1.
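As an illustration of how the per-indicator summaries could be produced, the following minimal sketch computes the effective sample size, the mean, and the dispersion of one indicator while skipping missing measurements. It assumes that the "±" values are sample standard deviations, which the text does not state explicitly; the function name `summarize` and the example readings are hypothetical and are not taken from the study data.

```python
import math
from statistics import mean, stdev


def summarize(values):
    """Return (effective N, mean, sample SD), ignoring missing entries.

    Missing measurements are represented as None or NaN (an assumption).
    """
    valid = [v for v in values if v is not None and not math.isnan(v)]
    n = len(valid)
    # The sample SD is undefined for fewer than two observations.
    sd = stdev(valid) if n > 1 else float("nan")
    return n, mean(valid), sd


# Hypothetical eGFR readings (ml/min); None marks a missing measurement.
egfr = [95.0, 120.5, None, 48.2, 151.0, 88.3]
n, avg, sd = summarize(egfr)
print(f"eGFR: N={n}, {avg:.3f} ± {sd:.3f}")
```

Counting only the valid entries is one way to explain why the effective N differs between indicators (for example, 773 for eGFR versus 847 for age).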
Table 1
Demographic, clinical, and laboratory data. Abbreviations: MAP: mean arterial pressure, eGFR: estimated glomerular filtration rate, BUN: blood urea nitrogen.
| Clinicopathological features | Effective statistics (N) | Statistical data |
|---|---|---|
| renal insufficiency (%) | 847 | 0.150 ± 0.360 |
| age (years old) | 847 | 33.156 ± 12.308 |
| body weight (kg) | 818 | 58.912 ± 11.154 |
| systolic pressure (mmHg) | 847 | 126.947 ± 20.357 |
| diastolic pressure (mmHg) | 847 | 80.170 ± 12.438 |
| gross hematuria | 847 | 0.145 ± 0.352 |
| tonsil abnormalities | 847 | 0.501 ± 0.801 |
| serum creatinine (µmol/L) | 798 | 101.110 ± 101.674 |
| eGFR | 773 | 95.590 ± 95.834 |
| BUN (mmol/L) | 796 | 9.209 ± 31.479 |
| serum uric acid (µmol/L) | 785 | 346.797 ± 113.470 |
| blood triglyceride (mmol/L) | 721 | 1.916 ± 1.787 |
| total cholesterol (mmol/L) | 721 | 5.638 ± 2.832 |
| serum IgA (g/L) | 665 | 2.562 ± 1.118 |
| serum IgM (g/L) | 664 | 1.891 ± 11.676 |
| serum C3 (g/L) | 691 | 1.021 ± 0.314 |
| serum C4 (g/L) | 673 | 0.243 ± 0.116 |
| urine protein (g/24h) | 734 | 1.586 ± 2.367 |
| number of glomeruli under light microscope | 847 | 15.949 ± 7.488 |
| spherical hardening number | 847 | 2.167 ± 2.725 |
| spheroid hardening rate (%) | 847 | 11.723 ± 16.724 |
| segment hardening number | 847 | 1.401 ± 1.668 |
| segment hardening rate (%) | 847 | 8.228 ± 11.874 |
| total crescent formation | 847 | 0.422 ± 1.052 |
| cell crescent | 847 | 0.129 ± 0.492 |
| cell fiber crescent | 847 | 0.207 ± 0.796 |
| fiber crescent | 847 | 0.086 ± 0.477 |
| glomerular adhesion | 847 | 0.194 ± 0.656 |
| mesangial cells and mesangial matrix hyperplasia | 847 | 2.375 ± 0.819 |
| basement membrane condition | 847 | 1.407 ± 2.402 |
| capillary cavity opening (1,2,3) | 847 | 1.815 ± 1.136 |
| capillary endothelial hyperplasia | 847 | 0.129 ± 0.339 |
| tubule atrophy (0,1,2,3) | 847 | 1.279 ± 0.835 |
| interstitial fibrosis (0,1,2,3) | 847 | 1.290 ± 0.830 |
| whether there is inflammatory cell infiltration in renal interstitial | 847 | 0.761 ± 0.426 |
Correlation analysis of various biochemical characteristics and IgAN
To study the correlation between biochemical characteristics and IgAN, we used a z-test to perform a bivariate analysis comparing the IgAN group with the healthy group. According to the statistical values in Table 2, the following 27 indicators differed significantly between the two groups: serum IgA, serum C3, urinary protein quantification, number of glomeruli under light microscope, number of spherical sclerosis, spherical sclerosis rate, number of segment sclerosis, total number of crescent formation, number of cell crescent, number of cell fiber crescent, number of fiber crescent, number of glomerular adhesion, mesangial cell and mesangial matrix hyperplasia, basement membrane condition, capillary cavity opening, tubule atrophy, interstitial fibrosis, renal interstitial infiltration of inflammatory cells, serum creatinine, tonsil abnormality, gross hematuria history, systolic blood pressure, age, renal insufficiency, eGFR, blood urea nitrogen, and blood uric acid. That is, the above factors are associated with IgAN. In contrast, serum IgM, serum C4, segment sclerosis rate, capillary endothelial hyperplasia, total cholesterol, body weight, gender, blood triglycerides, and diastolic blood pressure were not significantly different between the IgAN group and the healthy group.
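The comparison described above can be sketched as a two-sample z-test on group means. The text does not specify the exact variant used, so the unpooled form below is an assumption. The function names `two_sample_z` and `significance_label` are hypothetical, the healthy-group numbers are invented placeholders (the source does not report them), and the thresholds 1.96 and 2.58 follow the caption of Table 2.

```python
import math


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample z statistic for a difference in means, with a two-sided p-value."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / standard_error
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def significance_label(z):
    """Map |z| to the labels used in Table 2 ('very', 'yes', or 'no')."""
    magnitude = abs(z)
    if magnitude > 2.58:
        return "very"
    if magnitude > 1.96:
        return "yes"
    return "no"


# IgAN group: serum IgA summary from Table 1.
# Healthy group: placeholder values, NOT taken from the study.
z, p = two_sample_z(2.562, 1.118, 665, 2.300, 1.000, 300)
print(f"z = {z:.3f}, p = {p:.3f}, significance = {significance_label(z)}")
```

For example, `significance_label(1.974)` returns `"yes"`, matching the entry for glomerular adhesion in Table 2.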
Table 2
Correlation between biochemical characteristics and IgAN. A p value < 0.05 or z value > 1.96 was considered significant; a p value < 0.01 or z value > 2.58 was considered very significant. Abbreviations: eGFR: estimated glomerular filtration rate, BUN: blood urea nitrogen.
| Clinicopathological features | z | p | Significance |
|---|---|---|---|
| serum IgA (g/L) | 4.624 | 0.000 | very |
| serum C3 (g/L) | 4.399 | 0.000 | very |
| urinary protein (g/24h) | 5.763 | 0.000 | very |
| number of glomeruli under light microscope | 5.697 | 0.000 | very |
| spherical hardening number | 5.011 | 0.000 | very |
| spheroid hardening rate (%) | 6.466 | 0.000 | very |
| segment hardening number | 3.971 | 0.000 | very |
| total crescent formation | 7.945 | 0.000 | very |
| cell crescent | 4.707 | 0.000 | very |
| cell fiber crescent | 5.243 | 0.000 | very |
| fiber crescent | 4.713 | 0.000 | very |
| glomerular adhesion | 1.974 | 0.048 | yes |
| mesangial cells and mesangial matrix hyperplasia | 4.836 | 0.000 | very |
| basement membrane condition | 8.916 | 0.000 | very |
| capillary cavity opening (1,2,3) | 8.817 | 0.000 | very |
| tubule atrophy (0,1,2,3) | 6.561 | 0.000 | very |
| interstitial fibrosis (0,1,2,3) | 6.441 | 0.000 | very |
| whether there is inflammatory cell infiltration in renal interstitial | 3.225 | 0.001 | very |
| serum creatinine (µmol/L) | 13.33 | 0.000 | very |
| tonsil abnormalities | 8.032 | 0.000 | very |
| gross hematuria | 6.269 | 0.000 | very |
| systolic pressure (mmHg) | 4.578 | 0.000 | very |
| age (years old) | 7.061 | 0.000 | very |
| renal insufficiency | 20.91 | 0.000 | very |
| eGFR | 4.868 | 0.000 | very |
| BUN (mmol/L) | 2.612 | 0.008 | very |
| blood uric acid (µmol/L) | 9.276 | 0.000 | very |
Performance of the PCA model
We extracted the factors in Table 2 that were significantly related to IgAN and removed all rows containing data errors, which reduced the original 1395 records to 1042. These records were divided into IgAN and non-IgAN groups according to the results of kidney biopsy. To better diagnose IgAN, we used principal component analysis to extract the n most important principal components for constructing the judgment model in the next step. To determine the number of dimensions to retain, we plotted the variance of all components against the number of dimensions (Fig. 1). When the number of reduced dimensions is 20, the sum of the variances of all components is 90%, that is, about 10% of the information is lost. When the dimension is equal to 3, the variance of all components is close to 100%, that is, only about 2% of the information is lost, and this loss is within our acceptable range.
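The choice of the number of components can be illustrated with a short calculation on the variance contribution rates reported in this section for the three retained components (59.25%, 20.44%, 18.14%). This is a minimal sketch, not the authors' code: the helper `components_needed` and the 95% target are hypothetical choices made for illustration.

```python
from itertools import accumulate


def components_needed(ratios, target):
    """Return (k, cumulative) for the smallest k whose cumulative ratio reaches target.

    `ratios` are per-component variance contribution rates in percent,
    ordered from the largest component to the smallest.
    Returns None if the target is never reached.
    """
    for k, cumulative in enumerate(accumulate(ratios), start=1):
        if cumulative >= target:
            return k, cumulative
    return None


# Contribution rates of the three retained components, in percent.
ratios = [59.25, 20.44, 18.14]
k, cumulative = components_needed(ratios, target=95.0)
print(f"{k} components retain {cumulative:.2f}% of the variance")
```

With these rates, three components are needed to pass a 95% target, and the retained share is 97.83%, so about 2% of the information is lost.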
The variance contribution rates of the three principal components are 59.25%, 20.44%, and 18.14%, respectively. The cumulative contribution rate is 97.83% (Fig. 2), so these 3 principal components represent 97.83% of the information used to judge IgAN from biochemical indicators. According to the indicators with the largest absolute coefficient in each principal component, 3 representative indicators can be selected instead of 27. From the absolute values in Table 3, the biochemical parameters (also called variables) that determine the size of PC1 are serum creatinine, blood uric acid, and eGFR. Judging from the signs of the coefficients, higher levels of serum creatinine and blood uric acid and a lower level of eGFR could promote the occurrence of IgAN. The biochemical parameters that determine the size of the remaining two principal components are as follows: PC2, blood uric acid; PC3, eGFR. The data indicate that abnormalities in serum creatinine, blood uric acid, and eGFR typically reflect an increased risk of IgAN.
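The selection of one representative indicator per component can be sketched as picking, for each principal component, the variable with the largest absolute coefficient. The coefficients below are copied from a subset of the rows of Table 3; the helper name `dominant_variable` is hypothetical.

```python
# Coefficients (PC1, PC2, PC3) for a subset of the indicators in Table 3.
loadings = {
    "serum creatinine": (0.881, -0.365, 0.298),
    "eGFR": (-0.221, 0.240, 0.944),
    "blood uric acid": (0.415, 0.899, -0.130),
    "systolic pressure": (0.036, 0.014, -0.006),
    "spheroid hardening rate": (0.029, 0.009, -0.012),
    "BUN": (0.022, -0.010, 0.041),
}


def dominant_variable(loadings, pc_index):
    """Return the variable with the largest absolute coefficient on one component."""
    return max(loadings, key=lambda name: abs(loadings[name][pc_index]))


for pc_index in range(3):
    print(f"PC{pc_index + 1}: {dominant_variable(loadings, pc_index)}")
```

On these rows the result is serum creatinine for PC1, blood uric acid for PC2, and eGFR for PC3, which agrees with the reading of Table 3 given above.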
Logistic regression analysis
Based on the sample data of the 27 indicators mentioned above, principal component analysis (PCA) was used to reduce the dimension of the index data. The dimension-reduced data were randomly divided into a training set (80% of the data) and a test set (20% of the data). A logistic regression algorithm was used to construct a diagnosis model of kidney disease, and the results were as follows: ① The data were reduced to 3 dimensions. ② Accuracy and recall were used to evaluate the model. The results are shown in Fig. 2. A total of 209 cases were selected as the test set, including 124 cases from IgAN patients and 85 cases from the healthy control group. Of the 124 IgAN cases, 114 were correctly judged as IgAN, a recall rate of 91.93%. However, only 35 of the 85 control cases were correctly judged as non-IgAN, so the overall accuracy of the model is 71.29%. The above data show that the PCA predictive model has a good fit.
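The evaluation figures can be checked directly from the test-set counts given above (124 IgAN cases with 114 correct, 85 control cases with 35 correct). The sketch below recomputes recall and accuracy; specificity and precision are not reported in the text and are derived here from the same counts only for illustration. The function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "recall": tp / (tp + fn),        # share of IgAN cases detected
        "specificity": tn / (tn + fp),   # share of controls correctly rejected
        "precision": tp / (tp + fp),     # share of IgAN predictions that are correct
        "accuracy": (tp + tn) / total,   # share of all cases judged correctly
    }


# Test-set counts taken from the text.
n_igan, n_control = 124, 85
tp, tn = 114, 35
metrics = confusion_metrics(tp=tp, fn=n_igan - tp, tn=tn, fp=n_control - tn)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

The recall is 114/124 and the accuracy is (114 + 35)/209, which reproduce the reported 91.93% (truncated) and 71.29%. The same counts give a specificity of 35/85, so most of the model's errors are control cases judged as IgAN.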
Table 3
Component matrix. The larger the absolute value of a coefficient, the greater the contribution of that variable to the principal component. Abbreviations: eGFR: estimated glomerular filtration rate, BUN: blood urea nitrogen.
| Clinicopathological features | PC1 | PC2 | PC3 |
|---|---|---|---|
| serum IgA (g/L) | 0.000 | 0.000 | -0.000 |
| serum C3 (g/L) | 0.000 | 0.000 | 0.000 |
| urinary protein quantification (g/24h) | 0.004 | 0.003 | -0.001 |
| number of glomeruli under light microscope | -0.007 | -0.000 | 0.002 |
| spherical hardening number | 0.003 | 0.002 | -0.002 |
| spheroid hardening rate (%) | 0.029 | 0.009 | -0.012 |
| segment hardening number | 0.000 | 0.001 | -0.001 |
| total crescent formation | 0.005 | -0.003 | 0.001 |
| cell crescent | 0.001 | -0.000 | -0.000 |
| cell fiber crescent | 0.002 | -0.002 | 0.001 |
| fiber crescent | 0.001 | -0.001 | 0.000 |
| glomerular adhesion | 0.000 | 0.000 | 0.000 |
| mesangial cells and mesangial matrix hyperplasia | 0.000 | 0.001 | -0.000 |
| basement membrane condition | 0.004 | 0.001 | -0.001 |
| capillary cavity opening (1,2,3) | 0.002 | 0.001 | -0.001 |
| tubule atrophy (0,1,2,3) | 0.001 | 0.001 | -0.001 |
| interstitial fibrosis (0,1,2,3) | 0.001 | 0.001 | -0.001 |
| whether there is inflammatory cell infiltration in renal interstitial | 0.000 | 0.000 | 0.000 |
| serum creatinine (µmol/L) | 0.881 | -0.365 | 0.298 |
| tonsil abnormalities | -0.000 | 0.000 | 0.000 |
| gross hematuria | -0.000 | -0.000 | 0.000 |
| systolic pressure (mmHg) | 0.036 | 0.014 | -0.006 |
| age (years old) | 0.016 | -0.009 | -0.012 |
| renal insufficiency | 0.002 | 0.001 | -0.000 |
| eGFR | -0.221 | 0.240 | 0.944 |
| BUN (mmol/L) | 0.022 | -0.010 | 0.041 |
| blood uric acid (µmol/L) | 0.415 | 0.899 | -0.130 |