Comparison of Three Prediction Rules for Assessing the Risk of Gastric Cancer in Chinese Health Examination Populations

Background The current consensus regarding gastric cancer screening in China recommends Li’s Scoring System to assess the risk of gastric cancer. Objectives To compare the predictive capacity of three prediction rules, namely the ABC method, the Scoring System from the Japan Public Health Center (JPHC), and Li’s Scoring System, in Chinese health examination populations. Methods We retrospectively evaluated 1,436 patients undergoing gastroscopy. The patients were classified into three groups (low, medium and high risk) according to each rule, and the predictive capacity of the three rules for gastric cancer risk was compared. Results A total of 28 (1.95%) cases of gastric cancer were detected. The Scoring System from JPHC and Li’s Scoring System performed similarly, with areas under the receiver-operating characteristic (ROC) curve (AUC) of 0.745 (95% CI: 0.722–0.767) and 0.739 (95% CI: 0.715–0.761), respectively. The AUC for the ABC method was 0.642 (95% CI: 0.617–0.667), significantly lower than that for the Scoring System from JPHC (p < 0.05). Li’s Scoring System had the highest sensitivity, significantly higher than that of the Scoring System from JPHC (85.71% vs 53.57%, p < 0.05). Larger proportions of low-risk patients were diagnosed with gastric cancer under the Scoring System from JPHC (1.19%) and the ABC method (0.99%) than under Li’s Scoring System (0.55%). Conclusions The Scoring System from JPHC and Li’s Scoring System perform similarly in assessing the risk of gastric cancer, but Li’s Scoring System is more effective in Chinese health examination populations because it has the lowest probability of missed diagnosis.


Background
Gastric cancer is a common malignant tumor in China. In 2015, an estimated 679,100 Chinese were newly diagnosed with gastric cancer, and 498,000 people in China ultimately died from the disease, accounting for approximately 50% of new cases and deaths worldwide [1]. The prognosis for gastric cancer is closely related to the stage of the disease at the time of diagnosis [2][3][4]. At present, the detection rate of early gastric cancer in China is less than 10%, far less than in Korea and Japan, where national cancer screening programs have been launched [5][6][7][8][9].
The current screening guidelines for gastric cancer in China suggest beginning gastric cancer screening at the age of 40 for all "high risk" populations, viz., those with Helicobacter pylori (H. pylori) infections; a positive family history of gastric cancer; precancerous diseases of gastric cancer such as chronic atrophic gastritis, gastric ulcers, and gastric polyps; risk factors for gastric cancer such as smoking, heavy alcohol drinking, and high salt diets; or those residing in high incidence regions for more than 3 years [10]. Upper gastrointestinal endoscopies in combination with mucosal biopsies for histological examination are currently regarded as the gold standard for diagnosing gastric cancer. However, "high risk" populations are estimated to exceed 300 million people in China [11]. It is difficult to carry out a mass screening program for entire populations, owing to high costs and insufficient endoscopists and equipment. Consequently, the current status of diagnosis and treatment of gastric cancer can be improved by stratifying the risk of gastric cancer and selecting high-risk individuals for gastroscopy.
To characterize the risk of gastric cancer, a number of prediction rules that incorporate multiple risk factors have been developed. Miki developed the ABC method (Table 1), which combines H. pylori serology with pepsinogen (PG) levels; screened individuals are classified into four groups based on the results: Group A, Group B, Group C, and Group D [12]. The detection rate for gastric cancer decreases in the order of groups D, C, B, and A, and thus the ABC method can identify individuals at a higher risk for future gastric cancer development [13][14]. In 2016, a JPHC-based prospective study (Table 1) developed a scoring system (the Scoring System from JPHC) for estimating the cumulative probability of gastric cancer occurrence based on a sample of 19,028 individuals, combining age, sex, lifestyle habits (smoking status, consumption of high-sodium food), family history of gastric cancer, and biological information (serum anti-H. pylori IgG titers, PG levels) [15]. In recent years, a nationwide multicenter cross-sectional study [16] conducted by the China National Clinical Research Center for Digestive Diseases (Shanghai) found that age, sex, pepsinogen I/pepsinogen II ratio (PGR), gastrin-17 (G-17), and H. pylori antibodies were significantly associated with the risk of gastric cancer, and developed Li's Scoring System to predict the risk. The study reported that the prediction rule was an accurate, practical, and cost-effective prescreening tool for gastric cancer in Chinese high risk populations with good discrimination (AUC of 0.76), high sensitivity (70.8%), and good calibration (0.629), reducing the need for gastroscopies by 66.7%.
However, it is unclear which prediction rule performs best at identifying populations with a higher risk of gastric cancer as candidates for gastroscopies among Chinese health examination populations. Therefore, we aimed to compare the predictive ability of the ABC method, the Scoring System from JPHC, and Li's Scoring System in Chinese health examination populations.
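The four-group logic of the ABC method described above can be sketched in a few lines of Python. This is a minimal illustration, not the rule as tabulated in Table 1: the cut-offs used here (PG I ≤ 70 ng/mL together with a PG I/PG II ratio ≤ 3.0 defining a positive PG test) are the conventional values for the method and are an assumption with respect to this study, and the function and constant names are hypothetical.

```python
# Assumed cut-offs: the conventional ABC definition of a "positive" PG test.
# The values actually used in Table 1 of this study may differ.
PG1_CUTOFF_NG_ML = 70.0   # PG I threshold, ng/mL (assumption)
PGR_CUTOFF = 3.0          # PG I / PG II ratio threshold (assumption)


def abc_group(hp_positive: bool, pg1: float, pg2: float) -> str:
    """Return the ABC group ("A", "B", "C" or "D") for one individual.

    hp_positive: result of H. pylori antibody serology.
    pg1, pg2:    serum pepsinogen I and II concentrations in ng/mL.
    """
    if pg2 <= 0:
        raise ValueError("PG II must be positive to compute the PG I/II ratio")

    # A PG test is positive only when BOTH criteria are met.
    pg_positive = pg1 <= PG1_CUTOFF_NG_ML and (pg1 / pg2) <= PGR_CUTOFF

    if not pg_positive:
        # PG-negative: group depends on H. pylori status alone.
        return "B" if hp_positive else "A"
    # PG-positive: H. pylori-positive is group C, H. pylori-negative is group D.
    return "C" if hp_positive else "D"


if __name__ == "__main__":
    # Hypothetical serology values, for illustration only.
    print(abc_group(hp_positive=False, pg1=90.0, pg2=10.0))  # A
    print(abc_group(hp_positive=True, pg1=40.0, pg2=20.0))   # C
```

How groups A to D map onto the low, medium and high risk categories used in this study is defined in Table 1 and is deliberately left out of the sketch.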

Patients
We retrospectively reviewed patients who had come to the physical examination center for a health examination and undergone upper gastrointestinal tract endoscopy at Songjiang Hospital Affiliated to Shanghai Jiaotong University School of Medicine between March 2019 and September 2020. Serum H. pylori antibody, G-17, PG I, and PG II data were available. Inclusion criteria: male participants aged 40-79 years and female participants aged 50-79 years. Exclusion criteria: (1) patients who had received a gastroscopy within 1 year; (2) those with a history of gastrectomy; (3) those treated with a proton pump inhibitor or histamine-2 receptor antagonist within 2 weeks; (4) those with ischemic anemia within 6 months, gastrointestinal bleeding within 12 months, weight loss, frequent diarrhea, dysphagia or choking, or an abdominal mass; (5) those with a medical history of esophageal cancer, gastric cancer, colorectal cancer, inflammatory bowel disease, or malignancies of other organs such as the breast, ovary, uterus, or urinary system; (6) those highly suspected of having a tumor based on imaging, tumor markers, or other examinations; (7) those with serious mental disorders or serious heart, lung, or renal dysfunction. The study was approved by the Ethics Committee at Songjiang Hospital, Shanghai Jiaotong University School of Medicine (201803), and conformed to the ethical guidelines of the 1975 Helsinki Declaration. Written informed consent was obtained from all subjects.

Serological tests
Fasting peripheral venous blood samples were collected from eligible subjects during a 10-hour smoking-free and alcohol-free period to test PGI, PGII, G-17, and H. pylori antibodies. After centrifugation, serum aliquots were stored at 2-8°C and assayed within 3 hours. Serum PGI, PGII, G-17, and H. pylori antibodies were measured using the Gastro-Panel Mucosal Serological Test Kit (Helicobacter Pylori Antibody Classification Assay Kit: Shenzhen Blot Biotech Co., Ltd; PGI, PGII, and G-17 Test Kit: Biohit Healthcare (Hefei) Co., Ltd).

Gastroscopic and histological examinations
Gastroscopies were performed by two fixed endoscopists, each with more than 10 years of gastroscopy experience, using Olympus GIF-H260Z series electronic gastroscopes (Olympus Corporation, Tokyo, Japan). Patients without obvious lesions underwent biopsies of the distal greater curvature and distal lesser curvature of the gastric antrum, the gastric angulus, and the middle part of the lesser and greater curvatures of the gastric body. Patients with obvious lesions underwent biopsies at those locations. All biopsy specimens were placed on filter paper, fixed in a 10% formalin solution, and sent to the Department of Pathology of our hospital for histopathological diagnosis. The final diagnosis of each histological specimen was made independently by two experienced pathologists.

Treatment and final diagnosis of early cancer cases
We performed magnifying endoscopies with narrow band imaging (MENBI) and chromoendoscopies on patients diagnosed with indefinite intraepithelial neoplasms, low grade intraepithelial neoplasms (LGIN), and high grade intraepithelial neoplasms (HGIN), using Olympus GIF-H260Z series electronic gastroscopes (Olympus Corporation, Tokyo, Japan). Patients identified as having early cancer underwent endoscopic submucosal dissection (ESD) or surgery. The final diagnoses were based on the pathology of the endoscopy and postoperative pathology. According to the consensus on screening, endoscopic diagnosis and treatment of early gastric cancer in China [11], HGIN was equivalent to severe dysplasia and carcinoma in situ. On the basis of the 2010 Japanese standard, HGIN was equivalent to well-differentiated tubular adenocarcinoma (tub1) or moderately differentiated tubular adenocarcinoma (tub2). Consequently, we classified HGIN as early cancer.

Statistical analysis
Continuous variables were presented as the mean with standard deviation and were compared using Student's t-test. Categorical variables were presented as counts with corresponding percentages and were compared using the χ² test. The discrimination of the three prediction rules was assessed by sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), ROC curves, and AUC. To compare AUCs, we used the DeLong method [17]. All statistical analyses were performed using SPSS 19.0, except for the comparison of the paired ROC curves and AUC, for which we used MedCalc 12.0.

Risk category of the three prediction rules in study populations
The results with regard to risk category are shown in Table 3. For the ABC method, 56.48% of the participants were categorized into the low risk group, whereas 43.45% and 0.07% were in the medium risk and high risk groups, respectively. For the Scoring System from JPHC, 76.32% of the participants were categorized into the low risk group, whereas 18.87% and 4.81% were in the medium risk and high risk groups, respectively. For Li's Scoring System, 37.95% of the participants were categorized into the low risk group, whereas 40.74% and 21.31% were in the medium risk and high risk groups, respectively.

Detection of gastric cancer in study populations
The detection rate of gastric cancer in each risk category of the three prediction rules is shown in Table 3.
The proportion of gastric cancer in low risk groups was highest for the Scoring System from JPHC (1.19%), followed by the ABC method (0.99%) and Li's Scoring System (0.55%). Among the 13 cases of gastric cancer in the low risk group for the Scoring System from JPHC, 11 cases (84.62%) were 60-79 years old (60-79 years vs. 40-59 years, p < 0.05), and the detection rate was similar between men and women. In the low risk group for the ABC method, all cases of gastric cancer were men aged 60-79 years. In contrast, in the low risk group for Li's Scoring System, all cases of gastric cancer were female. The detection rates of gastric cancer in the medium risk and high risk groups for the ABC method were 3.21% and 0%, respectively; for the Scoring System from JPHC, they were 2.95% and 10.14%, respectively; and for Li's Scoring System, they were 1.89% and 4.58%, respectively. Thus, the detection rate of gastric cancer in the medium risk and high risk groups was significantly higher than that of the low risk group for the three prediction rules (p1 < 0.05, p2 < 0.001, p3 < 0.05). The detection rate for early gastric cancer in the medium risk and high risk groups was 100% for both the Scoring System from JPHC and Li's Scoring System, and it was 90% for the ABC method.
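The comparison between the low risk group and the combined medium and high risk groups can be illustrated with a hand-computed χ² test. The sketch below uses the Scoring System from JPHC as an example. The 2×2 counts are reconstructed from figures stated in the text (1,436 participants, 28 cancers, 1,096 low-risk participants of whom 13 had cancer) and are therefore an assumption, as is the use of an uncorrected Pearson test with one degree of freedom; the function name is hypothetical, and the published analysis was run in SPSS.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

        [[a, b],
         [c, d]]

    Returns (chi2 statistic, two-sided p-value) with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts reconstructed from the text for the Scoring System from JPHC (assumption).
total, cancers = 1436, 28
low_n, low_cancer = 1096, 13
high_n = total - low_n                # medium + high risk participants
high_cancer = cancers - low_cancer    # cancers outside the low risk group

chi2, p = chi_square_2x2(
    low_cancer, low_n - low_cancer,
    high_cancer, high_n - high_cancer,
)

if __name__ == "__main__":
    print(f"low risk:    {low_cancer}/{low_n} = {low_cancer / low_n:.2%}")
    print(f"higher risk: {high_cancer}/{high_n} = {high_cancer / high_n:.2%}")
    print(f"chi2 = {chi2:.2f}, p = {p:.5f}")
```

Under these reconstructed counts the p-value falls below 0.001, in line with the significance level reported for this scoring system; a continuity-corrected or exact test would give a somewhat different value.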

Performance of the prediction rules
ROC curves for each prediction rule are depicted in Figure 1. The AUC of the Scoring System from JPHC was highest at 0.745 (0.722-0.767), followed by Li's Scoring System at 0.739 (0.715-0.761), and the ABC method, with an AUC of 0.642 (0.617-0.667). The AUC for the Scoring System from JPHC was significantly higher than for the ABC method (p < 0.05). No difference was found between the AUC of the Scoring System from JPHC and Li's Scoring System (p > 0.05). When categorized as low risk versus higher risk (medium and high risk groups), Li's Scoring System had the highest sensitivity (85.71%), and the Scoring System from JPHC had the lowest (53.57%). This difference in sensitivity was statistically significant (p < 0.05). All differences in specificity were highly statistically significant (all p-values < 0.001). All three prediction rules had high NPV but low PPV (Table 4).
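The low risk versus higher risk dichotomy above reduces each rule to a 2×2 table, from which sensitivity, specificity, PPV and NPV follow directly. The sketch below shows the arithmetic for the Scoring System from JPHC. The cell counts are reconstructed from figures stated in the text (1,436 participants, 28 cancers, 1,096 low-risk participants of whom 13 had cancer), so they are an assumption and may not reproduce every value in Table 4 exactly; the function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard 2x2 diagnostic accuracy measures.

    "Positive" means classified as medium or high risk;
    "negative" means classified as low risk.
    """
    return {
        "sensitivity": tp / (tp + fn),  # cancers flagged as higher risk
        "specificity": tn / (tn + fp),  # non-cancers classified as low risk
        "ppv": tp / (tp + fp),          # cancer rate among higher-risk patients
        "npv": tn / (tn + fn),          # cancer-free rate among low-risk patients
    }


# Counts reconstructed from the text for the Scoring System from JPHC (assumption).
total, cancers = 1436, 28
low_n, low_cancer = 1096, 13

fn = low_cancer                # cancers missed in the low risk group
tn = low_n - low_cancer        # low-risk participants without cancer
tp = cancers - fn              # cancers in the medium/high risk groups
fp = (total - low_n) - tp      # higher-risk participants without cancer

jphc = diagnostic_metrics(tp, fp, fn, tn)

if __name__ == "__main__":
    for name, value in jphc.items():
        print(f"{name}: {value:.2%}")
```

With a prevalence of about 2%, the PPV stays low even when specificity is reasonable, while the NPV is dominated by the large number of cancer-free participants. This is the pattern of high NPV and low PPV reported for all three rules.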

Discussion
Usually, symptoms of gastric cancer are absent or nonspecific in the early stages of the disease, and they are easily ignored [18]. A prospective study in China involving 102,665 subjects found that 48% of patients with gastrointestinal malignancy had no indicatory features, and the detection rate of gastrointestinal tumors among those patients was 2.5% [19]. Thus, such indicatory features alone cannot predict gastrointestinal tumors, and regular endoscopic screening of high-risk patients is key for early diagnosis of gastric cancer. However, the adaptability of the existing risk prediction rules to Chinese health examination populations is not yet clear. Therefore, in this study, we compared the efficacy of three prediction rules: the ABC method, the Scoring System from JPHC, and Li's Scoring System.
In our analysis, the comparison of three prediction rules in 1,436 participants showed that the detection rate of gastric cancer was significantly higher in the medium risk and high risk groups than in the low risk group (p1 < 0.05, p2 < 0.001, p3 < 0.05) (Table 3), and the detection rate of early gastric cancer in the medium risk and high risk groups reached at least 90%, indicating that the three prediction rules can improve the detection rate of early gastric cancer. We noticed that all cases of advanced gastric cancer were diffuse-type lesions, which were detected in the low risk group for both the Scoring System from JPHC and Li's Scoring System. Previous studies reported that these scoring systems did not perform well for diffuse-type gastric cancer [15][16].
One valuable role of prediction rules is to identify patients as low risk in order to avoid further testing and save medical resources. An ideal risk prediction rule would identify a sizeable low risk group with a low detection rate of gastric cancer, so as to minimize missed diagnoses. However, it is worth noting that a number of studies have reported that gastric cancer and precancerous lesions were detected in the low risk groups of several prediction rules [16][20][21]. In our study, the Scoring System from JPHC identified the most patients as low risk (1096, 76.32%), followed by the ABC method (811, 56.48%) and Li's Scoring System (545, 37.95%). The missed diagnosis rate of gastric cancer was highest for the Scoring System from JPHC (1.19%), and 84.62% of these patients were aged 60-79 years. Therefore, missed diagnoses should be especially considered for elderly patients when using this scoring system. The missed diagnosis rate of the ABC method was 0.99%, and all cases were males, which should be taken seriously. Furthermore, patients in group A with a history of H. pylori eradication may need gastroscopy. Miura et al. reported that about 20% of patients with gastric cancer have a history of H. pylori eradication, and they have normal PG levels and negative anti-H. pylori IgG titres [22]. Although Li's Scoring System had the lowest rate of missed diagnosis (0.55%), all missed patients were females. Thus, it is still necessary to consider missed diagnoses among female patients, because Li's Scoring System assigns 0 points for female sex.
In this study, the ABC method showed the least predictive capacity for gastric cancer with the lowest AUC (0.642, 95% CI: 0.617-0.667), which was significantly lower than that of the Scoring System from JPHC (p < 0.05). Moreover, the specificity was significantly lower than that of the Scoring System from JPHC (p < 0.001) and Li's Scoring System (p < 0.001). Consistent with previous research, the ABC method for gastric cancer screening did not work well in Chinese populations, with lower sensitivity and specificity than those in non-Chinese studies [23][24]. This may be related to the fact that the ABC method was developed in Japanese populations, and the optimal PG and G-17 cut-off values may not be applicable to screening for gastric cancer in China. Furthermore, the ABC method only includes the results of serological indicators without demographic factors (e.g. age, sex, etc.) and lifestyle habits (e.g. smoking status, consumption of alcohol and high-sodium food, etc.), which may affect its predictive ability.
The Scoring System from JPHC had the highest AUC (0.745, 95% CI: 0.722-0.767), followed by Li's Scoring System (0.739, 95% CI: 0.715-0.761), and these two prediction rules performed similarly. However, the Scoring System from JPHC had the greatest likelihood of missed diagnosis of gastric cancer, with the lowest sensitivity (53.57%) and NPV (98.90%), which may be explained by the weights assigned to indicators (e.g. H. pylori infection) and by regional differences. The Scoring System from JPHC was developed based on a large group of Japanese participants and included a number of variables, such as demographic factors, lifestyle, and the risk categories of the ABC method. Furthermore, the optimal cut-off values of PG and G-17 may be inappropriate for screening in China [15]. The scoring system did not find any substantial difference between the risk of individuals in categories C and D of the ABC method, which is similar to the findings of several published studies [25][26]. Nevertheless, the scoring system emphasized the important effect of smoking and high sodium food consumption on the occurrence of gastric cancer, and the quantification of these lifestyle-related risk factors might provide an incentive for adopting healthier lifestyles, particularly for high-risk individuals [27].
Li's Scoring System performed well in predicting the risk of gastric cancer and had the lowest probability of missed diagnosis, with a moderate AUC (0.739) and the highest sensitivity (85.71%) and NPV (99.50%). These results are similar to the originally published values [16]. The prediction rule was developed based on a multicenter, large-sample Chinese population, and the cut-off values of PG and G-17 were more in line with screening for gastric cancer in China. For example, the cut-off values of PGR in Li's Scoring System and the Scoring System from JPHC are 3.89 and 3, respectively. However, it is worth noting that the efficacy of Li's Scoring System should be further verified in order to improve discrimination and reduce missed diagnoses among females. H. pylori infection has been considered the key cause of intestinal-type gastric cancer, and the efficacy of screening may be improved by distinguishing two types of H. pylori strains (strains expressing CagA and VacA, and strains without CagA and VacA).
Our study had potential limitations. First, we studied the ABC method, the Scoring System from JPHC, and Li's Scoring System, although several other prediction rules are available [28,17]. Second, we only compared the efficacy of three prediction rules, and did not analyze possible risk factors (e.g., obesity and the consumption of fried food, etc.), which could be done by gathering more cases in future studies. Finally, the sample size was small. We included only 1,436 health examination participants, which suggests that awareness of gastric cancer screening among Chinese people is limited and that education on this topic is needed.

Conclusion
In conclusion, we report that Li's Scoring System is more useful in Chinese health examination populations because it has the lowest possibility of missed diagnoses of gastric cancer, though no significant differences were found between the Scoring System from JPHC and Li's Scoring System in predicting the risk of gastric cancer. However, missed diagnoses among female patients for Li's Scoring System cannot be ignored, and future studies are needed to validate its efficacy and to explore a screening process that is more practical in China.

Consent for publication
All named authors have agreed to publication of the paper.

Availability of data and materials
Currently, the data are not yet openly available. The study group welcomes potential collaboration to maximize use of existing resources. The datasets analysed in this study are available from the corresponding author on reasonable request.

Competing interests
The authors have declared no conflicts of interest.

JPHC, Japan Public Health Center.
1, 2, 3: The detection rate of gastric cancer in the high risk group was significantly higher than that in the low risk group for the three prediction rules (p1 < 0.05, p2 < 0.001, p3 < 0.05).
4: In the low risk group for the Scoring System from JPHC, the detection rate of gastric cancer in patients aged 60-79 years was significantly higher than that in those aged 50-59 years (p < 0.05).
The AUC for the Scoring System from JPHC was significantly higher than for the ABC method (p < 0.05).