This study was a post hoc analysis of results from our previous single-centre observational study [12]. We used images of 118 consecutive lesions in 114 patients who underwent ESD performed by a single endoscopist (Y.H.) from November 2016 to July 2019. Patient images and information were extracted from electronic medical records. ME-NBI was performed before treatment, on a different day from the treatment. Before the examination, a soft hood (MB-46; Olympus Medical Systems) was mounted on the tip of the endoscope so that the endoscopist could consistently maintain a distance of approximately 2 mm between the endoscope tip and the mucosa. We first performed endoscopy with white-light imaging, after which ME-NBI was performed to distinguish the cancerous and non-cancerous areas. Specifically, the demarcation line between the cancerous and non-cancerous areas was identified under low magnification, and the target area was then observed under high magnification. Finally, endoscopy was performed after indigo carmine spraying.
The inclusion criterion was as follows: cases with ME-NBI images at the most oral side of the cancerous area and of the adjacent non-cancerous area (one image each of the cancerous and non-cancerous areas per case). The exclusion criteria were cases without ME-NBI images and cases with unclear images owing to the presence of mucus, blood, or halation.
According to the gastric cancer treatment guidelines [13], the cancerous and non-cancerous regions were confirmed in all cases based on post-ESD pathological results, which are considered the gold standard. All images were selected by one instructor of the Japan Gastroenterological Endoscopy Society (JGES) (Y.H.). In addition, another instructor of the JGES (T.H.) confirmed that all images met the inclusion criteria. GIF-H260Z and GIF-H290Z (Olympus Medical Systems, Tokyo, Japan) were used for ME-NBI.
Endoscopists involved in diagnosing the images and the diagnostic method
Thirty-three endoscopists with specialized training in ME-NBI across 19 institutions participated in the diagnostic process. The VS classification system was used for diagnosis [7, 8]. The VS classification system was established based on diagnoses made by endoscopists with ME-NBI training at specialized facilities [7]. A previous study has reported that diagnostic performance is better among these endoscopists than among those without training [14]. Therefore, because diagnoses by endoscopists without specialized training in ME-NBI may not adequately reflect the accuracy of the ME-NBI diagnosis, endoscopists with specialized training in ME-NBI were selected.
Each endoscopist evaluated each image using the terms regular, irregular, absent, or inconclusive. If either MV or MS was rated “irregular,” the lesion was diagnosed as cancerous; with any other combination of findings, the lesion was not diagnosed as cancerous. Representative ME-NBI images in which the VS classification system was used are shown in Fig. 2.
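The decision rule described above can be written as a short predicate. This is only an illustrative sketch: the function name and the string labels are assumptions introduced here, not part of the study's data format.

```python
from typing import Literal

# The four rating terms used by the endoscopists.
Finding = Literal["regular", "irregular", "absent", "inconclusive"]


def diagnosed_as_cancer(mv: Finding, ms: Finding) -> bool:
    """Return True when either MV or MS is rated irregular.

    Any other combination of ratings is not diagnosed as cancerous.
    """
    return mv == "irregular" or ms == "irregular"


# Hypothetical ratings for two images.
print(diagnosed_as_cancer("irregular", "regular"))  # irregular MV
print(diagnosed_as_cancer("regular", "absent"))     # no irregular finding
```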
In our previous study [12], the diagnostic results of cancer or non-cancer provided by each endoscopist were aggregated to calculate the diagnostic performance among all participating endoscopists. In the present study, the diagnostic results (regular, irregular, absent, or inconclusive) for the MS and/or MV of each image were collected using the original data of the previous study, and the diagnostic performance based on MS and/or MV was calculated for all images.
Evaluation criteria
This study was conducted in accordance with the Standards for the Reporting of Diagnostic Accuracy Studies 2015 guidelines [15]. For each image, the classification (regular, irregular, absent, or inconclusive) chosen most frequently by the endoscopists was regarded as the final diagnosis. When two or more classifications tied for the maximum number of answers, the final diagnosis was regarded as inconclusive. However, if the agreement rate among the endoscopists for an image was low, the reliability of the diagnosis was deemed low and generalization was considered difficult. Therefore, we also calculated the diagnostic agreement rate for each image, defined as the ratio of the maximum number of answers for a given classification to the total number of answers.
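As a minimal sketch of the per-image rule above, the following function returns the most frequent classification (or inconclusive on a tie) together with the agreement rate. The function name and the example answers are hypothetical and serve only to illustrate the calculation.

```python
from collections import Counter


def final_diagnosis(answers):
    """Return (final classification, agreement rate) for one image.

    answers: one classification per endoscopist.
    A tie for the maximum count yields "inconclusive".
    """
    counts = Counter(answers)
    top = max(counts.values())
    winners = [label for label, n in counts.items() if n == top]
    label = winners[0] if len(winners) == 1 else "inconclusive"
    # Agreement rate: maximum count for one classification / all answers.
    agreement = top / len(answers)
    return label, agreement


# Hypothetical answers from 33 endoscopists for a single image.
answers = ["irregular"] * 20 + ["regular"] * 10 + ["inconclusive"] * 3
label, agreement = final_diagnosis(answers)
print(label, round(agreement, 3))
```

With these illustrative answers, 20 of 33 endoscopists chose the same classification, so the agreement rate is 20/33.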
We calculated the median and interquartile range (IQR) of the diagnostic agreement rate across all images for the MV and MS diagnoses. The MV and MS diagnoses were then aggregated separately for the cancer and non-cancer images.
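The summary of agreement rates across images can be sketched with Python's standard library. The quartile convention used here (the "inclusive" method) is an assumption and may differ from the one applied by the statistical software used in the study; the rates are made-up values.

```python
import statistics


def summarize_agreement(rates):
    """Return the median and the (Q1, Q3) pair of per-image agreement rates."""
    # Quartile method is an assumption; other conventions give
    # slightly different Q1 and Q3 for small samples.
    q1, _, q3 = statistics.quantiles(rates, n=4, method="inclusive")
    return statistics.median(rates), (q1, q3)


# Hypothetical agreement rates for five images.
rates = [0.5, 0.6, 0.7, 0.8, 0.9]
median, (q1, q3) = summarize_agreement(rates)
print(median, q1, q3)
```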
Based on the aggregated results, we calculated the diagnostic accuracy, sensitivity, specificity, PPV, and NPV for the diagnosis based on MV alone (irregular pattern indicative of cancer), MS alone (irregular pattern indicative of cancer), and the combination of MS and MV (cancer diagnosed if either had an irregular pattern). Subsequently, we compared the diagnostic performance between MV alone and MS alone and between MV alone and the combination of MS and MV. The classification of “inconclusive” was considered an incorrect answer.
We defined accuracy as follows: (number of correctly diagnosed cancerous lesions among actual cancerous lesions plus number of correctly diagnosed non-cancerous tissues among actual non-cancerous tissues)/total number of images.
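The performance measures above can be illustrated with a short calculation. This is a sketch under stated assumptions: the function name, the label strings, and the example data are hypothetical, and the handling of inconclusive calls in PPV and NPV (they count as incorrect but are left out of both denominators) is an assumption, since the text specifies only that inconclusive is treated as an incorrect answer.

```python
def diagnostic_metrics(truth, calls):
    """Compute accuracy, sensitivity, specificity, PPV, and NPV.

    truth: list of bool, True if the image shows cancer on pathology.
    calls: list of "cancer", "non-cancer", or "inconclusive".
    Inconclusive calls are never counted as correct.
    """
    pairs = list(zip(truth, calls))
    tp = sum(t and c == "cancer" for t, c in pairs)
    tn = sum((not t) and c == "non-cancer" for t, c in pairs)
    fp = sum((not t) and c == "cancer" for t, c in pairs)
    fn = sum(t and c == "non-cancer" for t, c in pairs)
    n_cancer = sum(truth)
    n_non_cancer = len(truth) - n_cancer
    return {
        "accuracy": (tp + tn) / len(truth),
        "sensitivity": tp / n_cancer,
        "specificity": tn / n_non_cancer,
        # Assumption: inconclusive calls are excluded from these denominators.
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical data: four cancer images followed by four non-cancer images.
truth = [True] * 4 + [False] * 4
calls = ["cancer", "cancer", "non-cancer", "inconclusive",
         "non-cancer", "non-cancer", "non-cancer", "cancer"]
metrics = diagnostic_metrics(truth, calls)
print(metrics)
```

In this example, five of eight images are correctly classified, so the accuracy is 5/8; the inconclusive cancer image lowers sensitivity to 2/4.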
In addition, to identify the factors that contributed to the additive effects in cancer images, we compared lesion characteristics (location, macroscopic type, tumour diameter, depth, ulcerative findings, histological type, history of Helicobacter pylori infection) between the cases that were correctly diagnosed using MV alone and those that were misdiagnosed using MV and correctly diagnosed using MS. The cutoff values for tumour diameter were determined with reference to the median value for all tumours.
H. pylori-uninfected (HPU) cases were defined as follows: 1) no prior H. pylori eradication, 2) negative results on the urea breath test (UBIT, Otsuka, Tokushima, Japan), 3) negative results for the H. pylori antibody (H. pylori antibody Ⅱ, EIKEN Co., Ltd.), 4) negative pepsinogen (PG) test results (positive cutoff level: PGI ≤70 ng/mL and PGI/II ratio ≤3), 5) endoscopically confirmed positive regular arrangement of collecting venules in the lower gastric body [16], and 6) histologically confirmed HPU and negative inflammatory cell infiltration activity based on the updated Sydney system [17]. Cases that did not meet these criteria were considered to have a history of H. pylori infection.
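The HPU definition can be expressed as a single check, assuming that all six criteria must be satisfied together. The record type and field names below are hypothetical and do not reflect how the data were stored.

```python
from dataclasses import dataclass


@dataclass
class HpWorkup:
    """Hypothetical record of one patient's H. pylori work-up."""
    prior_eradication: bool
    urea_breath_test_positive: bool
    antibody_positive: bool
    pg1_ng_ml: float
    pg1_pg2_ratio: float
    rac_present: bool          # regular arrangement of collecting venules
    histology_negative: bool   # HPU, no inflammatory activity on histology


def pg_test_positive(pg1_ng_ml, pg1_pg2_ratio):
    """Positive cutoff from the text: PGI <= 70 ng/mL and PGI/II ratio <= 3."""
    return pg1_ng_ml <= 70 and pg1_pg2_ratio <= 3


def is_hp_uninfected(w):
    """Assumption: a case is HPU only when all six criteria hold."""
    return (
        not w.prior_eradication
        and not w.urea_breath_test_positive
        and not w.antibody_positive
        and not pg_test_positive(w.pg1_ng_ml, w.pg1_pg2_ratio)
        and w.rac_present
        and w.histology_negative
    )


# Hypothetical patient meeting every criterion.
case = HpWorkup(False, False, False, 85.0, 5.2, True, True)
print(is_hp_uninfected(case))
```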
Cases were assigned to the H. pylori-infected with eradication group if they met either of the following criteria: (1) for patients who had undergone H. pylori eradication at our hospital or another hospital, a negative H. pylori antibody result (H. pylori antibody Ⅱ, EIKEN Co., Ltd.) or a negative 13C-urea breath test result (UBIT, Otsuka); or (2) for patients who were positive for H. pylori antibodies or had a positive urea breath test result at the first examination at our hospital, a confirmed negative urea breath test result obtained at least 4 weeks after initiating H. pylori eradication. The H. pylori infection without eradication group comprised cases that did not meet these inclusion criteria.
Ethical considerations
This study was approved by the institutional review board of the Cancer Institute Hospital in Tokyo, Japan (approval no. 2019-1032) and was performed in accordance with the principles embodied in the Declaration of Helsinki and its later amendments. While recording the data for this study, all personal identifying information was removed. Informed consent was obtained from each patient for the use of pathological specimens and imaging data for research purposes.
Statistical analysis
The median and IQR were used when calculating the diagnostic agreement rate for all images. McNemar tests with 95% CIs were used to compare the diagnostic accuracy, sensitivity, and specificity among diagnoses made based on MV alone, MS alone, and the combination of MS and MV. Fisher’s exact tests with 95% CIs were used to compare PPV and NPV among the three diagnostic methods.
The statistical significance level was set at P < 0.025 (0.05/2) using the Bonferroni method for the two pairwise comparisons in the same population (MV alone vs. MS alone, and MV alone vs. the combination of MS and MV). Fisher’s exact test was used when comparing lesion characteristics between cases correctly diagnosed based on MV alone and those correctly diagnosed based on the combination of MS and MV, with the level of statistical significance set at P < 0.05. JMP v13.2 (SAS Institute, Cary, NC, USA) was used to perform the analyses.
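To illustrate how a paired comparison is judged against the Bonferroni-adjusted threshold, the sketch below implements an exact (binomial) McNemar test on the discordant pairs. This is not the JMP procedure used for the analyses, which may apply a different variant of the test; the function name and the discordant counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value.

    b: images correct with method A only.
    c: images correct with method B only.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Bonferroni-adjusted threshold for two pairwise comparisons.
ALPHA = 0.05 / 2

# Hypothetical discordant counts for one comparison.
p = mcnemar_exact(1, 9)
significant = p < ALPHA
print(p, significant)
```

Only the discordant images enter the test; images on which both methods agree do not affect the P value.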