Deep Learning for Computer-aided Diagnosis of Pneumoconiosis

DOI: https://doi.org/10.21203/rs.3.rs-460896/v1

Abstract

Background: The diagnosis of pneumoconiosis relies primarily on chest radiographs and exhibits significant variability between physicians. Computer-aided diagnosis (CAD) can improve the accuracy and consistency of these diagnoses. However, CAD based on machine learning requires extensive human intervention and time-consuming training. As such, deep learning has become a popular tool for the development of CAD models. In this study, the clinical applicability of CAD based on deep learning was verified for pneumoconiosis patients.

Methods: Chest radiographs were collected from 5424 occupational health examinees who met the inclusion criteria. The data were divided into training, validation, and test sets. The CAD algorithm was trained on the training set and tuned on the validation set, while the test set was used to evaluate diagnostic efficacy. Three junior and three senior physicians provided independent diagnoses using images from the test set, followed by a comprehensive diagnosis made with reference to the CAD results. A receiver operating characteristic (ROC) curve was used to evaluate the diagnostic efficiency of the proposed CAD system. A McNemar test was used to evaluate diagnostic sensitivity and specificity for pneumoconiosis, both before and after the use of CAD. A kappa consistency test was used to evaluate the diagnostic consistency of both the algorithm and the clinicians.

Results: ROC results suggested the proposed CAD model achieved high accuracy in the diagnosis of pneumoconiosis, with a kappa value of 0.90. The sensitivity, specificity, and kappa values for the junior doctors increased from 0.86 to 0.98, 0.68 to 0.86, and 0.54 to 0.84, respectively (p<0.05), when CAD was applied. However, metrics for the senior doctors were not significantly different.

Conclusion: Deep learning-based CAD can improve the sensitivity, specificity, and consistency of pneumoconiosis diagnoses, particularly for junior physicians.

Introduction

Pneumoconiosis is the most widely distributed and harmful occupational disease in China [1, 2]. It is characterized by diffuse fibrosis of lung tissue, caused by the long-term inhalation and retention of inorganic mineral dust in the lungs during occupational activities. The cumulative number of cases has reached one million in China, and it continues to increase at a rate of more than 20,000 new cases per year [2]. Dust inhaled through the respiratory tract can include silicon dioxide particles, which cause alveolitis, focal lesions, nodular lesions, dust fibrosis, massive fibrosis, and other pathological changes in the lungs. These lesions primarily manifest as small rounded or irregular opacities, diffuse interstitial fibrosis of the lungs, and silicotic masses.

Conventional classification of pneumoconiosis is primarily based on the "International Classification of Radiographs of Pneumoconioses" guidelines issued by the International Labour Organization in 2011 [3, 4]. The current standard in China is the "Diagnosis of Occupational Pneumoconiosis" (GBZ 70-2015), which is based on the correct interpretation of chest X-ray films, using the profusion of small opacities, their distribution across lung zones, and pleural plaques to diagnose and stage pneumoconiosis [5, 6].

As the occurrence and progression of pneumoconiosis is a continuous process, the profusion of small opacities in chest radiographs also increases gradually. While professional training, comparisons with standard films, and improvements to imaging equipment and technology can increase the diagnostic accuracy of occupational physicians, inter-clinician differences remain high due to the subjective nature of image interpretation. Prior studies under the same external conditions have suggested that inconsistent judgments of the shape, size, and number of small opacities are the primary cause of inconsistent pneumoconiosis classification [7]. As such, increasing the objectivity, accuracy, and consistency of the diagnostic process is highly important.

Computer-aided diagnosis (CAD) provides an objective methodology for improving the interpretation of medical images. The rapid development of computer technology and medical imaging equipment has enabled the effective combination of artificial intelligence and image processing in recent years, which has led to improvements in detection and evaluation of disease severity [8]. Using computer-generated results as a reference, radiologists can draw more accurate conclusions for disease screenings and cancer risk assessments [9–12].

Several recent studies have investigated the application of CAD technology to the diagnosis of pneumoconiosis. New approaches have been developed for image preprocessing, characterization extraction, classifier selection, and optimization [13–15]. However, acquiring the annotated training data used in conventional machine learning models can be both time- and cost-prohibitive, as lesions must be labeled manually by clinical experts. Subjective error and bias can also become problematic in this process.

The rapid development of artificial intelligence has led to broad applications of deep learning (DL) algorithms for medical image analysis. These models utilize a multi-layer network to facilitate the automated learning of implicit relationships within the data. The resulting characteristics are often more diverse and expressive, particularly in tumor imaging applications. DL can provide semi-supervised or unsupervised autonomous learning of a target image for classification tasks. It can also synthesize images with the same characteristics and imitate the independent learning and analysis capabilities of humans, thereby reducing the subjectivity of extracted features [16–19]. To date, however, no study has evaluated DL-based CAD technology for the imaging diagnosis of pneumoconiosis.

Convolutional neural networks (CNNs) and deep residual networks (DRNs), an extension of CNNs, are common DL algorithms used for image classification [20]. These models provide the advantages of simplicity, practicality, and generalizability. Variants with different numbers of convolutional layers have also been proposed, including ResNet18, ResNet50, and ResNet101; the last of these comprises five stages with 100 convolutional layers and one fully connected layer. These and other similar algorithms have been widely applied to image segmentation, detection, and recognition tasks [21, 22]. This study focused on assessing the value of DL-based CAD technology in diagnosing pneumoconiosis.
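The core mechanism these architectures share is the identity shortcut connection: each block learns only a residual correction F(x) that is added back to its input, which eases the training of very deep networks. A minimal NumPy sketch of one residual block (illustrative only, not the networks' actual implementation):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """One simplified residual block: the weights model the residual
    F(x), and the identity shortcut adds the input back, so the
    output is relu(F(x) + x)."""
    out = relu(x @ w1)    # first weighted transformation
    out = out @ w2        # second weighted transformation
    return relu(out + x)  # identity shortcut, then activation

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8))
w1 = 0.1 * rng.standard_normal((8, 8))
w2 = 0.1 * rng.standard_normal((8, 8))
y = residual_block(x, w1, w2)
print(y.shape)  # (1, 8)
```

Stacking many such blocks (with convolutional rather than dense transformations) yields the ResNet18/50/101 family; deeper variants simply repeat more blocks per stage.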

Materials And Methods

This study was approved by the ethics committee of the Occupational Safety and Health Research Center (No. 2018003). Informed consent was waived in this study due to its retrospective nature.

Patients

Chest radiographs were acquired from occupational health monitoring examinees from January to September 2017. A total of 6020 patient images (all male) were included in the dataset; the patients were aged 22 to 67 years (mean age: 48.62 ± 12.1 y) and had 2–38 years of dust exposure history. The quality and diagnosis of chest radiographs were based on GBZ 70-2015. Inclusion criteria required DR image quality to meet second-level film standards. Exclusion criteria included signs of pneumonia, tumors, tuberculosis, or other lung diseases that might affect the diagnosis. The compositions of the control and study groups, as well as the staging criteria for the study groups, are listed in Tables 1 and 2, respectively. The gold standard was produced by three experienced clinicians: cases diagnosed as positive by two or more experts were classified as positive, and the remaining images were assumed to be negative. Pneumoconiosis in varying stages is shown in Figure 1.

To avoid the problem of over-fitting (common in machine learning), fifty positive and fifty negative images were selected using a random number table method to form the positive and negative case groups of the test set. These 100 cases were used as the test set, and the remaining positive and negative data were divided into training and verification sets in an 80/20 proportion. The training set was used for preliminary network training and the test set was used to evaluate the diagnostic efficiency of the CAD system.
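The split described above can be sketched in Python as follows (hypothetical code, not the authors' scripts; the class counts are taken from Table 1):

```python
import random

def split_dataset(positives, negatives, n_test=50, val_frac=0.2, seed=0):
    """Draw n_test cases at random from each class for the test set,
    then divide the remainder into training and verification sets
    in an 80/20 proportion."""
    rng = random.Random(seed)
    pos, neg = list(positives), list(negatives)
    rng.shuffle(pos)
    rng.shuffle(neg)
    test = pos[:n_test] + neg[:n_test]
    rest = pos[n_test:] + neg[n_test:]
    rng.shuffle(rest)
    n_val = round(len(rest) * val_frac)
    return rest[n_val:], rest[:n_val], test  # train, verification, test

# Case counts from Table 1: 1344 positive and 4080 negative images
positives = [f"pos_{i}" for i in range(1344)]
negatives = [f"neg_{i}" for i in range(4080)]
train, val, test = split_dataset(positives, negatives)
print(len(train), len(val), len(test))  # 4259 1065 100
```

The resulting set sizes match the training/verification/test totals reported in Table 1.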

Table 1

Case composition of the pneumoconiosis data set*

| Data set | Negative | Stage Ⅰ | Stage Ⅱ | Stage Ⅲ | Positive total | Total |
|---|---|---|---|---|---|---|
| Training set | 3224 | 704 | 227 | 104 | 1035 | 4259 |
| Verification set | 806 | 176 | 57 | 26 | 259 | 1065 |
| Test set | 50 | 34 | 11 | 5 | 50 | 100 |
| Total | 4080 | 914 | 295 | 135 | 1344 | 5424 |

Note: *The grading standard for pneumoconiosis is based on GBZ 70-2015, promulgated by China.

 

Table 2

Definition of pneumoconiosis stages based on GBZ 70-2015

| Stage | Overall profusion | Lung zone distribution |
|---|---|---|
| Stage Ⅰ | 1 | At least 2 zones |
| Stage Ⅱ | 2 | More than 4 zones |
| | 3 | 4 zones |
| Stage Ⅲ | – | Large opacity exceeding 2 cm × 1 cm |
| | 3 | More than 4 zones, with small opacities clustering or merging into large opacities |

 

Establishment of a CAD model for pneumoconiosis

A DL algorithm based on the deep residual network ResNet101 was used to establish a CAD system for diagnosing pneumoconiosis. Model parameters were set as follows: the pre-training learning rate was 0.001 (using the SGD algorithm), decreased to 0.0001 after 6000 iterations, and a total of 10000 epochs were used with a batch size of 64. All images were first de-identified, and a U-Net was used to extract the lung fields on both sides, removing invalid areas and resizing each image to a 256 × 256 pixel matrix [23]. Images were then converted to JPG format and the training data were input to the model. The verification set was used to determine the effectiveness of the model and gradually improve the accuracy of the model output through continuous iterative optimization.
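The paper does not state the implementation framework; as an illustrative sketch (parameter names assumed, not the authors' code), the reported hyperparameters and step learning-rate schedule can be written as:

```python
# Hyperparameters as reported in the text (names are illustrative).
TRAIN_CONFIG = {
    "backbone": "ResNet101",
    "optimizer": "SGD",
    "batch_size": 64,
    "epochs": 10000,
    "input_size": (256, 256),  # U-Net-extracted lung fields, resized
}

def learning_rate(iteration, base_lr=0.001, drop_at=6000, low_lr=0.0001):
    """Step schedule: the pre-training rate of 0.001 drops to
    0.0001 after 6000 iterations."""
    return base_lr if iteration < drop_at else low_lr

print(learning_rate(0), learning_rate(6000))  # 0.001 0.0001
```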

Image reading

Six physicians were divided into junior doctor (JD) and senior doctor (SD) groups based on their diagnostic experience. The three doctors in the SD group each had more than 10 years of experience in diagnosing pneumoconiosis. Those in the JD group each had 3–5 years of experience. These participants conducted independent diagnoses on the test set and a comprehensive diagnosis with reference to the CAD results, following the GBZ70-2015 guidelines. The interval between the two diagnoses was 10 days. Chest radiograph images were interpreted using a PACS workstation and displayed on Jusha 5 M medical monitors.

Statistical analysis

The MedCalc 15.2.2 software package (MedCalc Software bvba, Ostend, Belgium; http://www.medcalc.org; 2015) was used for statistical analysis. A receiver operating characteristic (ROC) curve was used to evaluate the diagnostic efficacy of the CAD system for diagnosing pneumoconiosis [24], and area under the curve (AUC), sensitivity, and specificity values were calculated. A McNemar test was used to compare diagnostic sensitivity and specificity for clinicians with and without the use of the CAD software, with p < 0.05 considered statistically significant. A kappa test was used to evaluate consistency between the CAD results, the physician diagnoses (with and without CAD), and the gold standard. Consistency was considered poor, moderate, and good for kappa values of 0.01–0.39, 0.40–0.74, and 0.75–1.00, respectively.
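For illustration, the sensitivity, specificity, and Cohen's kappa used here can be re-derived from a 2×2 contingency table against the gold standard; a minimal Python sketch (not the MedCalc implementation), checked against the junior-doctor counts without CAD reported in Table 4:

```python
def metrics_from_counts(tp, fn, fp, tn):
    """Sensitivity, specificity, and Cohen's kappa from a 2x2 table
    (rows: reader +/-, columns: gold standard +/-)."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)           # true-positive rate
    specificity = tn / (tn + fp)           # true-negative rate
    po = (tp + tn) / n                     # observed agreement
    pe = ((tp + fp) * (tp + fn) +          # agreement expected by chance
          (tn + fn) * (tn + fp)) / n**2
    kappa = (po - pe) / (1 - pe)
    return sensitivity, specificity, kappa

# Junior doctors without CAD (Table 4): TP=43, FN=7, FP=16, TN=34
sens, spec, kappa = metrics_from_counts(43, 7, 16, 34)
print(round(sens, 2), round(spec, 2), round(kappa, 2))  # 0.86 0.68 0.54
```

These values reproduce the junior-doctor sensitivity, specificity, and kappa reported in the Results.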

Results

CAD diagnostic efficiency for pneumoconiosis

The results of ROC analysis showed that the AUC for CAD-based pneumoconiosis diagnosis was high (see Figure 2 and Table 3), and diagnostic sensitivity and specificity were also high and consistent with the gold standard. Classification results for the physicians, with and without the use of CAD, are provided in Tables 4 and 5. Including CAD increased the diagnostic sensitivity of the JD group from 0.86 to 0.98 and its specificity from 0.68 to 0.86; these differences were statistically significant. Diagnostic sensitivity for the SD group increased from 0.94 to 0.98 and specificity from 0.90 to 0.94, but these differences were not statistically significant.

Table 3

ROC results for independent CAD diagnoses

| Cutoff value | AUC (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) |
|---|---|---|---|
| 0.53 | 0.99 (0.97–1.00) | 0.94 (0.82–0.98) | 0.96 (0.85–0.99) |

Note: ROC: receiver operating characteristic; AUC: area under the curve.


Table 4

Diagnostic results with and without the use of CAD

| JDs | GS − | GS + | Total |
|---|---|---|---|
| − | 34 | 7 | 41 |
| + | 16 | 43 | 59 |
| Total | 50 | 50 | 100 |

| JDs + CAD | GS − | GS + | Total |
|---|---|---|---|
| − | 43 | 1 | 44 |
| + | 7 | 49 | 56 |
| Total | 50 | 50 | 100 |

| SDs | GS − | GS + | Total |
|---|---|---|---|
| − | 45 | 3 | 48 |
| + | 5 | 47 | 52 |
| Total | 50 | 50 | 100 |

| SDs + CAD | GS − | GS + | Total |
|---|---|---|---|
| − | 47 | 1 | 48 |
| + | 3 | 49 | 52 |
| Total | 50 | 50 | 100 |

Note: JD: junior doctor; SD: senior doctor; GS: gold standard.

 

Table 5

A comparison of sensitivity and specificity with and without the use of CAD

| | Sensitivity, without CAD (95% CI) | Sensitivity, with CAD (95% CI) | P | Specificity, without CAD (95% CI) | Specificity, with CAD (95% CI) | P |
|---|---|---|---|---|---|---|
| JDs | 0.86 (0.76–0.96) | 0.98 (0.94–1.00) | 0.03 | 0.68 (0.55–0.81) | 0.86 (0.76–0.96) | 0.02 |
| SDs | 0.94 (0.87–1.00) | 0.98 (0.94–1.00) | 0.50 | 0.90 (0.82–0.98) | 0.94 (0.87–1.00) | 0.68 |

Note: JD: junior doctor; SD: senior doctor.

 

Kappa test results showed that including CAD increased the diagnostic consistency between the JD group and the gold standard from moderate to good, as the kappa value increased from 0.54 to 0.84. The consistency of the SD group improved slightly, with the kappa value increasing from 0.84 to 0.92 (see Table 6).

Table 6

A comparison of diagnostic consistency with and without the use of CAD

| Doctors | Kappa, without CAD (95% CI) | Kappa, with CAD (95% CI) |
|---|---|---|
| JDs/GS | 0.54 (0.37–0.70) | 0.84 (0.73–0.94) |
| SDs/GS | 0.84 (0.73–0.94) | 0.92 (0.84–0.99) |

Note: JD: junior doctor; SD: senior doctor; GS: gold standard.

These results demonstrate that CAD can improve the consistency between diagnostic results and a gold standard, particularly for junior physicians whose performance approached the level of more senior physicians. Specificity improved significantly for the JD group (p<0.05) and their diagnostic consistency improved from moderate to good. Applying CAD also improved the sensitivity, specificity, and consistency for the SD group, though the results were not statistically significant.

Discussion

The diagnosis of pneumoconiosis has conventionally relied on manual interpretation, which can be subjective and inconsistent. This study made full use of the advantages offered by machine learning technology, developing a self-learning training model based on ResNet101 for the representation of pneumoconiosis in chest radiographs. ResNet101 reformulates each layer as learning a residual function with reference to the layer's inputs, and its identity shortcut connections add no extra network parameters or computation. In this case, model training speed was increased and classification accuracy was improved [20].

Pneumoconiosis is characterized in radiographic images by a diffuse distribution of small opacities of varying size. The shape, size, quantity, and distribution of these structures are difficult to describe accurately; as such, there is a ‘semantic gap’ between low-level image features and high-level medical terms [25]. This study used the entire lung field on both sides of the image as the research object, including the diffuse structures and some surrounding tissue. There was no need to perform complex feature extraction on specific object representations, thereby avoiding this semantic gap. The algorithm extracted characteristic information for pneumoconiosis lesions and improved the accuracy of model classification. The collected data were used to train a CAD system, based on a ResNet101 model, for automated pneumoconiosis diagnosis.

Output results for 100 chest X-rays in the test group demonstrate the high diagnostic efficiency provided by the proposed CAD model. The AUC, sensitivity, specificity, and consistency were high when compared with a gold standard. In a future study, we will pursue various options for increasing the intuitive nature of predicted results [26]. For example, heat maps have been used in previous studies to visualize a portion of the test image [9, 12]. Pixels with larger values indicate a greater contribution to the result, which could help physicians to better understand the conclusions of automated diagnoses.

Our results showed that the proposed CAD system improved the overall sensitivity, specificity, and consistency of pneumoconiosis diagnoses for physicians with varying levels of experience. Specificity increased for all three physicians in the JD group and sensitivity improved for two of the three. Sensitivity measures the probability of correctly diagnosing positive cases in a test group, while specificity measures the probability of correctly diagnosing negative cases. High sensitivity could identify patients with pneumoconiosis, providing for earlier treatment and an improved quality of life. High specificity could screen out patients who do not require further examination, thereby reducing caseloads. Junior physicians are generally more prone to misdiagnosis due to a lack of experience. However, the proposed CAD system effectively improved both sensitivity and specificity for the JD group, producing an accuracy comparable to that of more experienced clinicians. Although the senior physicians were more accurate before the inclusion of CAD, the proposed system increased their sensitivity and specificity as well, though the difference was not statistically significant.

The presented study does have certain limitations. For example, although the proposed CAD system, based on a deep residual neural network (ResNet101), can achieve high diagnostic accuracy and consistency for independent diagnoses, pneumoconiosis requires a comprehensive diagnosis that relies not only on chest X-rays but also on occupational history, epidemiology, and clinical manifestations. The inclusion of qualitative factors such as these in computer-based decision making is a topic that requires further research. In addition, the number of patients included in this study was small and the results exhibited some deviation; in future work, the amount of data will be increased for further analysis. Finally, this study is only a preliminary investigation of whether CAD can diagnose pneumoconiosis. Research on distinguishing the different stages of pneumoconiosis should be conducted in the future.

Conclusion

In summary, the results presented in this study demonstrate that CAD can effectively improve the sensitivity, specificity, and consistency of pneumoconiosis diagnoses, particularly for junior physicians. As such, the proposed model could be a powerful new tool for reducing diagnostic subjectivity and inter-clinician variability.

Abbreviations

CAD: Computer-aided diagnosis

SD: Senior doctor

JD: Junior doctor

GS: Gold standard

ROC: Receiver operator characteristic

AUC: Area under curve

95% CI: 95% Confidence interval

Declarations

Ethics approval and consent to participate:

This study was approved by the Ethics Committee of the National Center for Occupational Safety and Health, NHC, China (ID: 2018003).

Consent for publication:

Not applicable.

Availability of data and material:

The dataset for the current study is not publicly available, for the purpose of protecting the privacy of the participants. The data are available from the corresponding author upon reasonable request.

Competing interests:

Authors declare no conflicts of interest.

Funding:

This study was funded by the National Center for Occupational Safety and Health, NHC (No. 2020025), for data collection and analysis.

Authors’ contributions

LZ and WH designed the research. ZW, QQ, JZ, and CD performed the data collection. ZW, QQ, and JZ performed the data analysis. ZW wrote the manuscript. LZ, JZ, QQ, and CD revised the manuscript. All authors have reviewed the final version of the manuscript and approved it for publication.

Acknowledgements:

The authors wish to thank Dr. Xiaopeng Wei, Yingjie Wang, and Ruizhen Liu for their data collection support, and Dr. Wei Wei for her statistical advice.

References

  1. Wang B, Wu C, Kang L, Huang L, Pan W: What are the new challenges, goals, and tasks of occupational health in China's Thirteenth Five-Year Plan (13th FYP) period?Journal of occupational health 2018, 60(3):208-228.
  2. Occupational Lung Disease Group of Labor Hygiene and Occupational Diseases Branch of Chinese Preventive Medicine Association: Consensus of Chinese experts on pneumoconiosis treatment (2018). Journal of occupational health 2018, 35(8):677-689.
  3. Muszyńska-Graca M, Dąbkowska B, Brewczyński PZ: [Guidelines for the use of the International Classification of Radiographs of Pneumoconioses of the International Labour Office (ILO): Substantial changes in the current edition]. Medycyna pracy 2016, 67(6):833-837.
  4. ILO: Guidelines for the use of the ILO international classification of radiographs of pneumoconioses. Geneva: International Labour Office 2011.
  5. Wang H, Li T: [A systematic review of digital radiography for the screening and recognition of pneumoconiosis]. Zhonghua lao dong wei sheng zhi ye bing za zhi = Zhonghua laodong weisheng zhiyebing zazhi = Chinese journal of industrial hygiene and occupational diseases 2014, 32(5):327-334.
  6. Cai ZC: [Comprehension of GBZ 70-2015 《Diagnosis of Occupational Pneumoconiosis》]. Zhonghua lao dong wei sheng zhi ye bing za zhi = Zhonghua laodong weisheng zhiyebing zazhi = Chinese journal of industrial hygiene and occupational diseases 2016, 34(11):866-867.
  7. Yu C, Qi F, Li L, Li DH: [A study on the variation in classifying chest radiographs for pneumoconiosis]. Zhonghua lao dong wei sheng zhi ye bing za zhi = Zhonghua laodong weisheng zhiyebing zazhi = Chinese journal of industrial hygiene and occupational diseases 2004, 22(5):336-339.
  8. Tang A, Tam R, Cadrin-Chênevert A, Guest W, Chong J, Barfett J, Chepelev L, Cairns R, Mitchell JR, Cicero MD et al: Canadian Association of Radiologists White Paper on Artificial Intelligence in Radiology. Canadian Association of Radiologists journal = Journal l'Association canadienne des radiologistes 2018, 69(2):120-135.
  9. Rajpurkar P, Irvin J, Bagul A, Ding D, Duan T, Mehta H, Yang B, Zhu K, Laird D, Ball RL: MURA: Large Dataset for Abnormality Detection in Musculoskeletal Radiographs. 2017.
  10. Rajpurkar P, Irvin J: Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists. 2018, 15(11):e1002686.
  11. Cicero M, Bilbily A, Colak E, Dowdell T, Gray B, Perampaladas K, Barfett J: Training and Validating a Deep Convolutional Neural Network for Computer-Aided Detection and Classification of Abnormalities on Frontal Chest Radiographs. Investigative radiology 2017, 52(5):281-287.
  12. Lakhani P, Sundaram B: Deep Learning at Chest Radiography: Automated Classification of Pulmonary Tuberculosis by Using Convolutional Neural Networks. Radiology 2017, 284(2):574-582.
  13. Okumura E, Kawashita I, Ishida T: Computerized Classification of Pneumoconiosis on Digital Chest Radiography Artificial Neural Network with Three Stages. Journal of digital imaging 2017, 30(4):413-426.
  14. Okumura E, Kawashita I, Ishida T: Computerized analysis of pneumoconiosis in digital chest radiography: effect of artificial neural network trained with power spectra. Journal of digital imaging 2011, 24(6):1126-1132.
  15. Yu P, Xu H, Zhu Y, Yang C, Sun X, Zhao J: An automatic computer-aided detection scheme for pneumoconiosis on digital chest radiographs. Journal of digital imaging 2011, 24(3):382-393.
  16. Jakhar D, Kaur I: Artificial intelligence, machine learning and deep learning: definitions and differences. 2020, 45(1):131-132.
  17. LeCun Y, Bengio Y, Hinton G: Deep learning. Nature 2015, 521(7553):436-444.
  18. van Ginneken B: Fifty years of computer analysis in chest imaging: rule-based, machine learning, deep learning. Radiological physics and technology 2017, 10(1):23-32.
  19. Giger ML: Machine Learning in Medical Imaging. Journal of the American College of Radiology : JACR 2018, 15(3 Pt B):512-520.
  20. He K, Zhang X, Ren S, Sun J: Deep Residual Learning for Image Recognition. In: arXiv e-prints. 2015: arXiv:1512.03385.
  21. Fulton LV, Dolezel D, Harrop J, Yan Y, Fulton CP: Classification of Alzheimer's Disease with and without Imagery using Gradient Boosted Machines and ResNet-50. Brain sciences 2019, 9(9).
  22. Yu X, Kang C, Guttery DS, Kadry S, Chen Y, Zhang YD: ResNet-SCDA-50 for breast abnormality classification. IEEE/ACM transactions on computational biology and bioinformatics 2020.
  23. Ronneberger O, Fischer P, Brox T: U-Net: Convolutional Networks for Biomedical Image Segmentation. 2015.
  24. Obuchowski NA, Bullen JA: Receiver operating characteristic (ROC) curves: review of methods with applications in diagnostic medicine. Physics in medicine and biology 2018, 63(7):07tr01.
  25. Tang J, Zha ZJ, Tao D, Chua TS: Semantic-gap-oriented active learning for multilabel image annotation. IEEE transactions on image processing : a publication of the IEEE Signal Processing Society 2012, 21(4):2354-2360.
  26. Hu B-G, Wang Y, Yang S, Qu H: How to Add Transparency to Artificial Neural Networks? (in Chinese). Pattern Recognition and Artificial Intelligence 2007, 20:pp. 72-83.