The “Ground-Glass” Mimicker in The Pandemic: A Novel Radiomics-Based Machine Learning Model Differentiates COVID-19 Pneumonia from Acute Non-COVID-19 Lung Disease

Ground-glass opacities (GGOs) are a non-specific CT finding observed in the early phase of COVID-19 pneumonia. However, GGOs are also seen in other acute interstitial and alveolar lung diseases, which makes the differential diagnosis challenging. In this proof-of-concept study, we aimed to differentiate COVID-19 pneumonia presenting with GGOs from acute non-COVID-19 lung disease using a novel radiomics-based model in patients who underwent a high-resolution CT (HRCT) scan at hospital admission during the first pandemic peak in Italy. HRCT scans of 28 patients with RT-PCR-diagnosed COVID-19 pneumonia (COVID) and 30 patients with acute non-COVID-19 lung disease (nCOVID) were retrospectively included. All patients showed GGOs as the predominant CT pattern. Two readers, blinded to the final diagnosis, independently segmented GGOs on CT scans using a semi-automated approach, and radiomic features were extracted from the segmented images. Partial least square (PLS) regression was used as the multivariate machine-learning algorithm. A leave-one-out nested cross-validation was implemented to optimize the hyperparameter of PLS and to assess model generalization. The diagnostic performance of the radiomic model in differentiating COVID from nCOVID lung disease was assessed through receiver operating characteristic (ROC) analysis. The radiomics-based machine learning model differentiated COVID from nCOVID with an AUC = 0.868 (p = 4.2·10⁻⁷). After careful prospective evaluation in larger multicentric studies, this model may help radiologists to rule out COVID-19 pneumonia, thus improving COVID-19 triage in epidemic areas.


Introduction
COVID-19 is a viral infectious disease caused by SARS-CoV-2, which has gradually spread worldwide since December 2019 (1,2). The clinical presentation is extremely varied and ranges from asymptomatic or paucisymptomatic infection to severe pneumonia with respiratory failure (3,4). COVID-19 pneumonia is caused by partial filling of airspaces, interstitial thickening, partial collapse of alveoli, increased capillary blood volume, or a combination of these elements (5). This histologic scenario is reflected in different CT patterns of presentation, including GGOs with or without superimposed septal thickening ("crazy paving" pattern), and parenchymal consolidations (6)(7)(8)(9)(10). However, GGOs are a non-specific CT finding.
In fact, they can be found in the early, exudative phase of COVID-19 pneumonia as well as in interstitial and alveolar diseases, such as pulmonary edema, alveolar hemorrhage, infectious pneumonia, hypersensitivity pneumonia, and other acute lung diseases (11)(12)(13). Although the context of the current pandemic and the previous CT findings can be indicative of COVID-19 pneumonia, the differential diagnosis remains a challenge (14,15). Integration with clinical data, laboratory tests, and lung biopsy may be necessary when the initial diagnosis is inconclusive (16). Radiomics is a recently developed method that allows the extraction of large amounts of quantitative features from medical images (17). A typical radiomic analysis includes evaluating the size, shape, and textural features containing spatial information on pixel or voxel intensity distribution and patterns. Radiomic features can be further integrated into machine learning models with the aim of improving diagnosis and patient management. This approach was recently proposed in oncology with promising results (18)(19)(20)(21). For example, a framework combining radiomics and deep learning recently predicted high-grade lung adenocarcinoma (19). Moreover, a radiomic model was shown to predict the tumor invasiveness of pulmonary adenocarcinomas appearing as ground-glass nodules (18,21). Radiomics-based models were recently proposed to improve the diagnosis of COVID-19 on chest CT images (22)(23)(24). To the best of our knowledge, no studies have investigated the possibility of discriminating GGOs due to COVID-19 pneumonia from those due to non-COVID-19 acute lung disease.
In this study, we aimed to use a novel radiomics-based model to differentiate, on CT scans acquired at hospital admission during the first pandemic peak, COVID-19 pneumonia presenting with GGOs from acute non-COVID-19 lung disease.

Materials And Methods
Study Population. The study received formal approval from the Ethical Committee of the University G. D'Annunzio of Chieti-Pescara, Italy; informed consent was waived by the same ethics committee that approved the study (Comitato Etico per la Ricerca Biomedica delle Province di Chieti e Pescara e dell'Università degli Studi "G. d'Annunzio" di Chieti e Pescara, Italy). The study was conducted according to the ethical principles laid down by the latest version of the Declaration of Helsinki. We retrospectively included a total of 120 consecutive patients (COVID) diagnosed with SARS-CoV-2 infection based on RT-PCR who underwent a clinically indicated high-resolution chest CT (HRCT) between March 2020 and April 2020. Patients were included if they met all the following criteria: (a) GGO as the predominant feature on chest CT scans, (b) baseline HRCT performed at hospital admission. Another set of 310 patients (nCOVID) with clinically indicated HRCT for acute respiratory disease performed between August 2019 and April 2020 was retrospectively enrolled in the study. For this second set, patients were included if they met all the following criteria: (a) GGO as the predominant feature on chest CT scans, (b) availability of the final diagnosis (clinical, laboratory, or pathology). In the first set (COVID), we excluded 92 patients: 14 had severe respiratory artefacts and 78 had a non-predominant GGO pattern. In the second set (nCOVID), we excluded 280 patients: 32 had severe respiratory artefacts, 210 had a non-predominant GGO alteration, and 38 were treated in another center so that the final diagnosis was not available. The final study population was composed of 28 COVID and 30 nCOVID patients, for a total of 58 patients (Fig. 1).
CT Protocol. Non-enhanced chest CT scans were performed in a supine position, during inspiratory breath-hold, from the apex to the lung bases, with a 128-slice multi-detector CT device (Somatom Definition AS, Siemens Healthineers, Germany). The field of view (FOV) ranged from 35 to 40 cm according to body size. The display window settings were width (W) 1200-1600 HU and level (L) between −600 and −750 HU. The main scan parameters were: tube voltage = 120 kVp, automatic tube current modulation (30-70 mAs), pitch = 0.99-1.22 mm, matrix = 512 × 512. The images were reconstructed with a slice thickness of 0.625-1.250 mm and the same increment, using a high-spatial-frequency reconstruction algorithm (B50, I50).
Radiomics Analysis. A whole-volume semi-automated delineation of the GGOs was independently performed by two senior radiology residents (C.V. and M.V.) using an open-source medical image computing platform, 3DSlicer Version 4.8 (www.3dslicer.org) (Fig. 2a). The GGO threshold was manually set between −1350 and −700 HU using the "threshold-effect" tool (9,25,26). If necessary, the segmentation was further manually corrected by each reader in order to exclude automatically segmented pixels beyond the GGOs.
Moreover, the lungs were automatically extracted via Convolutional Neural Network (CNN) algorithms to create a binary mask (27). Then, a logical "and" between these masks and the segmentations obtained by the radiology residents was performed (using "3dcalc") to exclude automatically segmented pixels beyond the lungs, thus obtaining the final ROIs (28). All the ROIs were then checked by a radiologist with more than 10 years of experience in chest imaging (M.M.) to verify the correct position and correspondence with the underlying CT images. The reproducibility of the features extracted from the two independent segmentation sets of the 58 CT scans (28 COVID, 30 nCOVID) was assessed. The extraction of the radiomic features was conducted using PyRadiomics, a flexible open-source platform capable of extracting a large panel of engineered features from medical images; this radiomic quantification platform enables the standardization of both feature definitions and image processing (29). To avoid data heterogeneity bias, HRCT images were resampled to 2 × 2 × 2 mm. For each ROI, ten built-in filters (Original, wavelet, Laplacian of Gaussian (LoG), square, square root, logarithm, exponential, Gradient, LBP2D, LBP3D) were applied and seven feature classes (first order statistics, shape descriptors, glcm, glrlm, ngtdm, gldm, glszm) were calculated, for a total of 1409 radiomic features.
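The thresholding and masking steps described above can be illustrated with a minimal sketch. It is not the pipeline used in the study (which relied on 3DSlicer and "3dcalc"); the toy 2D slice, the lung mask, and the function names `threshold_mask` and `mask_and` are hypothetical, and only the −1350 to −700 HU range comes from the text.

```python
# Illustrative sketch only: a toy 2D "slice" of Hounsfield units (hypothetical values).
GGO_MIN_HU, GGO_MAX_HU = -1350, -700  # threshold range quoted in the text


def threshold_mask(hu_image, lo=GGO_MIN_HU, hi=GGO_MAX_HU):
    """Return a binary mask: 1 where lo <= HU <= hi, else 0."""
    return [[1 if lo <= value <= hi else 0 for value in row] for row in hu_image]


def mask_and(mask_a, mask_b):
    """Voxel-wise logical AND of two binary masks with identical shape."""
    same_rows = len(mask_a) == len(mask_b)
    if not same_rows or any(len(a) != len(b) for a, b in zip(mask_a, mask_b)):
        raise ValueError("masks must have the same shape")
    return [[a & b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(mask_a, mask_b)]


hu_slice = [
    [-1000, -800, -750, 40],
    [-900, -720, -300, 30],
    [-1400, -710, -705, -1000],
]
# Hypothetical lung mask; in the study this came from a CNN-based lung extraction.
lung_mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
]

reader_mask = threshold_mask(hu_slice)        # thresholded candidate voxels
final_roi = mask_and(reader_mask, lung_mask)  # keep only voxels inside the lungs
```

In this toy example the two voxels at −1000 HU that lie outside the lung mask pass the threshold but are removed by the intersection, which is the purpose of the logical "and" step.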
Machine Learning Approach: Partial Least Square (PLS) Regression
A machine learning (i.e., multivariate) approach was implemented to exploit the multidimensionality of the radiomic features (Fig. 2b). When trying to predict an output based on these features, the information redundancy (i.e., the high correlation among radiomic features), coupled with a low number of independent samples (i.e., subjects), makes the prediction sensitive to noise and prone to poor generalization (30,31). To address this problem, two main approaches were implemented. The first approach was to reduce the number of features by selecting only those that were highly repeatable (r > 0.95) between the two masks (delineated by the two radiologists). The second approach was to implement a machine learning framework based on a linear regression analysis that employed a space dimension reduction procedure, namely the partial least square (PLS) regression (30,32,33). The PLS was used to differentiate COVID from nCOVID patients. PLS allows the construction of regression equations by reducing the predictors to a smaller set of uncorrelated components, i.e., linear combinations of the original predictors, and performing regression on these components (33,34). The goal of PLS is to identify components that capture most of the information in the independent variables (e.g., linear combinations of all radiomic features) that is useful for predicting the dependent variable (e.g., COVID vs. nCOVID). PLS can be regarded as the supervised learning version of Principal Component Analysis (PCA) (35,36). The learning process (fitting) of the PLS algorithm delivers regression loadings that can be used to retrieve the weights (β-weights) linking the original independent variables with the dependent variable, depicting the importance and sign of the original variables in the prediction. The PLS has one hyperparameter to be optimized, namely the number of uncorrelated components to be used in the regression.
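The first of the two approaches above, the repeatability filter, can be sketched as follows. This is a minimal illustration and not the authors' Matlab code: it assumes that r denotes the Pearson correlation computed across patients between the two readers' feature values (the text does not specify the type of correlation), and the feature names and values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant feature: treated here as not repeatable
    return sxy / math.sqrt(sxx * syy)


def repeatable_features(reader_a, reader_b, threshold=0.95):
    """Keep features whose inter-reader correlation exceeds the threshold.

    reader_a and reader_b map a feature name to its per-patient values,
    in the same patient order.
    """
    return [
        name
        for name in reader_a
        if pearson_r(reader_a[name], reader_b[name]) > threshold
    ]


# Hypothetical values for five patients, one dictionary per reader.
reader_a = {
    "feat_stable": [1.0, 2.0, 3.0, 4.0, 5.0],
    "feat_noisy": [1.0, 2.0, 3.0, 4.0, 5.0],
}
reader_b = {
    "feat_stable": [1.1, 2.0, 2.9, 4.2, 5.0],
    "feat_noisy": [3.0, 1.0, 4.0, 2.0, 5.0],
}

selected = repeatable_features(reader_a, reader_b)
```

Here only the hypothetical `feat_stable` is retained, because the two readers' values for `feat_noisy` correlate at r = 0.5.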
Nested cross-validation (nCV) makes it possible to perform hyperparameter optimization and to evaluate the generalization performance of the procedure while minimizing the loss of samples during model training (37). In nCV, data are divided into folds and the model is trained on all data except one fold in an iterative, nested manner. The hyperparameter optimization and performance assessment are performed on the remaining fold and averaged across iterations. If the number of folds equals the number of samples (one fold per sample), the procedure is defined as leave-one-out nCV (38,39). This approach is highly suited for medical applications where each sample represents one subject. In this work, a leave-one-out nCV was implemented to optimize the number of PLS components and to assess the PLS generalization performance. The β-weights of the PLS analysis were obtained by running the algorithm on the complete dataset with the optimal number of components delivered by the nCV analysis. The machine learning analyses were implemented in Matlab.
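A minimal sketch of PLS regression combined with leave-one-out nested cross-validation is given below. It is an illustration under stated assumptions, not the authors' Matlab implementation: a single-response PLS with NIPALS-style deflation is assumed, the inner loop selects the number of components by minimizing the leave-one-out squared error on the training data only (the selection criterion is not stated in the text), and the toy data and all function names are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_comp):
    """Fit single-response PLS with up to n_comp components (NIPALS-style deflation)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered predictors
    f = [v - y_mean for v in y]                                # centered response
    comps = []
    for _ in range(n_comp):
        w = [dot([row[j] for row in E], f) for j in range(p)]  # covariance direction
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]  # component scores
        tt = dot(t, t)
        if tt < 1e-12:
            break
        load = [dot([row[j] for row in E], t) / tt for j in range(p)]
        q = dot(f, t) / tt
        E = [[row[j] - t[i] * load[j] for j in range(p)] for i, row in enumerate(E)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x):
    """Predict one sample by applying the stored components sequentially."""
    x_mean, y_mean, comps = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = dot(e, w)
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat


def loo_mse(X, y, n_comp):
    """Leave-one-out mean squared error for a given number of components."""
    total = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_comp)
        total += (pls1_predict(model, X[i]) - y[i]) ** 2
    return total / len(X)


def nested_loo(X, y, grid):
    """Outer LOO gives out-of-sample scores; inner LOO (training data only) picks n_comp."""
    scores, chosen = [], []
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        best = min(grid, key=lambda k: loo_mse(X_tr, y_tr, k))
        scores.append(pls1_predict(pls1_fit(X_tr, y_tr, best), X[i]))
        chosen.append(best)
    return scores, chosen


# Hypothetical toy data: two collinear informative features plus one noise feature.
labels = [i % 2 for i in range(12)]
features = [[2.0 * lab, 1.0 * lab, ((7 * i) % 5 - 2) * 0.1] for i, lab in enumerate(labels)]
scores, chosen = nested_loo(features, [float(lab) for lab in labels], grid=[1, 2, 3])
predicted = [1 if s > 0.5 else 0 for s in scores]
```

Because the held-out subject never enters either the inner selection or the final fit of its outer iteration, the resulting scores are out-of-training-sample predictions of the kind that can be passed to a ROC analysis.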

Statistical Analysis
The COVID vs nCOVID classification performance was assessed through Receiver Operating Characteristic (ROC) analysis comparing the inferred (out-of-training-sample) group with the true group. COVID patients were attributed to the "positive" group, whereas nCOVID patients were attributed to the "negative" group. The ROC analysis was also performed on randomly shuffled group labels to simulate the null hypothesis and evaluate its confidence interval (repeated 10⁶ times). The ROC analysis delivered an Area Under the Curve (AUC), which could be transformed into a z-score for assessing its statistical significance by using the randomly shuffled group labels. The statistical analysis was performed in Matlab.
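The label-shuffling procedure described above can be sketched as follows. This is a minimal illustration rather than the authors' Matlab code: the scores and labels are hypothetical, 2000 shuffles are used instead of 10⁶ to keep the example fast, and converting the z-score to a two-sided p-value through a normal approximation is an assumption not stated in the text.

```python
import math
import random
import statistics


def auc(scores, labels):
    """AUC via the Mann-Whitney identity: P(positive score > negative score), ties count 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_z(scores, labels, n_perm=2000, seed=0):
    """Compare the observed AUC with the AUC distribution under shuffled labels."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the score-label association (null hypothesis)
        null.append(auc(scores, shuffled))
    z = (observed - statistics.mean(null)) / statistics.stdev(null)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # assumption: normal approximation
    return observed, z, p_two_sided


# Hypothetical out-of-sample scores; 1 = COVID ("positive"), 0 = nCOVID ("negative").
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.35, 0.1]
toy_labels = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

observed_auc, z_score, p_value = permutation_z(toy_scores, toy_labels)
```

Shuffling preserves the number of positive and negative labels, so the null distribution reflects the same group sizes as the observed analysis.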

Results
The majority of patients included in the study were male (n = 34, 59%), and the median age was 66 years (interquartile range 55-81). Out of the total patient population (n = 58), 28 (48%) were assigned to the COVID group, and 30 (52%) to the nCOVID group. The nCOVID group (n = 30) included four patients with Cytomegalovirus (CMV) pneumonia, two with pulmonary edema, five with Acute Respiratory Distress Syndrome (ARDS), eight with Organizing Pneumonia, three with Pneumocystis jirovecii pneumonia, two with Influenza A pneumonia, two with Legionella pneumonia, three with alveolar hemorrhage and one with hypersensitivity pneumonia (Table 1). A total of 1409 radiomic features were extracted. Of these, 153 features showed an inter-reader correlation of r > 0.95 and were used for further analysis. With these 153 radiomic features, an AUC = 0.868 was obtained (z = 5.1, p = 4.2·10⁻⁷, Fig. 3). The estimated optimal number of PLS components, evaluated within the nCV framework, was 7. The weights of the PLS (β-weights) are shown in Fig. 4. Figure 4a reports the distribution of the β-weights for the radiomic features used in the analysis, whereas Fig. 4b depicts the β-weights associated with the top 5% of the selected features with the largest β-weights in magnitude, i.e., those most impacting the prediction. Of the top 5% features, 5 (wavelet_LLH_glrlm_GrayLevelNonUniformity, wavelet_LHH_glcm_DifferenceVariance, wavelet_LHH_glrlm_GrayLevelVariance, wavelet_HLH_glcm_DifferenceVariance, wavelet_HHL_glrlm_RunEntropy) were associated with the glrlm and glcm texture matrices (second-order features), and 2 (wavelet_LLH_firstorder_Skewness, lbp_2D_firstorder_10Percentile) were related to the image intensity distribution (first-order features). All second-order features except one had a negative weight, meaning that COVID-19 patients (labelled as 1 in the classification algorithm) tended to have a more homogeneous texture.
Of the two first-order features, one had a positive weight (the skewness, larger in COVID-19 patients) and one had a negative weight (the 10th percentile, smaller in COVID-19 patients), indicating that the COVID group, although having a distribution of image intensities with average values equal to those of the nCOVID group, had a larger occurrence of low-intensity pixels.

Discussion
Our results showed that a radiomics-based machine learning model has the potential to differentiate the GGOs due to COVID-19 pneumonia from the GGOs associated with non-COVID-19 lung diseases. Our study has two major clinical implications. First, due to its high sensitivity, CT has been proposed as the primary diagnostic tool in epidemic areas (13,40). Moreover, the WHO recommended chest imaging for the diagnostic workup of symptomatic patients with suspected COVID-19 if RT-PCR testing was unavailable or delayed, or when initial RT-PCR testing was negative but patients had high clinical suspicion of COVID-19 (41). In this context, the early identification of the GGO etiology could help to promptly adopt the appropriate management. For instance, patients admitted to the hospital for suspected COVID-19 pneumonia are temporarily placed in dedicated COVID-19 rule-out units, and they may experience a delay in care or intervention (14). In this scenario, chest CT, used as a surrogate for the early identification of COVID-19 pneumonia, may help triage, both by identifying an alternative diagnosis and by improving the patient selection for intensive/non-intensive care in case of clinical worsening. Second, the treatment could differ. For example, patients with organizing pneumonia are usually treated with corticosteroid therapy with the occasional addition of antibiotics (42). On the other hand, corticosteroids are recommended only in patients with severe and critical COVID-19 infection (43).
Other radiomics-based studies have already tested the differential diagnosis of GGOs; however, they were conducted in the oncology field and took place in the pre-COVID era (18,44). Interestingly, most of the features useful for the differential diagnosis concerned texture analysis and revealed a higher homogeneity in COVID-19 than in non-COVID-19 patients. To the best of our knowledge, our study is the first attempt to use radiomics to explore the differentiation of COVID-19 pneumonia with a predominant GGO pattern. Further studies are necessary to explore if and how texture heterogeneity is related to COVID-19 pathophysiology. In this regard, recent studies focused on lung oncology demonstrated that an increased degree of heterogeneity is associated with malignant lung cancer in GGO (18,21). We speculate that the higher homogeneity in COVID-19 pneumonia may reflect the degree of inflammatory infiltrate in the early stage of diffuse alveolar damage. In fact, GGOs are typically observed in the exudative phase of COVID-19 pneumonia, which is characterized by interstitial and alveolar oedema, hemorrhage, and hyaline membrane formation. With the progression of the disease (progressive stage), GGOs increase in density and heterogeneity, thus evolving into a more consolidative pattern or a "crazy paving" pattern (13).
Although our results are promising, there are some limitations. Firstly, our study included a relatively low number of patients. However, our investigation was intended as a proof-of-concept study, and our aim was to explore the feasibility of a radiomics-based model to differentiate COVID-19 pneumonia from other acute lung diseases sharing GGO as the predominant CT pattern. Moreover, since our study was focused on GGO, our inclusion criteria were necessarily strict, thus considering only COVID-19 patients with GGO as the predominant pattern. In addition, GGOs are typically found in the acute phase of the disease, which may not correspond with the timing of CT. In fact, CT timing is widely influenced by the patient's clinical condition and the treatment response. Secondly, compared to the number of patients included in the study, we analyzed a large number of predictive features. Nonetheless, the PLS algorithm, thanks to its ability to exploit the high collinearity of the different radiomic features, was able to deliver a high prediction performance. However, we expect that by reducing the ratio of features to subjects, the model's predictive performance may further improve. Thirdly, this is a retrospectively designed, single-center study. Further prospective and possibly multicentric studies are warranted to define a more standardized approach.
In this proof-of-concept study, we showed that a radiomics-based machine learning model can reliably differentiate COVID-19 pneumonia-related GGOs from GGOs due to other acute lung diseases. This approach was tested on CT exams obtained at the time of hospital admission during the pandemic peak.
Therefore, after a careful prospective evaluation in larger multicentric studies, it may help radiologists to rule out COVID-19 pneumonia, thus improving COVID-19 triage in epidemic areas.

Declarations
Ethical statement. This study was approved by the local ethics committee. The study used only pre-existing medical data; therefore, patient consent was waived.

Abbreviations
Data Availability. The datasets generated during and/or analyzed during the current study are not publicly available due to the clinical and confidential nature of the material but can be made available from the corresponding author on reasonable request.