MDA5 score--A Novel Lung Computed Tomography Scoring Method for Anti-Melanoma Differentiation-associated Gene 5-positive Dermatomyositis

Background: Anti-melanoma differentiation-associated gene 5 (MDA5)-positive dermatomyositis (DM)-associated interstitial lung disease (ILD) is a life-threatening disease with a 6-month mortality as high as 50%. Evaluation of pulmonary high-resolution computed tomography (HRCT) is crucial for prognostic prediction. This study aimed to develop a novel CT visual scoring method for anti-MDA5 positive DM-ILD. Methods: A prospective cohort of hospitalized patients with anti-MDA5 positive DM-ILD was analyzed and divided into a derivation dataset and a validation dataset. The primary outcome was six-month all-cause mortality from the time of admission. Three components of baseline pulmonary CT images, i.e., ground-glass opacity (GGO), consolidation and fibrosis, were semi-quantitatively scored in different lung lobes. A multivariable Cox proportional hazards model was used to identify independent prognostic risk factors and their coefficients, on which a scoring model was built. In addition, an artificial intelligence (AI) algorithm-based analysis and an idiopathic pulmonary fibrosis (IPF)-based scoring were conducted as comparators. The prediction accuracy of the different methods was measured and compared by the Harrell concordance index (C-index). Results: Overall, 173 eligible patients were included. A novel GGO and consolidation-weighted CT visual scoring model for anti-MDA5 positive DM-ILD, namely 'MDA5 score', was developed, with C-index values of 0.80 (95%CI 0.75-0.86) in the derivation dataset (n=116) and 0.84 (95%CI 0.71-0.97) in the validation dataset (n=57). The AI algorithm-based analysis, namely 'AI score', yielded a C-index of 0.78 (95%CI 0.72-0.84) for the derivation dataset and 0.77 (95%CI 0.64-0.90) for the validation dataset. 'MDA5 score' outperformed 'AI score' and the IPF-based scoring ('IPF score') with respect to discrimination and clinical usefulness.

Pulmonary high-resolution computed tomography (HRCT) is a mainstream imaging tool for identifying ILD and measuring its severity. A semi-quantitative HRCT scoring system has been applied for prognostic prediction in anti-MDA5 positive DM-ILD (5,6). However, that scoring system was originally designed for the evaluation of idiopathic pulmonary fibrosis (IPF) (7,8), and its applicability to a more rapidly progressive disease such as anti-MDA5 positive DM-ILD has not been extensively validated. For example, fibrosis components such as traction bronchiectasis (TBE) and honeycombing were weighted more heavily in this 'IPF score', whereas inflammation components, i.e., ground-glass opacity (GGO) and consolidation, were weighted less. Only recently was another, simplified scoring method for anti-MDA5 positive DM-ILD proposed, with two equally weighted components, GGO and fibrosis (9,10). Unfortunately, the sample size was small and the characteristic consolidation feature was overlooked; further independent, data-driven evaluation is warranted. It is also noteworthy that these visual scoring methods are time-consuming and observer-dependent.
Under the pressure of the coronavirus disease 2019 (COVID-19) pandemic, advanced machine learning-based technologies for quantitative pulmonary CT analysis have rapidly emerged, providing a promising solution for a more comprehensive and objective HRCT evaluation of diffuse lung disease (11-13).
Thus, the aims of the current study were to establish a novel pulmonary HRCT visual scoring method for predicting six-month mortality in a large single-center cohort of patients with anti-MDA5 positive DM and, in parallel, to explore quantitative imaging assessment of this disease by applying an artificial intelligence (AI) algorithm.

Materials And Methods
Patients
A prospective cohort of hospitalized patients with anti-MDA5 positive DM-ILD has been maintained at our center since April 2014. All patients initially fulfilled Bohan and Peter's criteria for DM or Sontheimer's criteria for clinically amyopathic dermatomyositis on admission (14,15); they were re-evaluated retrospectively and considered eligible if they also met the recent 239th ENMC classification criteria for DM (16). All patients had imaging-confirmed ILD and a positive anti-MDA5 antibody. The anti-MDA5 antibody was detected by immunoblotting assay (Euroimmun, Germany) and confirmed by ELISA (Supplementary Data S1). ILD course was defined as the time from the first abnormal pulmonary CT revealing ILD changes to admission. Patients with an ILD course > 3 months, with coexisting malignancy (within 3 years) or with pre-existing chronic obstructive pulmonary disease were excluded. The primary outcome was six-month all-cause mortality from the time of admission.
A total of 173 eligible patients were enrolled and divided into two datasets. Patients admitted between April 2014 and December 2018 (n=116) were defined as the derivation dataset, and those admitted between January 2019 and January 2020 (n=57) as the validation dataset (Figure 1).
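The temporal split described above can be sketched as follows. This is only an illustration: the function name and the exact boundary date (1 January 2019, taken as the first day of the validation period) are assumptions, since the text gives the split by month only.

```python
from datetime import date

# Assumed boundary: first admission date that falls in the validation period.
VALIDATION_START = date(2019, 1, 1)


def assign_dataset(admission_date: date) -> str:
    """Return 'derivation' or 'validation' according to the admission date."""
    if admission_date >= VALIDATION_START:
        return "validation"
    return "derivation"
```

Splitting by admission date rather than at random means the validation patients are all admitted later than the derivation patients, which is a stricter check than a random split of the same cohort.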
Clinical data including age, gender, physical findings, respiratory function, treatment history and outcomes were obtained from medical records. The study protocol was approved by the ethics committees of our hospital and the need to obtain informed consent was waived.

HRCT image acquisition and visual scoring
Patients underwent non-contrast pulmonary HRCT within days of admission (median, 2 days; range, 1-6 days), using multidetector CT scanners (United Imaging, Shanghai, China; Siemens Healthineers, Forchheim, Germany). CT slice thickness was 1.0-1.5 mm at 10 mm intervals through the whole lungs.
All CT images were reviewed by two observers (YZ, with 10 years of experience, and CZ, with 5 years of experience in chest HRCT evaluation) who were blinded to patient outcome. Inter-observer variability was evaluated by the intraclass correlation coefficient (ICC). Final results were reached by consensus between the two observers.
For the previously reported IPF-based visual scoring method ('IPF score'), HRCT findings were evaluated for GGO, consolidation, TBE and honeycombing as defined by the Fleischner Society, and were graded on a scale of 1-6 based on the classification system (8). The overall 'IPF score' was calculated by summing the average score of six zones (upper, middle, and lower on both sides) as described, and was used as a comparator in the following analysis.
AI algorithm-based CT quantitative analysis
The Digital Imaging and Communications in Medicine files of the CT images were processed with a software package named "CT Pneumonia Analysis" (syngo.via Frontier 1.0, Siemens Healthineers, Forchheim, Germany). The algorithm had first been trained on a large cohort of patients with various diseases, then fine-tuned on a cohort with abnormal patterns including GGO, consolidation, effusions, and masses, to improve the robustness of lung segmentation over the involved areas. Based on 3D segmentations of lesions, lungs, and lobes, the AI algorithm automatically detected and quantified abnormal tomographic patterns commonly present in pneumonia, such as GGO and consolidation, both globally and lobe-wise.
The percentage of total opacity (total lesions) and the percentage of consolidation (with a cutoff of CT value ≥ -200 Hounsfield units) were calculated directly for the whole lung. The percentage of GGO was then obtained by subtracting consolidation from total lesions.
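The subtraction step can be written out as a small sketch. The function name and the input check are illustrative assumptions, not part of the software package; the example values are those reported in the Figure 3 caption.

```python
def ggo_percentage(total_opacity_pct: float, consolidation_pct: float) -> float:
    """GGO percentage = total opacity percentage - consolidation percentage."""
    # Consolidation is a subset of total opacity, so it cannot exceed it.
    if not 0 <= consolidation_pct <= total_opacity_pct <= 100:
        raise ValueError("expected 0 <= consolidation <= total opacity <= 100")
    return total_opacity_pct - consolidation_pct


# Values from the representative case in Figure 3: 35.3% total, 20.8% consolidation.
print(round(ggo_percentage(35.3, 20.8), 1))  # 14.5
```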

Statistical Analysis
Clinical data were described and compared between the derivation and validation datasets by univariable analysis. The Mann-Whitney U test, Chi-square test and Fisher's exact test were conducted, as appropriate. Clinical features with >5% missing data were excluded from analysis.
Among the three visual scoring components, i.e., GGO, consolidation and fibrosis, variables significantly associated with outcome in the univariable analysis were subsequently included in the multivariable Cox proportional hazards model. The derived β regression coefficients were used to construct a linear weighted scoring model, defined as 'MDA5 score'. Likewise, the percentages of GGO and consolidation from the AI algorithm-based quantitative analysis were used to construct another weighted scoring model, defined as 'AI score'.
The optimal cutoff value of each CT score was identified by receiver operating characteristic (ROC) curve analysis. The association between CT score and six-month survival was assessed by Kaplan-Meier survival plot and log-rank test.
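As an illustration of how a cutoff can be read off an ROC analysis, the sketch below computes sensitivity and specificity at a given cutoff and picks the cutoff maximizing Youden's J (sensitivity + specificity - 1). The text does not state which optimality criterion was used, so Youden's J is an assumption here, as are the function names. A patient is counted as test-positive when the score is strictly above the cutoff.

```python
def sensitivity_specificity(scores, died, cutoff):
    """Sensitivity and specificity when 'score > cutoff' predicts death.

    Requires at least one death and one survivor in `died`.
    """
    tp = sum(1 for s, d in zip(scores, died) if d and s > cutoff)
    fn = sum(1 for s, d in zip(scores, died) if d and s <= cutoff)
    tn = sum(1 for s, d in zip(scores, died) if not d and s <= cutoff)
    fp = sum(1 for s, d in zip(scores, died) if not d and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores, died):
    """Observed score value that maximizes sensitivity + specificity (assumed criterion)."""
    return max(
        sorted(set(scores)),
        key=lambda c: sum(sensitivity_specificity(scores, died, c)),
    )
```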
Model discrimination of the 'IPF score', 'MDA5 score' and 'AI score' models was quantified and compared by the Harrell concordance index (C-index) with 95% confidence interval (CI). Decision curve analysis (DCA) was performed to determine and compare the clinical usefulness of each model (17). Significance was defined as p<0.05.
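For readers unfamiliar with the C-index, the following is a minimal pure-Python sketch of Harrell's pairwise definition: among pairs of patients whose order of death can be determined, it is the fraction in which the patient who died earlier had the higher score. It is a simplified illustration (pairs with tied follow-up times are skipped, and no confidence interval is computed), not the statistical software used in the study; all names are hypothetical.

```python
from itertools import combinations


def harrell_c_index(times, events, scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times  : follow-up time per patient
    events : 1 if the patient died at `times`, 0 if censored
    scores : risk score, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter follow-up is censored, so the order is unknown
        comparable += 1
        if scores[first] > scores[second]:
            concordant += 1.0
        elif scores[first] == scores[second]:
            concordant += 0.5  # tied scores count as half concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of patients by risk.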

Results
Baseline clinical features, treatment and outcomes were comparable between the derivation and validation datasets (Supplementary Table S1). In the two datasets, 47 (40.5%) and 21 (36.8%) patients, respectively, died within six months of admission (p=0.764).
'MDA5 score': a novel CT visual semi-quantitative analysis
The pulmonary HRCT findings from visual semi-quantitative analysis, compared between survivors and non-survivors in both datasets, are presented in Table 1. As expected, the ILD patterns were distributed bilaterally. Notably, only the GGO and consolidation patterns were significantly associated with outcome in the univariable analysis, whereas neither fibrosis nor the presence of pneumomediastinum or pneumothorax (PNM) at baseline was. The GGO and consolidation scores were therefore included in the multivariable Cox regression analysis. Both total GGO score (β coefficient=0.13, p<0.001) and total consolidation score (β coefficient=0.22, p<0.001) were significantly associated with all-cause mortality (Table 2). For simplicity, a linear equation combining these prognostic factors weighted by their β coefficients was finally generated: 'MDA5 score' = total GGO score + 2 × total consolidation score.
ROC curve analysis indicated that the optimal cutoff value for 'MDA5 score' was 18, which predicted six-month all-cause mortality in the derivation dataset (sensitivity 70.2%; specificity 82.6%) and the validation dataset (sensitivity 85.7%; specificity 63.9%). The prediction accuracy of 'MDA5 score' measured by AUC was 0.85 (95%CI 0.78-0.91) for the derivation dataset and 0.87 (95%CI 0.78-0.96) for the validation dataset, higher than that of the 'IPF score', which was 0.81 (95%CI 0.73-0.89) for the derivation dataset and 0.79 (95%CI 0.68-0.91) for the validation dataset. Additionally, the Kaplan-Meier survival plots of patients in both datasets showed a significant difference between the high-risk ('MDA5 score'>18) and low-risk ('MDA5 score'≤18) groups (Figure 2). The mortality of high-risk patients was 73.3% in the derivation dataset and 58.1% in the validation dataset, while the mortality of low-risk patients was 19.7% in the derivation dataset and 11.5% in the validation dataset.
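The scoring rule and the risk stratification can be summarized in a short sketch. The weights and the cutoff of 18 come from the text; the function names and the example inputs are hypothetical. The weight of 2 for consolidation appears to be a rounding of the ratio of the two β coefficients (0.22/0.13 ≈ 1.7), which is our reading of "for simplicity" rather than something the text states explicitly.

```python
GGO_WEIGHT = 1
CONSOLIDATION_WEIGHT = 2
HIGH_RISK_CUTOFF = 18  # 'MDA5 score' > 18 defines the high-risk group


def mda5_score(total_ggo_score, total_consolidation_score):
    """'MDA5 score' = total GGO score + 2 x total consolidation score."""
    return (GGO_WEIGHT * total_ggo_score
            + CONSOLIDATION_WEIGHT * total_consolidation_score)


def risk_group(score):
    """Return 'high' for scores above the cutoff, otherwise 'low'."""
    return "high" if score > HIGH_RISK_CUTOFF else "low"


# Hypothetical patient: total GGO score 10, total consolidation score 5.
example = mda5_score(10, 5)
print(example, risk_group(example))  # 20 high
```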
'AI score': an AI algorithm-based quantitative analysis
Because the baseline fibrosis component added nothing to outcome prediction, it was plausible to apply a pneumonia-trained AI algorithm to the quantitative CT analysis of our anti-MDA5 positive DM-ILD patients (Figure 3A). Percentage of consolidation was the only significant predictor of overall survival in the final multivariable Cox model (p<0.001) (Table 2). Thus, the percentage of consolidation was defined as the 'AI score'. Interestingly, the radar charts in Figure 3B showed that the GGO and consolidation patterns were symmetrically distributed, with an evident 'gravity gradient' towards the lower areas of the lungs, especially for consolidation.
Comparisons of clinical performance between 'IPF score', 'MDA5 score' and 'AI score' models
The inter-observer consistency of the two visual scoring models was assessed, with an ICC of 0.69 (95% CI 0.57-0.78) for 'IPF score' and an ICC of 0.93 (95% CI 0.89-0.96) for 'MDA5 score'. 'MDA5 score' therefore attained better inter-observer reproducibility. The comparisons of model discrimination between 'IPF score', 'MDA5 score' and 'AI score' are shown in Table 3. Notably, 'MDA5 score' had the best performance, with C-index values of 0.80 (95%CI 0.75-0.86) in the derivation dataset and 0.84 (95%CI 0.71-0.97) in the validation dataset. 'AI score' yielded a C-index of 0.78 (95%CI 0.72-0.84) for the derivation dataset and 0.77 (95%CI 0.64-0.90) for the validation dataset. Finally, the DCA further demonstrated that 'MDA5 score' had a higher overall net benefit than the other two models in terms of clinical applicability (Figure 4).

Discussion
As a highly progressive disease, anti-MDA5 positive DM-ILD remains a major challenge despite recent treatment advances (4,18). Several prognostic indicators of the disease have been reported, involving respiratory physiology parameters, laboratory biomarkers, and radiological features (1,6,19). The current study focused on patients' baseline pulmonary HRCT and attempted to quantitatively assess the disease with regard to predicting six-month mortality.
Our study takes a step forward from previous visual scoring methods and extensively evaluates the distribution and extent of the three basic imaging components of anti-MDA5 positive DM-ILD, i.e., GGO, consolidation and fibrosis. In line with prior reports, our data confirmed that the presence of fibrosis or TBE in the context of GGO or consolidation has no predictive value for prognosis in anti-MDA5 positive DM-ILD (5,10). The probable explanation is that fibrotic features are less common in this rapidly progressive disease and, if present, are likely to appear at a later stage rather than at baseline. The same apparently holds true for the presence of PNM, which is a known severity indicator rather than a baseline predictor (20).
The combination of the extent of GGO and consolidation was found to have the best yield in terms of outcome prediction, with the area of consolidation contributing more than GGO. The image 'snapshot' might reflect the dynamic transformation from GGO to consolidation as the disease progresses, similar to the imaging changes observed in severe COVID-19 patients (11,12,21). Whether the two diseases share an underlying mechanism of acute lung injury is an intriguing question that deserves further investigation. Indeed, the highly activated type I interferon pathway in anti-MDA5 positive DM-ILD has been postulated to reflect a possible virus-triggered response (22,23).
Applying AI algorithm-based quantitative imaging analysis to anti-MDA5 positive DM-ILD is a preliminary yet novel attempt. The algorithm was primarily designed for patients with pneumonia or, more specifically, COVID-19. Of interest, our data suggested that this AI algorithm performs fairly well among anti-MDA5 positive DM-ILD patients. Its performance might be further enhanced if more anti-MDA5 positive DM-ILD imaging data were fed into its machine-learning process.
The major limitation of our study was the single-center design. Although we presented a relatively large cohort for this rare disease and performed internal validation, large-scale multi-center external validation is mandatory before the CT scoring models are used in a clinical setting. Such validation would allow biases arising from different scanner conditions, patient selection and treatment protocols to be taken into account and better controlled and adjusted for. In addition, longitudinal analysis of the changes in ILD patterns over time was not addressed in the current study and deserves further exploration.

Conclusions
We have shown that a GGO and consolidation-weighted CT scoring model, along with an AI algorithm, might serve as prognostic predictors for six-month mortality in anti-MDA5 positive DM-ILD. This might facilitate future clinical trial design and precision management for this challenging disease.

The study was approved by the Ethics Committee of Renji Hospital, Shanghai Jiaotong University School of Medicine. The need to obtain informed consent was waived.

Consent for publication
We obtained consent for publication from all the individuals whose detailed information was included in this manuscript.

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Competing Interests
SY has received speaker's and consulting fees from Abbvie, Jansen Biologics, GlaxoSmithKline, Roche, and Pfizer, and grants from Pfizer. All the other authors have nothing to disclose.

Figure 3
Artificial intelligence algorithm-based CT quantitative analysis. (A) The segmentation results of the lung and its total opacity in representative CT images are shown with green and red borders, respectively. The percentages of total opacity, consolidation, and ground-glass opacity (GGO) of the whole lung were automatically calculated to be 35.3%, 20.8%, and 14.5%, respectively.

Figure 4
Decision curve analysis for 'IPF score', 'MDA5 score' and 'AI score' models. The concept of population net benefit (NB), plotted on the y-axis, is fundamental to decision curves and relates to the classification accuracy of a model. Suppose high risk is defined as risk above some risk threshold R (x-axis); such high-risk patients are recommended an intervention. The NB of using the risk model is calculated from the true-positive rate, the proportion of cases with risk above risk threshold R, and the false-positive rate, the proportion of controls with risk above risk threshold R. The horizontal dotted line at NB = 0 represents the simple policy of recommending the intervention to no patient (treat none); the gray curve depicts the NB of another simple policy, recommending the intervention to everyone regardless of risk (treat all). In our results, the 'MDA5 score' model (red line) had the highest net benefit of the three models across almost the full range of threshold probabilities.
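The net benefit described in the decision curve caption can be illustrated with the commonly used formula NB = TP/n - (FP/n) × R/(1 - R), where TP and FP are the numbers of true and false positives at threshold R and n is the number of patients. The caption does not spell out the formula, so the sketch below is an assumption based on the standard decision curve definition; function names are hypothetical and patients are classed as high risk when their predicted risk is strictly above R.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk exceeds `threshold`."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, outcomes) if r > threshold and y)
    fp = sum(1 for r, y in zip(risks, outcomes) if r > threshold and not y)
    # False positives are penalized by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of the 'treat all' policy at the same threshold."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

The 'treat none' policy has a net benefit of zero at every threshold, which is why it appears as a horizontal line at NB = 0.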
