Severity and Outcome Detection for the Cancer Patients with COVID-19 Using Machine Learning Models

The COVID-19 pandemic challenges healthcare systems to provide enough resources to battle the pandemic without jeopardizing routine treatments. It is therefore important to be able to predict patient outcomes at the time of admission. This study aims to apply different machine learning (ML) models to predict Intensive Care Unit (ICU) admission and mortality of cancer patients infected with COVID-19. The data were collected from a referral cancer center in Iran. The study included all patients with cancer and a confirmed diagnosis of COVID-19. Several ML prediction algorithms were used: Logistic Regression (LR), Naïve Bayes (NB), k-Nearest Neighbours (kNN), Random Forest (RF), and Support Vector Machine (SVM). We also applied the SelectKBest method to find the most important features for predicting ICU admission and mortality.

Several predictive models have revealed COVID-19 severity and mortality determinants in the general population, but the clinical outcomes and interactions of COVID-19 with cancer and systemic anticancer drugs are poorly understood. Besides diagnosis and treatment, predicting prognosis is needed to reduce the pressure on healthcare systems and deliver the best possible care for patients. Several machine learning (ML) models have been created to help clinicians and improve the diagnosis of COVID-19 (4).
ML models can also predict COVID-19 outcomes such as Intensive Care Unit (ICU) admission and mortality (5). To date, various ML algorithms have been developed for the diagnosis of COVID-19 and the prediction of severity and mortality risk, using clinical and laboratory data (6-10). In comparison, few studies have examined ML algorithms for predicting COVID-19 outcomes in cancer patients (11,12).
This study aims to apply different ML models for predicting Intensive Care Unit (ICU) admission and mortality in hospitalized cancer patients infected with COVID-19.

Materials And Methods: Data Collection:
This study's data were collected from Omid hospital, a referral cancer center affiliated with Isfahan University of Medical Sciences and the Isfahan COVID-19 Registry (I-CORE) (13). The study included all patients with active or previous cancer and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection confirmed by RT-PCR from February 2020 to February 2021. Patients with a radiological or clinical diagnosis of COVID-19 without a positive RT-PCR test were not included in this analysis.
Twenty-eight features were collected from each patient, including demographic data such as age and sex; cancer characteristics, including type of cancer and history of recent chemotherapy; and COVID-19 symptoms, including fever, cough, dyspnea, weakness, diarrhea, nausea, and vomiting. Comorbidities such as hypertension and diabetes were also recorded. All the laboratory data at the time of admission, including White Blood
Defining patient outcomes: ICU admission during hospitalization and patient status at the time of discharge (alive/dead) were defined as patients' outcomes. The machine learning models aim to predict these outcomes based on the collected features and to select the most important features that affect the outcomes.
The dataset was recorded carefully; consequently, fewer than 4 percent of all feature values in the whole dataset are missing. The missing values are all from the lab tests. To fill them, we replace every missing value with the mean of the corresponding feature. After that, we randomly partition the dataset into two sets: a training set containing 305 patients and a test set containing 34 patients. Then, as the third step, we apply a feature scaling algorithm called standardization to the training dataset, to ensure that no feature dominates the others. To do this, we used the formula X' = (X - E[X]) / sqrt(Var(X)), in which E[X] is the mean of X and Var(X) stands for the variance of X.
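The three preprocessing steps described above (mean imputation, a random 305/34 split, and standardization) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function names, the use of None to mark a missing value, the random seed, and the synthetic toy data are all assumptions. Applying the training-set mean and standard deviation to the test set is also an assumption; the text only states that scaling is applied to the training dataset.

```python
import random
from statistics import fmean, pvariance

def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = [fmean([r[j] for r in rows if r[j] is not None]) for j in range(n_cols)]
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]

def split(rows, labels, n_test, seed=0):
    """Randomly partition the data into a training set and a test set of n_test rows."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    def pick(ids):
        return [rows[i] for i in ids], [labels[i] for i in ids]

    return pick(train_idx), pick(test_idx)

def fit_standardizer(train_rows):
    """Per-feature mean and standard deviation, estimated on the training set only."""
    cols = list(zip(*train_rows))
    means = [fmean(c) for c in cols]
    # A constant column has zero variance; fall back to 1.0 to avoid dividing by zero.
    stds = [(pvariance(c) ** 0.5) or 1.0 for c in cols]
    return means, stds

def standardize(rows, means, stds):
    """Apply X' = (X - mean) / sqrt(Var(X)) feature by feature."""
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

# Synthetic toy data (hypothetical): 339 rows, 3 features, two missing lab values.
rng = random.Random(1)
rows = [[rng.gauss(50, 10), rng.gauss(5, 2), float(rng.random() < 0.3)] for _ in range(339)]
labels = [int(rng.random() < 0.3) for _ in range(339)]
rows[0][1] = None
rows[10][0] = None

filled = impute_mean(rows)
(train_X, train_y), (test_X, test_y) = split(filled, labels, n_test=34)
means, stds = fit_standardizer(train_X)
train_Z = standardize(train_X, means, stds)
test_Z = standardize(test_X, means, stds)  # assumption: reuse training statistics
```

After this step every training feature has mean 0 and variance 1, so features measured on large scales do not dominate distance-based or margin-based models such as kNN and SVM.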
Prediction Algorithms: We apply several machine learning prediction algorithms, namely Logistic Regression (LR), Naïve Bayes (NB), k-Nearest Neighbours (kNN), Random Forest (RF), and Support Vector Machine (SVM). In the following, we describe the experimental settings of the reported prognostic models.
The first method is binary logistic regression, which is applied to datasets with "0" and "1" class labels.
The second technique is naïve Bayes, which is a conditional probability model. Assume that x is the feature vector corresponding to an input, and C_1 and C_2 are the two possible label classes. Using Bayes' theorem, the model computes the conditional probabilities Pr(C_1 | x) and Pr(C_2 | x) and then, by comparing these two values, decides the label of x. Since there are binary-valued variables in our dataset, we use the Bernoulli version of naïve Bayes.
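The decision rule described above can be sketched for the Bernoulli case as follows. This is a minimal illustration rather than the implementation used in the study: it assumes every feature is already binary (0/1), encodes the two classes as 0 and 1 instead of C_1 and C_2, and adds Laplace smoothing (the alpha parameter), which the text does not mention. The toy data is made up.

```python
import math

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate class priors and per-feature Bernoulli parameters.

    X: list of 0/1 feature vectors; y: list of 0/1 labels.
    alpha is a Laplace smoothing constant (an assumption, not from the paper).
    """
    n_features = len(X[0])
    model = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        # Smoothed estimate of Pr(feature j = 1 | class c).
        theta = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(n_features)]
        model[c] = (prior, theta)
    return model

def log_joint(model, x, c):
    """log Pr(C_c) + sum_j log Pr(x_j | C_c): the log of the Bayes numerator."""
    prior, theta = model[c]
    total = math.log(prior)
    for xj, t in zip(x, theta):
        total += math.log(t) if xj == 1 else math.log(1.0 - t)
    return total

def predict(model, x):
    """Pick the class with the larger posterior; the shared denominator Pr(x) cancels."""
    return max((0, 1), key=lambda c: log_joint(model, x, c))

# Hypothetical toy data: three binary features, six samples.
X = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 1], [0, 1, 1], [0, 0, 0]]
y = [1, 1, 1, 0, 0, 0]
model = fit_bernoulli_nb(X, y)
label_a = predict(model, [1, 1, 0])
label_b = predict(model, [0, 0, 1])
```

Comparing log-probabilities instead of raw products gives the same decision as comparing Pr(C_1 | x) with Pr(C_2 | x), while avoiding numerical underflow when there are many features.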
The k-Nearest Neighbors (kNN) classifier is another algorithm that is applied to this dataset. In this algorithm, the label of a new input is determined by the labels of the k nearest samples in the dataset. The user specifies the input parameter k, which is typically small. We set k to seven after trying several different values.
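The voting rule with k = 7 can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the text does not name the distance metric, so Euclidean distance is assumed here, and the function name and toy points are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=7):
    """Label x by majority vote among its k nearest training samples.

    Euclidean distance is an assumption; the paper does not state the metric.
    """
    by_distance = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters of five points each.
train_X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
           [5, 5], [5, 6], [6, 5], [6, 6], [5.5, 5.5]]
train_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

near_first = knn_predict(train_X, train_y, [0.2, 0.3])
near_second = knn_predict(train_X, train_y, [5.8, 5.9])
```

With two classes, an odd k such as seven rules out tied votes. Because the vote depends on distances, kNN is one of the models that benefits most from standardized features.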
The Random Forest (RF) method is based on the bootstrap aggregating technique: the prediction of a single tree is very sensitive to noise in its training set, whereas the average of many trees is not, provided the trees are not correlated. Consequently, this algorithm reduces the variance of the model. To run the algorithm, one can set the maximum depth of each tree. We found the best value for this parameter based on our training dataset and set this tuning parameter to 13; if it is not set, nodes are expanded until all leaves are pure or until all leaves contain fewer than the minimum number of samples required to split an internal node.
As another method, we use the Support Vector Machine (SVM). Suppose each input is an N-dimensional feature vector. The SVM approach finds the hyperplane in the N-dimensional space that separates the data points of the two classes with the maximum margin.
Prediction Performance Evaluation Metrics: To explain these measures, we first need some preliminary definitions. As we have a binary classification problem, each label can be seen as positive or negative. By a true positive (TP), we mean a correctly predicted positive label. Similarly, a true negative (TN) is a correctly predicted negative label. A false positive (FP) is a data point that is predicted as positive while its actual label is negative. Finally, a false negative (FN) is a data point whose real class is positive but whose label the algorithm predicts as negative.
One of the most common evaluation measures is accuracy, which is defined as the ratio of correctly predicted items to the total number of items. If the label classes are imbalanced, accuracy is not a good evaluation criterion. For example, if 98 percent of the labels in a dataset are positive, then the trivial algorithm that assigns positive to all inputs reaches an accuracy of 98 percent. In this case, to better understand the algorithm's performance, we should examine how much of each label class is predicted correctly.
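The 98 percent example can be checked directly. The sketch below builds a hypothetical dataset of 100 labels (98 positive, 2 negative) and scores the trivial classifier that always predicts positive; the variable names and the dataset size are illustrative assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of items whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical dataset: 98 positive labels and 2 negative labels.
y_true = [1] * 98 + [0] * 2
# Trivial classifier that assigns positive to every input.
y_trivial = [1] * len(y_true)

overall = accuracy(y_true, y_trivial)

# Per-class view: how much of each label class is predicted correctly.
pos_correct = sum(t == 1 and p == 1 for t, p in zip(y_true, y_trivial)) / y_true.count(1)
neg_correct = sum(t == 0 and p == 0 for t, p in zip(y_true, y_trivial)) / y_true.count(0)
```

Here overall is 0.98 while neg_correct is 0.0: the classifier never identifies a negative case, which the single accuracy figure hides.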
The second measure, precision, is the ratio of correctly predicted positive labels to the total number of predicted positive labels in the test set. In other words, Precision = TP/(TP+FP).
Also, recall is defined as the number of correctly predicted positive labels divided by the total number of positive labels. Therefore, Recall = TP/(TP+FN). In binary classification, it is worth noting that the recall of the positive class and the recall of the negative class are called sensitivity and specificity, respectively. Finally, the F1-score, a combined measure, is the harmonic mean of precision and recall, i.e., F1-score = 2(Precision × Recall)/(Precision + Recall) = TP/(TP + (FP+FN)/2). It is worth noting that this score is much better than
Results: Three hundred thirty-nine patients with a positive PCR test were enrolled in the study from 26 February 2020 to 15 February 2021. One hundred fifteen patients from this cohort were admitted to the Intensive Care Unit (ICU), and 118 patients died during the hospital admission. Table 1 shows the demographics and characteristics of the patients. The performance of the prediction models is reported in Tables 2 and 3. Moreover, a list of the five most important features is obtained for each outcome prediction. Table 4 shows the results of feature selection. Our feature selection approach shows that the two outcomes have three features in common: C-Reactive Protein (CRP), Neutrophil to Lymphocyte Ratio (NLR), and Aspartate Aminotransferase (AST).
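The measures defined in the Prediction Performance Evaluation Metrics paragraph can all be computed from the four confusion counts. The sketch below is illustrative only: the labels are made up for a hypothetical 34-item test set and are not results from this study, and the function names are assumptions. It also checks numerically that the two forms of the F1-score given in the text agree.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn

def metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall (sensitivity), specificity and F1-score."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # sensitivity: recall of the positive class
    specificity = tn / (tn + fp)   # recall of the negative class
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": 2 * precision * recall / (precision + recall),
    }

# Hypothetical labels for a 34-item test set (made up, not from the study).
y_true = [1] * 10 + [0] * 24
y_pred = [1] * 6 + [0] * 4 + [1] * 2 + [0] * 22

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
m = metrics(tp, tn, fp, fn)

# Closed form from the text: F1 = TP / (TP + (FP + FN) / 2).
f1_closed_form = tp / (tp + 0.5 * (fp + fn))
```

In this made-up example the specificity (22/24) is much higher than the sensitivity (6/10), the same kind of gap between the two measures that the Discussion attributes to the small number of positive-class samples.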

Discussion:
We used machine learning algorithms to identify clinical variables predictive of severe COVID-19 illness in cancer patients at time zero. We compared different ML models; NB and RF performed well, with AUCs of 0.74 and 0.79 in predicting ICU admission and death, respectively.
Cancer patients are more susceptible to COVID-19 infection and COVID-19-related complications regardless of whether the cancer is active (14). Several studies have shown that cancer patients infected with COVID-19 have higher mortality risks. For instance, a study in the United Kingdom (UK) showed that 28 percent of cancer patients admitted to hospital died because of COVID-19, a higher rate than in the general population (15). The results of another study, by the US COVID-19 and Cancer Consortium (CCC19), showed that 13% of cancer patients died and 26% became severely ill (14,16).
Predicting which patients are at higher risk of progression and poor outcomes can help clinicians make better decisions during the critical period of the disease course. Many ML prediction models that are useful in the clinical setting have been developed. Among them, we focused on five methods: LR, NB, RF, kNN, and SVM. Since we have imbalanced data and a small sample size for the positive label, the methods perform worse on the positive class than on the negative class. If high performance on the positive class, i.e., a high sensitivity value, is important, it is recommended to use probabilistic algorithms such as NB and LR. Otherwise, according to the ROC curve results, we can apply the dominant methods, i.e., RF and NB.
Around 30 percent of the data have a positive class label, and the remainder have a negative one. Since the sample of data points with the positive class label is small, we cannot capture their structure properly; thus, most of the methods are expected to perform weakly in terms of sensitivity. The results in Tables 2 and 3 also show these meaningful differences between the sensitivity and specificity measures. However, logistic regression and naïve Bayes are probabilistic methods, so they can account for the uncertainty in the data points. For this reason, these two methods perform better than the others in terms of sensitivity.
The most frequently reported predictors for the prognosis of COVID-19 cases were age, CRP, lymphocyte, and LDH in the general population (17).
Our results showed that, among the routine laboratory tests, increased CRP, NLR, and AST are the best predictors of severity and mortality in cancer patients infected with COVID-19.
Several studies have shown that the increased level of CRP can predict disease severity and outcome in patients with COVID-19 (10,18,19). It has been suggested that an excessive immune response or 'cytokine storm' plays a critical role in COVID-19 severity and outcome.
A high AST value may suggest that SARS-CoV-2 can cause damage to multiple organ systems, including the liver, when emerging as a severe inflammatory disease. NLR plays a key role in maintaining immune homeostasis and the inflammatory response in the body. Several previous reports have shown the prognostic value of NLR in COVID-19 patients (20,21).
Our study has some limitations:
1. We analyzed a medium-sized dataset of 339 cancer patients; larger, more comprehensive datasets of cancer patients are needed to test how well our approach generalizes.
2. Other algorithms and features are needed to capture how changes in clinical variables over time may affect future outcomes.
3. We did not report how many days in advance our model can produce predictions.
Availability of data and materials: The datasets of the study are not publicly available due to patient confidentiality, but a de-identified version will be made available from the corresponding author on reasonable request.