Classifying subtypes and predicting survival of renal cell carcinoma using histopathology image-based deep learning

Classifying histopathological subtypes and predicting survival of renal cell carcinoma (RCC) patients are critical steps towards treatment. In this work, we first proposed a deep learning method involving patch-based segmentation, intelligent feature extraction and heatmap visualization for classifying RCC into clear cell RCC, papillary RCC, chromophobe RCC, and adjacent benign tissue. This algorithm was trained and validated using 2,374,446 patches, 6,340 whole-slide images, 2,399 patients from The Cancer Genome Atlas and 6 independent centers. The classifiers provided areas under the curves of 0.979 to 0.996 in the internal phase, and 0.914 to 0.995 in the 6-center external phase. Furthermore, a modified deep learning approach comprising automated detection of regions of interest, patch-level learning, and morphological features-based risk scoring was developed for predicting survival of clear cell RCC patients. The prognostication model provided a hazard ratio for poor versus good prognosis of 2.63 [95% confidence interval (CI) 1.53–4.50, P = 4.35e-4] in the testing set, and 2.57 [95% CI 1.43–4.64, P = 1.68e-3] in an independent validation set using multivariable analyses. In conclusion, the developed histopathology image-based deep learning frameworks have the clinical potential to assist pathologists in systematically evaluating histological information of RCC patients.


Introduction
Renal cell carcinoma (RCC) accounts for >90% of all kidney cancers. Epidemiological studies show that RCC represents approximately 2.2% of all cancers, with ~400,000 new cases and ~175,000 deaths yearly [1]. It can be classified into three major subtypes: clear cell RCC (ccRCC), the most common type, accounting for 70% of all cases; papillary RCC (pRCC), which represents 15% to 20% of all cases; and chromophobe RCC (chRCC), which accounts for 5% of reported cases [2]. The remaining subtypes are very rare, each accounting for ≤1% of total incidence. Each subtype of RCC has its specific histopathology, genetic characteristics, clinical course, and response to therapy [3].

Subtype classification and outcome prediction of RCC patients are critical steps towards precise treatment. Histopathological slide examination is the gold standard for RCC subtyping and staging [2,4].

Classification and prognostication based on human assessment remain time-consuming and relatively subjective. In some cases, the distinction among RCC types is not clear as they may share non-specific morphological patterns [5]. In addition, RCC is an extremely heterogeneous disease, making prediction of prognosis a great challenge [6].

Patient information.
The samples for classification covered 6,340 whole-slide images from 2,399 RCC patients, of which 3,260 slides of 941 patients were from TCGA and 3,080 slides of 1,458 patients were from 6 independent cohorts (Supplementary Table 1 [...]

The construction of classifier models involved steps of patch-based segmentation, intelligent feature extraction, and heatmap visualization (Fig. 1 [...] Table 14).
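The patch-based segmentation step can be illustrated with a minimal sketch of tiling a whole-slide image into a regular grid of patches. The patch size (512 pixels), the non-overlapping stride, and the function name `tile_coordinates` are assumptions made for illustration only; the text above does not state the tiling parameters that were actually used.

```python
from typing import Iterator, Tuple


def tile_coordinates(
    width: int, height: int, patch: int = 512, stride: int = 512
) -> Iterator[Tuple[int, int]]:
    """Yield top-left (x, y) corners of patches lying fully inside the slide.

    `patch` and `stride` are hypothetical values chosen for this sketch.
    """
    if patch <= 0 or stride <= 0:
        raise ValueError("patch and stride must be positive")
    # Walk the slide row by row; partial patches at the borders are dropped.
    for y in range(0, height - patch + 1, stride):
        for x in range(0, width - patch + 1, stride):
            yield x, y


# Example: a 2048 x 1024 pixel region gives 4 columns x 2 rows of patches.
coords = list(tile_coordinates(2048, 1024))
print(len(coords))  # 8
```

In practice each coordinate would be used to read one patch from the slide and pass it to the classifier; background filtering is omitted here.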

Validation of the classifier models in 6 independent cohorts.
The classifier models were validated with 3,080 slides from 1,458 patients in the 6-center external phase. In identifying malignancy from adjacent benign tissues, the binary classifier [...] Table 14).
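Classifier performance in the external phase is summarized in the abstract as areas under the curve (AUC). As a minimal sketch of what that metric measures for a binary classifier such as malignant versus benign, the AUC can be computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from this study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))  # 0.75
```

The pairwise loop is quadratic and is meant for clarity; a rank-based implementation would be preferable for large cohorts.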

Heatmaps of the prediction probability over slides were generated to discern the tumor regions associated with histological patterns. Representative images of the four types are shown in Fig. 3a. Heatmap visualizations were produced for which color is proportional to [...] Table 16).

We compared the performances of 10, 20 and 30 representative image patches with the largest average nuclei size for each slide. We observed that the c-index value did not increase with more patches but slightly decreased using 20 and 30 patches compared to 10 patches (Supplementary Table 17). As a result, we used 10 image patches of each slide as inputs to VGG19 for subsequent prognostication.
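The selection of representative patches described above can be sketched as a simple ranking by average nuclei size. The sketch assumes that a mean nuclei size has already been measured for every patch of a slide; how that measurement is obtained is not shown here, and the function name `select_top_patches` and the patch identifiers are hypothetical.

```python
from typing import Dict, List


def select_top_patches(mean_nuclei_size: Dict[str, float], k: int = 10) -> List[str]:
    """Return the ids of the k patches with the largest average nuclei size.

    `mean_nuclei_size` maps a patch id to its precomputed average nuclei size
    (an assumed input for this sketch).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    ranked = sorted(mean_nuclei_size, key=mean_nuclei_size.get, reverse=True)
    return ranked[:k]


# Toy example with three patches from one slide, keeping the top two.
sizes = {"patch_0": 10.0, "patch_1": 30.0, "patch_2": 20.0}
print(select_top_patches(sizes, k=2))  # ['patch_1', 'patch_2']
```

With `k=10` this mirrors the setting retained for prognostication; slides with fewer than `k` patches simply return all of their patches.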

Our survival deep-learning model showed strong prognostic power with a c-index of 0.779, outperforming the manual histologic grading system (c-index 0.748). Kaplan-Meier analysis showed that the prognostic model had better prediction capability than histologic grade (log rank P = 3.49e-5 versus P = 2.22e-3) (Fig. 5a,b). The risk index (low risk, high risk) was a prognostic factor for overall survival in a univariate Cox analysis (HR 2.93 [95% CI 1.72–5.00], P = 7.84e-5) (Supplementary Table 18). In multivariate analysis, the risk index was independently prognostic [...] Table 19). The predicted risk scores were significantly associated with tumor [...] Table 20). The established scoring algorithm allowed patches to be scored for each patient.
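The c-index used above to compare the model with histologic grade can be illustrated with a minimal sketch of Harrell's concordance index. A pair of patients is comparable when the one with the shorter follow-up time experienced the event; the pair is concordant when that patient also has the higher predicted risk. The function name `concordance_index`, the simplified handling of tied times, and the toy data are assumptions for illustration and do not reproduce the analysis pipeline of this study.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float], events: Sequence[int], risks: Sequence[float]
) -> float:
    """Harrell's c-index for right-censored survival data.

    events[i] is 1 if the event (death) was observed and 0 if censored.
    Pairs with identical times are skipped in this simplified sketch;
    tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: 5 comparable pairs, 4 of them ranked correctly.
print(concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.3, 0.5, 0.1]))  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported values of 0.779 and 0.748.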

Comparison between high-risk patches and low-risk patches was beneficial in obtaining morphological characteristics associated with prognosis (Supplementary Fig. 7).

Validation of the prognostic model in an independent cohort.
We validated the model in an independent cohort of 316 RCC patients from TADTH. The model showed a c-index of 0.769, better than the manual histologic grade (0.753). This prognostic model provided significant survival differences between high- and low-risk groups (log rank P = 5.88e-4), better than the histologic grade (log rank P = 1.02e-3) (Fig. 5e,f) [...]

Establishment of a prognostic model.
ccRCC patients from TCGA were randomly divided into a 55% training set, a 15% tuning set and a 30% internal testing set. An independent cohort from TADTH was used for external validation. [...]

In b-d, the predicted heatmaps with probabilities are assigned to each patch, where grey is for patches classified as normal, orange for tumor, blue for ccRCC, green for pRCC, and red for chRCC.
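The random division of TCGA ccRCC patients into 55% training, 15% tuning and 30% internal testing sets, described under "Establishment of a prognostic model", can be sketched as a patient-level shuffle followed by slicing. The function name `split_patients`, the fixed seed, and the rounding rule are assumptions for illustration; the text does not specify how the randomization was implemented.

```python
import random
from typing import List, Sequence, Tuple


def split_patients(
    patient_ids: Sequence[str],
    fractions: Tuple[float, float, float] = (0.55, 0.15, 0.30),
    seed: int = 0,
) -> Tuple[List[str], List[str], List[str]]:
    """Randomly split patients into training, tuning and testing sets.

    Splitting is done on patient ids so that all slides of one patient
    end up in the same set. The seed is a hypothetical choice.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(fractions[0] * len(ids))
    n_tune = round(fractions[1] * len(ids))
    train = ids[:n_train]
    tune = ids[n_train:n_train + n_tune]
    test = ids[n_train + n_tune:]  # remainder goes to the testing set
    return train, tune, test


# Toy example with 100 hypothetical patient ids.
train, tune, test = split_patients([f"patient_{i}" for i in range(100)])
print(len(train), len(tune), len(test))  # 55 15 30
```

Assigning the remainder to the testing set guarantees that every patient lands in exactly one set even when the cohort size is not a multiple of 20.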