This retrospective study design was approved by the institutional review board (IRB) of the University of Pennsylvania (#821679). This is a retrospective study in which patient data has been fully anonymized. Adequate precautions have been undertaken to ensure protection of patient privacy and confidentiality. The IRB judged there were no other risks to patients related to the performance of the study. Moreover, all study investigators are fully trained in HIPAA compliance. Consequently, the IRB at the University of Pennsylvania waived the need for informed consent in our study. All methods were carried out in accordance with relevant guidelines and regulations approved by IRB at the University of Pennsylvania. Through radiology and medical record searches, we retrospectively identified 40 patients using the following inclusion criteria: a) thoracic HRCT performed with ≤1.5 mm thickness contiguous axial slices and high spatial resolution algorithms; b) presence of fibrotic interstitial lung disease patterns on HRCT (UIP – usual interstitial pneumonia, NSIP – non-specific interstitial pneumonia or CHP – chronic hypersensitivity pneumonitis) by pathology or by a combination of clinical and radiological findings (Figure 1). HRCT datasets were anonymized and transferred to an imaging-processing computer cluster. Concurrently, PFT and clinical data were obtained from the electronic medical record, anonymized, and associated with specific imaging datasets. The dataset was balanced by design: we randomly selected 20 patients with UIP and randomly selected 20 patients with NSIP or CHP patterns of ILD, for maximizing statistical power, even with a relatively small sample size.
Subjective HRCT image analysis was performed in consensus by two thoracic radiologists (two expert radiologists with 11 and 7 years of subspecialty expertise), to assess the presence, spatial distribution and severity of the following basic imaging patterns: reticulation, traction bronchiectasis, honeycombing, ground-glass opacities, consolidations, emphysema and normal parenchyma. Additional findings such as masses or pleural effusions were noted. Accordingly, each patient was classified in one of two groups: definite or probable UIP pattern, versus most compatible with non-UIP, which comprised chronic hypersensitivity pneumonitis and non-specific interstitial pneumonia patterns of ILD. Moreover, the disease severity was stratified using a semi-quantitative score of mild (less than 25% of parenchyma involved), moderate (≥ 25% but ≤ 50% of parenchyma involved) and severe (≥ 50% of parenchyma involved). The dataset was balanced, with 20 patients in the UIP pattern, and 20 patients in the non-UIP pattern. The demographic and clinical characteristics of these 40 patients are summarized in Table 1.
QCT analysis via histogram-based (HM) method
QCT analysis was initially performed using the IMBIO Lung Texture Analysis (LTA) (CALIPER) software, which is currently investigational in the United States and not FDA approved for clinical utilization. The first step of LTA is the segmentation of the lungs, followed by segmentation of the airway tree and pulmonary vessels. The next step is to apply the CALIPER  algorithm to the lung parenchyma, which uses computer vision-based image analysis of volumetric histogram features, and 3D morphology to classify groups of voxels (each containing 15 1x1x1 mm voxels). The detection and quantification of lung parenchymal findings is based on histogram signature mapping techniques trained through expert radiologist consensus assessment of pathologically confirmed training sets obtained through the Lung Tissue Research Consortium (LTRC). The number of voxels in the lung parenchyma classified as each of the fundamental texture categories are calculated and converted to percentages of the combined left and right lung volume, the individual lungs, and the upper, middle, and lower sextants of each of the lungs. Voxels that are identified as vessels are not included in the calculation of the lung volume. LTA then generates as output a new series of DICOM images with multi-color, semi-transparent overlays to indicate the texture categories; and a PDF report that includes a graphic summary of the quantitative results, indicating percentage of basic parenchymal image patterns (normal, ground-glass, reticulation, honeycombing, and hyperlucent), lung volumes and spatial distribution by lung zone.
QCT Analysis via Texture based (TM) method
Our segmentation method is a 3-dimensional, intensity-based algorithm using K-means clustering to properly determine cluster centers of air / lung tissue versus soft tissue attenuation, the latter which we removed from the segmented volume. CT attenuation of the lung parenchyma (measured by Hounsfield units (HU)) for a normal subject lies between -900 to -700; and interstitial fibrosis can raise the parenchymal attenuation to as high as -100 HU, making it impossible to segment the lungs solely based on intensity thresholding. Our method provides an efficient automated lung segmentation (Figure 2), which was further refined with minor (less than 10% volume changes) manual corrections by an expert radiologist on all cases where segmentation results were suboptimal (approximately 25% of the dataset). 75% of the dataset did not require any manual corrections. All segmentation results were reviewed by an expert cardiothoracic radiologist for quality control, and following minor manual corrections as above, the entire dataset was deemed optimally segmented.
QCT feature extraction was subsequently performed via an in-house lattice-based texture estimation software pipeline (Figure 3). This method, capable of capturing the texture heterogeneity of the ILD pattern , is based on a regular grid virtually overlaid on each CT image. Texture features are computed from the intersection (i.e., lattice) points of the grid lines within the lung, using a local cube (window) centered at each lattice point (Appendix provides mathematical details). Using this novel strategy, a comprehensive set of 26 imaging features from three major statistical groups of features, gray-level histogram, co-occurrence, and run-length, were computed and saved as 3D feature maps. Features were calculated using a range of window size (ROI) traversing the image from 4 mm to 20 mm. Different ROI sizes can help to assess texture information at different spatial scales, from the finest to coarsest textures, which may capture different levels of histologic changes (from fine to coarse fibrosis). These features were averaged across each ROI for representing distinct texture phenotypes.
With the 26 3D feature maps, a K-means clustering approach was applied to each of the 40 patients to group the voxels within the lung that share similar feature patterns (ILD sub-types). The choice of k for the number of clusters is important. We applied K=5 for the study, where 5 tissue types were modeled: normal, ground-glass, reticular, honeycombing, and hyperlucent. The clustering results for two sample window sizes 4 mm and 12 mm are shown in Figure 4. The volume ratio of each cluster (the number of pixels for each tissue cluster divided by total number of pixels) were calculated and then fed into a Support Vector Machine (SVM) model to assess QCT ability to predict UIP vs non-UIP diagnosis. Two set of covariates were used in the model; 1: solely imaging features 2: imaging features combined with relevant clinical measures such as age, gender, and severity of the disease. Then, the TM based classification models were compared to HM method for different window sizes (Figure 5).
Although the sample size is relatively small, imaging features extracted were used to build a deep learning classification model with 5-fold cross validation. A neural network based on PyTorch framework with two hidden layers and 66 nodes was used to build the network (Supplementary materials for details).
The retrospective nature of our dataset allowed over 7 years of follow up, enabling correlation with relevant patient outcomes such as death, development of respiratory failure requiring ICU admission or need for lung transplantation.
Using 26 extracted features for ILD patients, Cox proportional regression hazard model  was performed for time-to-event outcome analysis for survival assessment prediction. The C-statistic was used as a measure of predictive performance of features. Two sets of covariates were considered; 1: UIP and non-UIP pattern labels by radiologist experts; 2: QCT imaging TM features with relevant clinical features such as age, gender, and severity for each patient as additional covariates (Figure 6). To reduce the number of covariates and the potential for overfitting, the C-statistic for each feature was evaluated based on a univariable cox regression model with cross validation.