A computer-aided diagnosis system based on computed tomography images of pancreatic head adenocarcinoma applied for identifying pancreatoduodenectomy resection

Background: In the pathological examination of pancreaticoduodenectomy specimens for pancreatic head adenocarcinoma, a resection is considered R0 if no cancer cells are found within 1 mm of the resection margins, and R1 if cancer cells are found within 1 mm of the margins. Pathological examination of the resection margins is complicated and depends to some extent on the subjective experience of physicians. This study aims to design a computer-aided diagnosis (CAD) system based on texture features of preoperative computed tomography (CT) images to evaluate whether a resection margin is R0 or R1. Methods: This study retrospectively analyzed 86 patients who were diagnosed with pancreatic head adenocarcinoma by preoperative abdominal CT examination. These patients underwent pancreaticoduodenectomy, and their resection margins were then pathologically diagnosed as R0 or R1. The CAD system consists of five stages: (i) delineate and segment regions of interest (ROIs); (ii) fit the ROIs to rectangular regions by solving discrete Laplacian equations with Dirichlet boundary conditions; (iii) enhance the textures of the ROIs by combining wavelet transform and fractional differential; (iv) extract texture features by combining wavelet transform and statistical analysis methods; (v) reduce features using principal component analysis (PCA) and perform classification using a support vector machine (SVM) with a linear kernel function, with leave-one-out cross-training and testing to reduce overfitting. The Mann-Whitney U-test was used to explore associations between texture features and histopathological characteristics. Results: The developed CAD system achieved an AUC (area under the receiver operating characteristic curve) of 0.8614 and an accuracy of 84.88%. With the p-value threshold set to ≤ 0.01 in the Mann-Whitney U-test, two run-length matrix features, derived from the diagonal subbands of the wavelet decomposition, showed statistically significant differences between R0 and R1.
Conclusions: The results indicate that the developed CAD system is useful for discriminating R0 from R1. Texture features can potentially enhance physicians' diagnostic ability.


Background
Pancreatoduodenectomy is the main treatment for pancreatic head adenocarcinoma. Preoperative assessment of cancer resection and of the extent of excision helps in choosing optimal therapies for patients. Thus, it is very important to evaluate the resection margin of pancreaticoduodenectomy. In the pathological examination of pancreaticoduodenectomy specimens for pancreatic head adenocarcinoma, a resection is considered R0 if no cancer cells are found within 1 mm of the resection margins, and R1 if cancer cells are found within 1 mm of the margins. Pathological examination of the resection margins is complicated and depends to some extent on the subjective experience of physicians. This study aims to design a computer-aided diagnosis (CAD) system based on texture features of preoperative computed tomography (CT) images to evaluate whether a resection margin is R0 or R1.
Intratumoral heterogeneity is generally considered a typical finding of malignancy. It reflects variations in tumor-cell differentiation, extracellular matrix, cellularity, and angiogenesis [1]. In recent years, non-invasive techniques based on texture analysis have been widely used to quantify tumor heterogeneity by evaluating spatial variations of gray-level in images, and have thus been applied to lesion-related aided diagnosis, efficacy evaluation, and prognosis [2]. This approach is termed radiomics [3][4]. CT is a commonly used examination for the diagnosis of pancreatic head adenocarcinoma. To the best of our knowledge, no radiomic study has yet evaluated the differential diagnosis of R0 and R1 using texture features. However, there have been similar texture feature-based radiomic studies of pancreatic cancer on portal-venous phase CT images. In 2017, Cassinotto C et al. [5] used a Laplacian of Gaussian (LoG) filter and histogram to extract texture features to evaluate pathologic tumor aggressiveness and predict disease-free survival in patients with resectable pancreatic adenocarcinoma; Eilaghi A et al. [6] used the gray-level co-occurrence matrix (GLCM) to extract texture features to assess whether CT-derived texture features predict survival in patients undergoing resection for pancreatic ductal adenocarcinoma; Chakraborty J et al. [7] used histogram, GLCM, gray-level run-length matrix (GLRLM), angle co-occurrence matrix (ACM), and other methods to extract texture features to predict 2-year survival of pancreatic ductal adenocarcinoma (PDAC). In 2018, Canellas R et al. [8] used a LoG filter and histogram to extract texture features to assess whether CT texture analysis and CT features are predictive of pancreatic neuroendocrine tumor grade based on the World Health Organization classification and to identify features related to disease progression after surgery; Qiu JJ et al. [9] used histogram, GLCM, wavelet transform, and combinations of these methods to extract texture features of non-enhanced CT images to explore the feasibility of discriminating pancreatic cancer from normal pancreas. In 2019, Cheng SH et al. [10] used a LoG filter and histogram to extract texture features to determine whether CT texture analysis measurements of the tumor are independently associated with progression-free survival and overall survival in patients with unresectable PDAC. These studies used texture features to establish regression, neural network, support vector machine, Bayesian, and other models for classification and prediction.
We evaluated whether an operation achieved an R0 or an R1 resection based on its surgical margins on portal-venous CT images, and investigated differences in histopathological characteristics between R0 and R1 by using statistical significance tests of texture features. This study has been approved by the Ethics Committee of West China Hospital of Sichuan University (Trial registration: NCT02928081).

Methods
In an R0 or R1 resection margin, the region of interest (ROI) is an irregular strip-shaped area whose structure contains complex internal details such as capillary distribution, cancer cell tissue, and pancreatic cell tissue. Statistical texture analysis methods are appropriate for such structures, and multi-resolution texture analysis methods perform well in extracting detail features. However, both types of methods are limited when applied to irregular strip-shaped and small ROIs. We developed a CAD system that mitigates these limitations and classifies resection margins as R0 or R1. Figure 1 illustrates the framework of the CAD system. It consists of five stages.
Abdominal plain and enhanced scans were performed using a 64-slice spiral CT scanner (GE, USA). The collimation was set to 0.625 mm, the FOV to 350 mm × 350 mm, the tube voltage to 120 kV, the tube current to 160 mAs, and the slice thickness to 1.250 mm. In enhanced scanning, iopamide was injected via the cubital veins at a flow rate of 3 ml/s and a dose of 90 to 100 ml, and the delay time for portal-venous phase scanning was 25 to 30 s.
We used portal-venous phase CT images as the objects of radiomic analysis. A CT image was exported as an 8-bit grayscale image (the range of gray-level was [0, 255]).

Delineation and Segmentation
The steps of delineation and segmentation are as follows: 1) choose three portal-venous phase CT images from each case, located at the top, middle, and bottom of the tumor (Figure 2 [11] illustrates the locations); 2) delineate the resection margins around the portal veins on the chosen images, as shown in Figure 3; to ensure the authenticity of the signals, the delineated resection margins exclude the edges of stents and metal artifacts; 3) segment the delineated areas to form ROIs using region growing.
Two physicians with 10 years of experience in abdominal CT diagnosis delineated all resection margins. The first physician delineated the resection margins and repeated the delineations after 2 weeks to guard against measurement deviations. The other physician delineated the resection margins only once, to assess whether his delineations were consistent with those of the first physician.

Fitting ROIs
As can be seen from Figure 3, ROIs are irregular strip-shaped regions. An image is a two-dimensional signal based on rows and columns. We fitted the strip-shaped ROIs to rectangular ROIs by solving discrete Laplacian equations with Dirichlet boundary conditions. The fitting method is abbreviated as LD. LD has been applied successfully in signal fitting [12][13][14], and it can fit the missing information of an original image very well. The discrete Laplace equation is defined in Eq. (1).
[Due to technical limitations, this equation is only available as a download in the supplemental files section.] (1)
Eq. (1) shows that a linear equation can be established based on the 4-neighborhood of a point to be fitted. A region to be fitted is called a mask. If the current pixel is on an edge of the mask, then at least one of its neighbors (on the Dirichlet boundary) is known. A set of linear equations can therefore be established along the Dirichlet boundary (along the edges of the mask), and the values of the pixels to be fitted can be obtained by solving that set of linear equations. This solving procedure is then extended into the interior of the mask. Figure 4 shows a mask to be fitted and its boundaries. Figure 5 illustrates two fitting examples.
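As an illustration of the LD fitting idea, the following Python sketch fills masked pixels by assembling and solving the set of linear equations described above. It assumes the standard 5-point form of the discrete Laplace equation (each unknown pixel equals the mean of its 4-neighbors), since Eq. (1) is not reproduced here. The function name `laplace_fill`, the dense solver, and the treatment of neighbors that fall outside the image are illustrative assumptions, not the implementation used in this study.

```python
import numpy as np

def laplace_fill(image, mask):
    """Fill pixels where mask is True by solving the 5-point discrete
    Laplace equation. Pixels where mask is False are treated as known
    Dirichlet data. Assumes every connected part of the mask touches
    at least one known pixel, otherwise the system is singular."""
    img = image.astype(float)
    rows, cols = img.shape
    unknown = [(int(r), int(c)) for r, c in np.argwhere(mask)]
    index = {rc: k for k, rc in enumerate(unknown)}
    n = len(unknown)
    A = np.zeros((n, n))
    b = np.zeros(n)
    for k, (r, c) in enumerate(unknown):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if not (0 <= rr < rows and 0 <= cc < cols):
                continue  # neighbor outside the image is skipped (assumption)
            A[k, k] += 1.0
            if mask[rr, cc]:
                A[k, index[(rr, cc)]] -= 1.0   # unknown neighbor
            else:
                b[k] += img[rr, cc]            # known (Dirichlet) neighbor
    filled = img.copy()
    # np.argwhere and boolean assignment both use row-major order
    filled[mask] = np.linalg.solve(A, b)
    return filled
```

For a strip-shaped ROI inside its bounding rectangle, `mask` would mark the rectangle pixels that lie outside the strip. A dense solve is only practical for small ROIs; a sparse solver would be the natural replacement for larger regions.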

Enhancing ROIs
In histopathology, an ROI of R1 contains cancer cells, some parts of its tissue are more compact, and its capillary distribution is sparser, whereas an ROI of R0 contains no cancer cells, consists only of pancreatic tissue, and has a more abundant capillary distribution [15][16]. However, these differences are only qualitative differences in detail and are difficult to observe visually on CT images. Multi-resolution analysis methods are advantageous in local time-frequency analysis and are appropriate for deriving detail characteristics. Statistical analysis methods can usually derive representative mathematical descriptors. It can therefore be inferred that multi-resolution analysis methods and statistical analysis methods are appropriate here. They are two types of texture analysis methods that are frequently used in radiomics, including in recent radiomic studies related to pancreatic cancer [4][5][6][7][8][9][10]. Furthermore, to improve the performance of these texture analysis methods, the CAD system enhances the textures of the fitted ROIs before extracting texture features. The main purpose of texture enhancement is to highlight high-frequency contour information (detail information, that is, the portions where gray-levels vary more strongly or more quickly) while preserving low-frequency smooth information as much as possible. Traditional enhancement methods such as histogram equalization, integer-order differentials, and frequency enhancement filters increase contrast or highlight contours, but they lose much of the low-frequency texture information and usually sharpen contour information. In recent years, fractional differentials have been applied in medical image processing to compensate for this heavy loss of low-frequency information, which makes them an effective method for texture enhancement [17][18][19]. Based on the above, we considered the following 3 factors:
1. Wavelet transform is appropriate for detail analysis of an image, and its perfect inverse transform allows corrections of the transform coefficients to be reflected in the reconstructed image.
2. Fractional differential can enhance contours without sharpening edges.
3. From the histopathological analysis, the characteristics of the details can characterize R0 and R1 well.
We designed a texture enhancement method with reference to the Grünwald-Letnikov (G-L) fractional differential definition and wavelet transform [20][21]. The enhancement method is abbreviated as WF. Its steps are as follows:
Step 1: Decompose an ROI into 4 components using wavelet transform [22]: H (horizontal), V (vertical), and D (diagonal), which represent the high-frequency components, and A (approximate), which represents the low-frequency component. The approximate component can be decomposed again.
Step 2: Convolve each high-frequency component (including all high-frequency components in the decompositions of all levels) with a fractional differential operator M.
Step 3: Perform the wavelet inverse transform based on the convolution results of Step 2 and the approximate component of the last-level decomposition.
The wavelet inverse transform reconstructs the ROI, which is the enhanced ROI. In the WF method, the fractional differential operator M is constructed as follows: 2) Expand Eq. (3): given that h = 1 (unit interval), Eq. (4) can be derived.
[Due to technical limitations, this equation is only available as a download in the supplemental files section.]
We constructed a fractional differential operator named M based on the expanded coefficients of Eq. (4). Figure 7 shows the operator M. The operator M performs fractional differential operations in eight symmetric directions in a 5×5 neighborhood. The value c at the center position is an adjustable parameter called the compensation parameter. In experiments, the order v and the parameter c can be adjusted as appropriate. Figure 5 illustrates two enhancement examples using the WF method.
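To make the construction of the operator more concrete, the sketch below builds one plausible 5×5 mask and applies it to a subband. It assumes the first three coefficients of the standard G-L expansion with h = 1, namely 1, -v, and (v² - v)/2, placed along the eight directions, with the adjustable compensation parameter c at the center. Because Eq. (4) and Figure 7 are not reproduced here, the exact layout and any normalization are assumptions, and the function names `fractional_operator` and `convolve_same` are hypothetical.

```python
import numpy as np

def fractional_operator(v, c):
    """Build a 5x5 mask from G-L coefficients of order v (assumed layout):
    -v at distance 1 and (v*v - v)/2 at distance 2 along the eight
    directions, and the compensation parameter c at the center."""
    a1 = -v
    a2 = (v * v - v) / 2.0
    M = np.zeros((5, 5))
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            M[2 + dr, 2 + dc] = a1
            M[2 + 2 * dr, 2 + 2 * dc] = a2
    M[2, 2] = c
    return M

def convolve_same(band, kernel):
    """Same-size 2-D filtering with edge replication. The mask is
    symmetric under 180-degree rotation, so correlation and
    convolution give the same result here."""
    k = kernel.shape[0] // 2
    padded = np.pad(band, k, mode="edge")
    out = np.zeros(band.shape, dtype=float)
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            out += kernel[i, j] * padded[i:i + band.shape[0],
                                         j:j + band.shape[1]]
    return out
```

In the WF method, a mask of this kind would be applied to each of the H, V, and D subbands before the inverse wavelet transform, while the approximate component is left unchanged.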

Texture Analysis
Deep learning algorithms have made significant progress in image pattern recognition. However, these algorithms are limited by problems such as small samples and small targets [23][24]. Moreover, deep learning algorithms are poorly suited to the quantitative analysis of ROIs, which requires analyzing the relationships between quantitative data and clinical outcomes, or between quantitative data and histopathological characteristics.
Therefore, ROI-based radiomics is the main approach of CAD systems based on medical images.
As described above, multi-resolution analysis methods and statistical analysis methods are recommended for extracting texture features. Some similar studies also used these two types of methods [5][6][7][8][9][10]. We repeated these texture analysis methods in experiments to compare them with the method in this paper, in which we combined the two types of analysis methods to better describe the details.
We used a texture analysis method that combines wavelet transform and statistical methods (histogram, co-occurrence matrix, and run-length matrix). Reverse biorthogonal wavelets (rbio) are compactly supported biorthogonal spline wavelets for which symmetry and exact reconstruction are possible with FIR (finite impulse response) filters. An rbio wavelet was used in this research. The steps of feature extraction are as follows:
Step 1: Perform wavelet transform on an ROI (already fitted and enhanced); one decomposition derives 4 components, and a coefficient matrix uniquely expresses each component.
Step 2: [Due to technical limitations, this equation is only available as a download in the supplemental files section.]
Step 3: Extract features from the subband images using histogram, co-occurrence matrix, and run-length matrix.
Considering that a subband image is also small, the gray-levels of its pixels are rescaled before the statistical methods are applied.
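The run-length part of Step 3 can be sketched as follows. The code rescales a subband to a small number of gray-levels, builds a horizontal (0°) run-length matrix, and computes high gray-level run emphasis (HGRE) and short run high gray-level emphasis (SRHGE) using their commonly used textbook definitions. The number of gray-levels, the single direction, and the function names are illustrative assumptions; the exact rescaling formula used in this study is not reproduced here.

```python
import numpy as np

def quantize(band, levels=16):
    """Rescale a coefficient matrix to integer gray-levels 1..levels."""
    lo, hi = band.min(), band.max()
    if hi == lo:
        return np.ones(band.shape, dtype=int)
    q = np.floor((band - lo) / (hi - lo) * levels).astype(int) + 1
    return np.clip(q, 1, levels)

def run_length_matrix(q, levels):
    """Horizontal run-length matrix: entry (g-1, l-1) counts the runs
    of gray-level g that have length l."""
    rows, cols = q.shape
    rlm = np.zeros((levels, cols))
    for row in q:
        start = 0
        for j in range(1, cols + 1):
            if j == cols or row[j] != row[start]:
                rlm[row[start] - 1, j - start - 1] += 1
                start = j
    return rlm

def hgre_srhge(rlm):
    """HGRE weights each run by g^2; SRHGE weights it by g^2 / l^2."""
    g = np.arange(1, rlm.shape[0] + 1)[:, None]
    l = np.arange(1, rlm.shape[1] + 1)[None, :]
    n_runs = rlm.sum()
    hgre = (rlm * g ** 2).sum() / n_runs
    srhge = (rlm * g ** 2 / l ** 2).sum() / n_runs
    return hgre, srhge
```

Averaging over several directions would require one run-length matrix per direction, with the resulting feature values averaged afterwards.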

Feature Reduction and Classification
Reducing the number of features can usually improve classification performance. We used principal component analysis (PCA) for feature reduction and limited the number of features to reduce overfitting. Empirically, an appropriate number of features is 1/5 or 1/10 of the number of samples, and a linear classifier allows for more features.
Support vector machine (SVM) [25] is widely used due to its outstanding performance in pattern recognition problems with small samples. To reduce overfitting, we used a linear kernel and leave-one-out cross-training and testing. A linear kernel-based SVM allows more features without easily overfitting. In the vast majority of cases, especially in classification problems with small samples, the model evaluated in each leave-one-out round is close to the model that would be obtained by training on the full data set. Thus, the evaluation results of the leave-one-out method are often considered more accurate [26].
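The evaluation scheme can be sketched in a few lines of NumPy. In this sketch, PCA is refitted on the training fold of every leave-one-out round so that the held-out sample never influences the projection, and the linear SVM is trained in its primal form by plain subgradient descent on the hinge loss. The solver, the hyperparameters (`lam`, `lr`, `epochs`, `n_components`), and the function names are assumptions chosen for self-containment; they are not the settings used in this study, and a library SVM implementation would normally be preferred.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal axes (as rows)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def train_linear_svm(Z, y, lam=0.01, lr=0.1, epochs=300):
    """Primal linear SVM (hinge loss plus L2 penalty) trained by
    subgradient descent. Labels y must be -1 or +1."""
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(epochs):
        active = y * (Z @ w + b) < 1          # samples violating the margin
        grad_w = lam * w - (y[active][:, None] * Z[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def leave_one_out_accuracy(X, y, n_components=2):
    """Hold out each sample once; fit PCA and the SVM on the rest."""
    correct = 0
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        mean, axes = pca_fit(X[train], n_components)
        w, b = train_linear_svm((X[train] - mean) @ axes.T, y[train])
        score = ((X[i] - mean) @ axes.T) @ w + b
        correct += int(np.sign(score) == y[i])
    return correct / len(y)
```

With 86 samples, the empirical 1/10 to 1/5 rule quoted above would correspond to roughly 9 to 17 retained components.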

Results
For comparison, we also used some other texture analysis methods from similar studies, and applied the same PCA-based feature reduction method and linear SVM-based classification method to them. These methods are shown in Table 1.
Considering that an ROI is small, the experiments performed 1-level wavelet decomposition and set the distances of the co-occurrence matrix to 1 and 2. The feature values of the 4 directions (0°, 45°, 90°, and 135°) of a co-occurrence matrix were averaged, as were those of a run-length matrix. Wavelet transform must be performed on rows and columns. Therefore, before applying the WT method and the WT-HCR method, we filled the ROIs into valid matrices using interpolation. Linear interpolation was applied first, and the remaining missing values were then filled using nearest-neighbor interpolation. Reference [7] used multiple texture analysis methods and obtained the best performance using the ACM methods. Thus, we used the ACM-D method and the ACM-M method separately. The LD-WF method used a reverse biorthogonal wavelet, and rbio2.8 was selected through multiple experiments. Figure 8 illustrates two examples of decomposing ROIs using rbio2.8.
The results of a binary classification problem can be expressed with a confusion matrix. R1 is used as the positive class and R0 as the negative class. Table 2 shows the experimental results. The LD-WF method achieves the best classification performance, with an accuracy of 84.88% and an AUC of 0.8641, followed by the LOG-GH method and the CTM method. Although the accuracy of CTM is lower than that of LOG-GH, its AUC is higher. The ROC (receiver operating characteristic) curve and the AUC (area under the ROC curve) are powerful indicators for measuring a binary classification model. They illustrate the diagnostic ability of the classifier as its discrimination threshold changes. To investigate how well texture features discriminate between R0 and R1, we performed Mann-Whitney U-tests on the texture features extracted with the LD-WF method. Table 3 shows the features with p-value ≤ 0.05, which usually means that there are statistically significant differences between the two types of samples (R0 samples and R1 samples). It shows that the middle and bottom ROIs present more differences in texture features, and that the diagonal subband image expresses more characteristic differences in details. The p-values of F4 and F6 are ≤ 0.01, which means that the differences between the two types of samples are highly statistically significant.
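The two statistics used in this section are closely related: the AUC of a scoring classifier equals the Mann-Whitney U statistic of the positive-class scores against the negative-class scores, divided by the product of the two sample sizes. The standard-library sketch below computes both. Its p-value uses the plain normal approximation without tie or continuity correction, which is a simplifying assumption; the function names are hypothetical, and a statistics package should be preferred for reporting.

```python
import math

def mann_whitney_u(x, y):
    """U statistic of x versus y (average ranks for ties) and a
    two-sided p-value from the normal approximation, without tie
    or continuity correction (a simplifying assumption)."""
    pooled = sorted((v, k) for k, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg = (i + j) / 2.0 + 1.0              # average rank of the tie group
        for t in range(i, j + 1):
            ranks[pooled[t][1]] = avg
        i = j + 1
    n1, n2 = len(x), len(y)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mu) / sigma
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u1, p_two_sided

def auc_from_scores(pos_scores, neg_scores):
    """AUC as the normalized U statistic of positive versus negative scores."""
    u1, _ = mann_whitney_u(pos_scores, neg_scores)
    return u1 / (len(pos_scores) * len(neg_scores))
```

Here `pos_scores` would hold the classifier outputs for R1 samples and `neg_scores` those for R0 samples. For a single texture feature, `mann_whitney_u` would instead be called with the R1 and R0 feature values.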

Discussion
Radiomics uses computer methods such as computer vision and machine learning to process digital medical images, which makes it possible to mine in depth the heterogeneous tissue-level and molecular-level data contained in medical images such as CT images [2][27][28]. In CT imaging, X-rays penetrate different media with different attenuations to form different gray-levels. Thus, grayscale patterns in CT images should be able to reflect pathological changes in the body. An R0 resection margin does not contain pancreatic head adenocarcinoma cells, whereas an R1 resection margin does. From histopathological analysis, an R1 resection margin contains a large amount of normal pancreatic tissue and some tumor tissue, and its capillary distribution is sparser than that of an R0 resection margin; in contrast, an R0 resection margin contains only normal pancreatic tissue, and its capillary distribution is more abundant. Thus, characteristics of internal details can better characterize the two types of samples. Analogous to wavelet transform, LOG-GH is also a multi-scale analysis method. Both types of methods are suitable for characterizing details. The classification results show that the multi-resolution or multi-scale analysis methods perform better.
In addition, it is necessary to address issues such as the irregular strip shape of the ROIs and the atypical manifestation of details (which are difficult to distinguish macroscopically). The CAD system in this research used the LD-WF method to process ROIs, that is, it fitted the ROIs and enhanced their textures, and then combined wavelet transform and statistical methods to extract descriptors of detail characteristics. The experimental results indicated that such processing markedly improved classification performance.
We expect that some texture features should be able to reflect these differences. Three features were selected based on the ascending order of p-values. Table 3 shows these three features in bold: F4, F6, and F9. To test whether the feature values are larger or smaller in R1, right-tailed hypothesis tests based on the Wilcoxon rank sum test were performed on F4, F6, and F9, where the alternative hypothesis states that the median of the R1 samples is greater than the median of the R0 samples. Table 4 shows the results of the right-tailed hypothesis tests.
As for feature F4, the test result in Table 3 indicates that points with larger oscillations appear more continuously in ROIs of R1 than in ROIs of R0. This should be associated with the fact that the ROIs of R1 contain normal pancreatic tissue and cancer tissue, while the ROIs of R0 contain only normal pancreatic tissue.
As for feature F6, it is similar to F4. Short run high gray-level emphasis (SRHGE) is a supplement to high gray-level run emphasis (HGRE), indicating that points with larger oscillations (fine texture) appear more continuously.
As for feature F9: 1) the meaning of the diagonal subband image has been explained above; 2) the third moment of the histogram measures skewness, and higher skewness means a greater degree of asymmetry; 3) the test result for F9 in Table 3 indicates that the degree of asymmetry in R1 is greater than that in R0. This should again be associated with the fact that the ROIs of R1 contain normal pancreatic tissue and cancer tissue, while the ROIs of R0 contain only normal pancreatic tissue. Because R0 has only normal pancreatic tissue, the structural changes on the diagonal component are relatively more uniform and more symmetric.
From the above analysis, it can be inferred that, for R0 and R1, there are associations between histopathological characteristics and texture features. The texture features with statistically significant differences are markers associated with discriminating between R0 and R1. They may potentially enhance physicians' ability to differentially diagnose R0 and R1. This is valuable for future radiomic studies of efficacy evaluation and prognosis. In addition, the classification results indicate that the CAD system can play an important guiding role in the differential diagnosis of R0 and R1.
This study has some limitations and deficiencies. First, it was a retrospective study in a single institution; the patient population and imaging methods were basically homogeneous and selection bias may exist, which makes it difficult to generalize the results to other institutions. Second, although the ROIs were fitted to rectangles, the pixel size of an ROI is still small. Thus, better fitting methods are worth exploring. Third, sensitivity still needs to be improved.

Conclusions
By analyzing histopathological characteristics of the two types of resection margins and some deficiencies of ROIs, we designed a CAD system based on portal-venous CT images to identify whether surgery was conducted

Consent for publication
Not applicable.

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request, pending the approval of the institution and the trial/study investigators who contributed to the dataset.
Figure 1 Framework of the CAD system
Figure 2 Choose three CT slices [11]
Figure 3