Automatic renal carcinoma biopsy guidance using forward-viewing endoscopic optical coherence tomography and deep learning

Abstract Percutaneous renal biopsy (PRB) is commonly used for kidney cancer diagnosis. However, current PRB remains challenging in sampling accuracy. This study introduces a forward-viewing optical coherence tomography (OCT) probe for differentiating tumor and normal tissues, aiming at precise PRB guidance. Five human kidneys and renal carcinoma samples were used to evaluate the performance of our probe. Based on their distinct OCT imaging features, tumor and normal renal tissues can be accurately distinguished. We examined the attenuation coefficient for tissue classification and achieved 98.19% tumor recognition accuracy, but underperformed for distinguishing normal tissues. We further developed convolutional neural networks (CNN) and evaluated two CNN architectures: ResNet50 and InceptionV3, yielding 99.51% and 99.48% accuracies for tumor recognition, and over 98.90% for normal tissues recognition. In conclusion, combining OCT and CNN significantly enhanced the PRB guidance, offering a promising guidance technology for improved kidney cancer diagnosis.


Introduction
Renal cancer is a disease characterized by the uncontrolled growth of kidney cells, resulting in the formation of tumor tissue.The lifetime risk of renal cancer was estimated to be ~ 2.02% in males and ~ 1.03% in females [1].American Cancer Society projected approximately 81,800 new cases of renal cancer in the United States in 2023, including 52,360 males and 29,440 females, and 14,890 (9,920 males and 4,970 females) would die from it [1].Renal cell carcinoma (RCC) is the most prevalent type of renal cancer, constituting approximately 95% of all renal cancer cases, and it has shown a consistent rise over the past few decades [2,3].Accurate identi cation and diagnosis of renal tumor malignancy is critical for devising effective treatment plans.At present, percutaneous renal biopsy (PRB) is the most commonly used procedure to extract renal tissue for pathological analysis [4].Since being proposed in 1951, PRB has been widely applied in the diagnosis and prognosis of kidney diseases [5].It has been considered as an effective method for the assessment of renal tumors from patients with RCC.During the PRB procedure, the patient is placed in a prone position.An exploratory needle is then inserted into the kidney to establish a path for the subsequent insertion of the biopsy needle [6].The biopsy needle is employed to extract a tissue sample from the tumor.The obtained tissue sample is processed, analyzed, and classi ed as benign, malignant, or nondiagnostic for patients with RCC [7].
Despite being widely used for kidney disease diagnosis and renal tumor tissue evaluation, PRB still faces challenges in the tissue localization.It has been reported that even with the assistance of a radiologist, approximately 92% of PRBs required two or more attempts in order to accurately extract the targeted tissue [8].More insertion attempts raise the likelihood of complications, including hematoma and infection [9,10].Therefore, accurate guidance of needle insertion is of great importance for effective and safe renal carcinoma biopsy.Conventional imaging modalities have been employed for PRB needle guidance.Ultrasound has been harnessed to assist in needle guidance, leading to a signi cant reduction in the incidence of severe PRB complications over the past few decades [11,12].Computed tomography (CT) has also been utilized in the diagnosis of metastatic lesion during percutaneous biopsy [13].Three-dimensional cone-beam CT guidance has been documented as bene cial for the secure and effective determination of needle trajectory, as well as for performing biopsies on small renal masses [14].
Moreover, the guidance of PRB through magnetic resonance imaging (MRI) exhibits promising outcomes in precisely targeting the renal tumor and accurately locating the PRB needle tip, particularly in obese patients [15].Although these macroscopic imaging methods, including ultrasound, CT and MRI, are effective in needle trajectory determination, they cannot accurately identify tissue types ahead of the needle tip due to their limited spatial resolutions [16].As a result, it was reported that ~ 10% − 20% of the biopsy results were non-diagnostic due to the limitation of imaging guidance of tumor location [16].
Hence, there is an unmet need for an imaging method capable of accurately identifying tissue types ahead of the PRB needle.
Optical coherence tomography (OCT) is a well-established imaging technique with micron-level spatial resolution [17].Therefore, OCT has the potential for providing enhanced ne-scale visualization during PRB needle guidance.Moreover, OCT has already demonstrated feasibility in diagnosing renal masses, with the ability to differentiate renal tumor tissue from normal renal tissue based on distinct attenuation coe cients in OCT images [18].OCT can also provide real-time imaging [19].Therefore, we hypothesized that OCT can be used in the real-time guidance of the PRB needle by detecting and identifying the renal tumor tissue.To enable tissue imaging ahead of the PRB needle, we have developed a forward-view OCT probe that incorporates gradient-index (GRIN) lenses functioning as a forward-viewing endoscope.Our previous work successfully demonstrated the application of endoscopic OCT in percutaneous nephrostomy (PCN) guidance, accurately identifying various kidney tissues within pig models [20,21].
Hence, the OCT probe system holds the promise in PRB needle guidance and renal tumor recognition.
Because visually identifying tissue from OCT images incurs a signi cant learning curve for radiologists and PRB operators, we have developed automatic imaging processing techniques, including tissue classi cation using attenuation coe cient and deep learning methods.Mapping optical attenuation coe cients has already been employed in distinguishing between cancerous and non-cancerous tissues [22].Given that cancerous tissue tends to exhibit higher density in the imaging results, its attenuation coe cient typically surpasses that of normal tissues.Therefore, attenuation coe cient can be a promising feature used for tumor tissue recognition during PRB needle guidance.Furthermore, deep learning has been demonstrated to be an effective approach in tissue classi cation and recognition for surgical guidance [21,23,24].In this study, we have applied convolutional neural networks (CNNs) for automatic identi cation of normal renal tissue and kidney tumor.The performance of these methods was benchmarked for renal tissues classi cation and tumor diagnosis.

Experimental setup
We constructed our forward-viewing optical coherence tomography (OCT) probe based on a swept-source OCT (SS-OCT) system [21,24,25].The light source utilized was a 1,300 nm swept-source laser with a bandwidth of 100 nm.The scanning rate of our system reached 200 kHz.As illustrated in Fig. 1, the light was initially divided using a ber coupler (FC) into two distinct paths: one directed with 97% of power towards a circulator for the OCT imaging, and the other with 3% power directed to a Mach-Zehnder interferometer (MZI) to initiate the sampling process.Upon traversing the circulator, the laser underwent another division through another FC into the reference arm and sample arm.Subsequently, the light re ected by the mirror in the reference arm and the light backscattered by the tissue in the sample arm converged, generating an interference signal.This signal was then denoised using a balanced detector (BD), then acquired by the data acquisition (DAQ) board and processed and visualized on a computer in the end.Polarization controllers (PC) were employed in both arms to mitigate background noise.
In PRB procedures, 14 gauge (G) or 16 G biopsy needles are commonly used [26].To assess feasibility, we employed GRIN lenses with a diameter of 1.3 mm and length of 138.0 mm, allowing them to be accommodated within the 14 G or 16 G biopsy needles.These GRIN lenses possess a refractive index distribution that varies perpendicular to the optical axis, facilitating effective image transmission from one end to the other [27].
In our con guration, we positioned a GRIN lens in the sample arm ahead of the OCT scanning lens for endoscopic imaging.Additionally, another same GRIN lens was situated in the reference arm to counteract dispersion.The proximal end of the GRIN lens in the sample arm was positioned at the focal point of the scanner lens, maximizing the delivery of light to the sample.Our system features an axial resolution of approximately 11 µm and a lateral resolution of around 20 µm.It provides a sensitivity of 92 dB and covers a eld of view (FOV) of approximately 1.25 mm in diameter, rendering it well-suited for our intended application.

Data acquisition
The study was approved by the University of Oklahoma Institutional Review Board (Study number: 12462).Written informed consent was obtained by Lifeshare of Oklahoma.Patient demographics were collected upon obtaining consent.For our dataset, we utilized ve human kidney samples and ve renal carcinoma samples.OCT images of carcinoma tissue and different normal tissues were imaged by our OCT probe.To keep the organ as fresh as possible, the kidney was preserved by hypothermic machine perfusion (HMP), and we started the experiment as soon as the sample was acquired.
Figure 1(B) illustrates one kidney sample with carcinoma, clearly showcasing the tumor tissue alongside other normal renal tissues including cortex, medulla, calyx, fat, and pelvis.To emulate a practical biopsy procedure during the experiment, we carefully inserted the OCT probe into different renal tissue types, applying controlled force to compress the tissue.OCT image acquisition was successfully performed in the nal.
For quantitative identi cation of tumor tissue, we utilized both conventional methods, involving the calculation of attenuation coe cients, and deep learning techniques for automated recognition of renal tissue types.To train the machine learning model, we obtained a dataset of 10,000 OCT images from the tumor and ve different normal renal tissues, respectively (Fig. 1(C)).The size of all output images was set as lateral X depth: 650×1050 (X×Z: 1.30 mm×2.10 mm), so each result covered the same FOV for attenuation coe cient method and CNN process (Fig. 1(D)).

Attenuation coe cient method for tissue recognition
Our OCT ndings provided in-depth structural insights into renal tissues.The attenuation coe cient serves as a crucial parameter, signifying the decrease in signal intensity observed in OCT images as depth increases [28].In structural OCT images, the imaging intensity re ects the backscattering level of the tissue sample, so different attenuation coe cients demonstrate different tissue features and can be used for tissue classi cation.The use of attenuation coe cient for cancer detection has been extensively documented [29][30][31].Compared to normal tissues, cancerous tissue is typically denser thus its imaging signal value attenuated faster in vertical direction and represented by higher attenuation coe cient.The attenuation coe cient of each A-scan can be extracted and used to generate a spatial attenuation coe cient distribution, which helps visually distinguish tumor and normal tissues [22].
Here we used Beer-Lambert law to describe the structural OCT intensity signals in depth as follows: 1 where represents the initial incident light intensity, is the attenuation coe cient value which represents the decay rate, and is the depth [32].Thus, can be described as: 2 is the ratio of recorded OCT intensity in depth to the initial incident light intensity.The intrinsic optical attenuation coe cient was calculated according to the slope of OCT intensity in a 250-pixel depth window and then averaged in a 200-pixel width region as shown in Fig. 1(D).

CNN methods for tissue identi cation
Two CNN architectures, ResNet50 and InceptionV3, were evaluated.ResNet50 has ~ 25.6 million parameters and InceptionV3 has ~ 22 million parameters.We used a nested 4-fold cross-validation and 5fold cross-testing procedure [21], in which each iteration rotated three subjects to the training dataset, one subject to the validation dataset, and one subject to the testing dataset.This division enabled us to obtain a more accurate estimation of the generalization error, as every subject was included in the testing dataset once.Early stopping was implemented with a patience of 20.The optimizer used for the training was stochastic gradient descent with Nesterov accelerated gradient, utilizing a momentum value of 0.9 and a learning rate of 0.01.Additionally, the learning decay rate was set to be 0.1.The average epoch count from the cross-validation folds was subsequently employed to train the model intended for use on the test dataset.The test performances of the obtained models were benchmarked using the 5-fold crosstesting.The classi cation performance was measured using four metrics, including accuracy, precision, recall, and F 1 score as de ned below: where TP is true positive, FP is false positive, TN is true negative, and FN is false negative.
To interpret the prediction of deep learning models [33,34], GradCAM [35] was used to generate pixel saliency heatmaps.One of the ve subjects' images were used to generate the heatmaps for both CNN models: InceptionV3 and ResNet50.For example, all images of the cortex from the chosen subject were utilized.We calculated the sum of the corresponding pixel points of the images, then the sum was divided by the number of total images used, resulting in an average heatmap image of cortex.Same practice applied to the rest of the tissue types.
3 Results   In the OCT results, there is a bright line on the top (pointed by a yellow arrow) which corresponds to the surface of the GRIN lens.The intensity signal distributions of both cortex and medulla are relatively homogeneous, although the imaging depth of medulla surpasses that of cortex.Calyx exhibits alternately shaded stripes, a result of its transitional epithelium and brous tissue composition.Fat presents the darkest pattern, accompanied by bright dots characteristic of adipocytes.We injected synthetic urine into the pelvis and tied the ureter during the experiment and inserted the probe into the pelvis for imaging.As an empty space for collecting urine, the pelvis has no tissue signal under the GRIN lens surface.The renal tumor, being the pivotal tissue type for identi cation during PRB, showcases the shallowest imaging depth while presenting the highest brightness among all OCT results.Its structural features exhibit irregularities.The corresponding histology results were also presented in Fig. 2.These tissues showed different tissue structures and distributions and correlated well with their OCT results.Different normal renal tissues and tumor tissues can be categorized based on their distinctive OCT imaging features.As a result, our endoscopic OCT probe has the potential to assist in the recognition of tissue types ahead of the needle during PRB procedures.

Classi cation using attenuation coe cient method
We rst assessed the performance of attenuation coe cient method in distinguishing renal tumor and normal tissues.The attenuation coe cient did not work for the pelvis, because the values of most pixels were zero and the logarithm could not be calculated.Therefore, only tumor tissue and other four normal tissue types were tested using the attenuation coe cient method.
This study involved the processing of a total of 50,000 OCT images (after removing pelvis category), with each tissue type contributing 10,000 images.The performance of the results was benchmarked using the cross-testing procedure.Four kidneys were randomly selected as the training dataset, while the remaining kidney served as the testing dataset.To calculate the attenuation coe cient, the region of interest (ROI) window was selected to be 200×250 pixels for each OCT image.The ROIs were chosen from the central sections of the images and the coordinate of the area is selected in the top middle position which includes most tissue information (Fig. 1(D)).The mean value of all the attenuation coe cient results from every vertical direction was used as the attenuation coe cient level for each OCT image.
Fig. 3 displays the attenuation coe cient distributions for the ve tissue types (four normal renal tissues and tumor tissue).Firstly, considering cortex, medulla, calyx, and fat are all normal tissues, these four types were initially consolidated into a single classi cation.In this testing, all the tumor tissue images from 5 samples (50,000 images in total) were utilized.To make the number of normal tissue images equal to the number of tumor images, we randomly selected 2,500 images from each normal tissue in every sample (2,500 images × 4 tissue types × 5 samples = 50,000 images).In general, the tumor tissue and the normal tissues displayed distinctive distribution characteristics, thereby enabling a clear differentiation between tumor and normal tissues as shown in Fig. 3(A).Normal distributions were tted to the attenuation coe cient distributions as shown in Fig. 3(A).To quantify the separation of the tumor from normal tissues, the cross point (blue dot) of two curves was chosen as threshold and used to evaluate the accuracy.The receiver operating characteristic (ROC) curves were shown between tumor and normal renal tissues in Fig. 3(B).The area under the ROC curves (AUC) (Tumor_Normal) was 99.95%, which validated the tumor recognition accuracy based on attenuation coe cients.We then evaluated classi cation of the renal tissue types by using 10,000 OCT images from 5 samples (50,000 images in total).The attenuation coe cient distributions were shown in Fig. 3(C).To quantify the separation of the tumor from each normal tissue (i.e., tumor vs cortex, tumor vs medulla, tumor vs calyx, and tumor vs fat), the cross point (blue dot) of two curves was again chosen as threshold and used to evaluate the accuracy.The AUCs for differentiating the tumor tissue from each normal renal tissue were all over 99.85% (Fig. 3(B)), which further proved the tumor recognition capacity of using attenuation coe cients.
However, different normal renal tissues were di cult to classify using attenuation coe cient method.The AUCs were 100% for cortex vs. medulla, 88% for cortex vs. calyx, 81% for cortex vs. fat, 86% for medulla vs. calyx, 95% for medulla vs. fat, and only 66% for calyx vs. fat (Fig. 3(D)).The confusion matrix for multi-class classi cation was further shown in The needle's precise placement can enhance surgical e ciency and overall effectiveness.The present results primarily contributed to tumor recognition, but the performance in identifying other normal kidney tissues still needed improvement.In particular, the attenuation coe cient methods misclassi ed over 500 images of calyx as tumor tissue, causing potential sampling errors.

Tissue recognition results using CNN
The recognition of kidney tissues using the ResNet50 and InceptionV3 CNNs was evaluated using nested cross-validation and cross-testing.The ROC curves and pixel saliency heatmaps were shown in Fig. 4.
The confusion matrixes and performance metrics were shown in Table 2.
The ResNet50 CNN yielded a tumor recognition accuracy of 99.51%.For normal tissue recognition, the prediction accuracies were 98.96% for cortex, 99.78% for medulla, 99.09% for calyx, 99.27% for fat, and 99.95% for pelvis.97.72% of the misclassi cations of tumor samples were erroneously categorized as cortex.None of cortex, medulla and fat images were predicted to be tumor, and 1.48% of the calyx images was falsely predicted as tumor.The InceptionV3 CNN reached 99.48% tumor prediction accuracy.The accuracies for normal tissue classi cation were 99.58% for cortex, 99.88% for medulla, 99.27% for calyx, 99.65% for fat, and 99.95% for pelvis.Calyx recognition has the lowest accuracy, and 0.02% of tumor samples were erroneously categorized as calyx, which is different from the ResNet50 model.Both ResNet50 and InceptionV3 achieved >99.9% ROC AUC for all tissues.
InceptionV3 outperformed ResNet50 in overall precision for the classi cation of the normal renal tissues, while with a slightly lower precision in the tumor classi cation.All normal tissues and tumor results using InceptionV3 have higher recalls than ResNet50 except for calyx.
All computations were conducted on a server equipped with two NVIDIA RTX A6000 graphics cards.The cross-validation required an average of 1,435 minutes per fold.The average inference time over 320 images was 18 milliseconds per image for ResNet50 and 11 milliseconds per image for InceptionV3.
The average pixel importance heatmaps of all the images from the two models were shown in Fig. 4.
These heatmaps highlighted the regions important for the models to distinguish different tissue types.
The narrow area below the GRIN lens surface line was important for the tumor recognition.The important area for the pelvis recognition included a narrow part on the top and a wide area on the bottom.Heatmaps of the other four tissues all highlighted the top half of the images with strong signals.The important areas used by ResNet50 were sparser and wider than those used by InceptionV3.This could be attributed to the fact that ResNet50's feature map was more focused and localized within each residual block than InceptionV3, so it could capture ne-grained details in the images in pixel level.

Discussion
Although PRB is the most commonly performed procedure for evaluating renal tissue, it remains a challenge even for experienced urologists to precisely extract tumor tissue with minimal sampling error.In this study, we tested the feasibility of using a forward-viewing OCT probe for RCC detection on ex-vivo human kidney for the rst time.Five human kidneys with renal carcinoma were utilized in our experiments to validate our approach.The renal tumor tissue and different normal renal tissues, including cortex, medulla, calyx, renal fat, and pelvis were all included to evaluate the recognition accuracy of our system.We applied the probe on these tissues and obtained their OCT images.The OCT images of different renal tissues could be classi ed based on their distinct imaging characteristics.Moreover, the OCT probe's dimensions were designed to t inside the PRB needle's hollow bore, enabling seamless integration without introducing any additional invasiveness.Under clinical settings, our OCT probe can image the tissue in front of the PRB needle in real time, effectively identifying the tissue type at the needle's tip.
To automate the process of tissue identi cation, we utilized and compared two classi cation methods in our study: attenuation coe cient and deep learning.Remarkably, ResNet50 and InceptionV3 achieved tumor recognition rates of 99.51% and 99.48%, respectively, both of which surpassed the performance of the attenuation coe cient method.Equally impressive was the accuracy in normal tissue recognition, with rates consistently exceeding 99% across all tissue types in both models (except cortex prediction using ResNet50 of 98.96%).ROC curves for both models showed AUC values consistently exceeding 99.9%.This observation underscores the effectiveness of the CNN-based approach for the classi cation of different normal renal tissues compared to the attenuation coe cient method.The outstanding performance of these models can be attributed, in part, to their substantial parameter sets.ResNet50 and InceptionV3 feature 25.6 million and 22 million parameters, respectively, in contrast to the single parameter employed by the attenuation coe cient method.To further expedite processing times, future experiments could be conducted for the use of smaller architectures without compromising performance.

Conclusion
We tested the feasibility of employing an endoscopic OCT probe for the diagnosis of renal cancer.Renal tumor and normal kidney tissues can be distinguished based on their OCT images.To automate this process, we utilized both attenuation coe cient and deep learning methods for renal tissue recognition.The attenuation coe cient enabled the differentiation between tumor and normal tissues, while the CNN method further classi ed tumor tissue as well as different normal renal tissues accurately.By combining the deep learning approach with the forward-viewing OCT probe, we envision the creation of a precise imaging guidance tool for the PRB procedure.Moreover, our system can complement the established clinical PRB guiding methods, including ultrasound, uoroscopy, and CT by providing high-resolution images in front of the PRB needle in real time.

Figure 2
Figure 2 illustrates the OCT imaging results obtained through our OCT probe, showcasing both renal carcinoma tumor and ve distinct normal renal tissues.Two dimensional (2D) cross-sectional OCT images of these tissues demonstrate different structural features.The corresponding histology results are also shown in Fig. 2.

F
Accuracy = (T P + T N )/(T P + F P + T N + F N ) P recision = T P /(T P + F P )Recall = T P /(T P + F N )

(
A) Schematic of the forward-viewing OCT probe.FC: ber coupler;PC: polarization controller; MZI: Mach-Zehnder interferometer; BD: balanced detector; DAQ: Data acquisition.(B) Picture of the human kidney with tumor.(C) Data acquisition procedure.(D) Two classi cation methods used for tissue classi cation.Page 16/18

Figure 2 OCT
Figure 2

Table 1 .
(A) Confusion matrix of the averaged ve kidneys.(B) Prediction results: Accuracy; Precision; Recall; and F1 Score of different tissue types.
(A), and the corresponding prediction results were shown in Table1(B).Tumor demonstrated high recognition performance, with 98.19% accuracy, 91.85% precision, 99.83% recall, and 95.67% F 1 score.However, the normal tissue recognition was challenging, especially tumor, cortex, medulla, calyx, renal fat, and pelvis, respectively.Our study evaluated the attenuation coe cient for identi cation of tumor and normal tissues.Discrimination of tumor against other normal tissues was promising with the accuracy of 98.19%.While the results were generally favorable, misrecognition still happened.Additionally, relying solely on the attenuation coe cient proved challenging when it came to distinguishing between different types of normal renal tissues.The classi cation accuracy results were not satisfactory, particularly for the calyx prediction accuracy which fell below 80%.Accurate identi cation of normal renal tissues prior to biopsy needle insertion played a pivotal role in precisely tracking the needle's location and avoiding sampling error.To address this critical need, we further evaluated the deep learning methods in our study.Speci cally, we employed two well-established CNN architectures, ResNet50 and InceptionV3, both of which demonstrated robust performance in tumor recognition and the classi cation of various normal renal tissues.
[37]attenuation coe cient method has been widely used in OCT-based cancer diagnosis, including breast cancer[30], prostate cancer [36], bladder cancer[31], colon cancer[37], etc. Renal mass diagnosis using OCT was also reported, and it exhibited highly promising outcomes in predicting RCC[38].For the classi cation task, we imaged 10,000 OCT images from renal