Improvement of Peripheral Nerve Visualization Using a Deep Learning-based MR Reconstruction Algorithm

Objective: To assess a new deep learning-based MR reconstruction method, “DLRecon,” for clinical evaluation of peripheral nerves. Methods: Sixty peripheral nerves were prospectively evaluated in 29 patients (mean age: 49±16 years, 17 female) undergoing standard-of-care (SOC) MR neurography for clinically suspected neuropathy. SOC-MRIs and DLRecon-MRIs were obtained through conventional and DLRecon reconstruction methods, respectively. Two radiologists evaluated the blinded images in random order for outer epineurium conspicuity, fascicular architecture visualization, pulsation artifact, ghosting artifact, and bulk motion. Results: DLRecon-MRIs were likely to score better than SOC-MRIs for outer epineurium conspicuity (OR=1.9, p=0.007) and visualization of fascicular architecture (OR=1.8, p<0.001) and were likely to score worse for ghosting (OR=2.8, p=0.004) and pulsation artifacts (OR=1.6, p=0.004). There was substantial to almost-perfect inter-reconstruction method agreement (AC=0.73-1.00) and fair to almost-perfect interrater agreement (AC=0.34-0.86) for all features evaluated. DLRecon-MRIs had improved interrater agreement for outer epineurium conspicuity (AC=0.71, substantial agreement) compared to SOC-MRIs (AC=0.34, fair agreement). In >80% of images, the radiologists correctly identified an image as SOC- or DLRecon-MRI. Discussion: Outer epineurium and fascicular architecture conspicuity, two key morphological features critical to evaluating a nerve injury, were improved in DLRecon-MRIs compared to SOC-MRIs. Although pulsation and ghosting artifacts increased in DLRecon images, image interpretation was unaffected.


Introduction
MR neurography is challenging due to the small size of peripheral nerves, some less than 1-2 mm in maximal caliber, [1] and the need to evaluate both the outer epineurium and inner fascicular architecture. [2] High spatial resolution (<0.5 mm), cross-sectional acquisition is therefore important. In the extremities, 0.3 mm in-plane resolution is currently achieved at 3.0 Tesla field strength. [3] However, realizing this resolution within clinically reasonable scan times (<6 minutes) and with adequate signal-to-noise ratio (SNR) is challenging. [4] Acceleration techniques to reduce scan time or increase spatial resolution include parallel imaging [5,6] and compressed sensing, [7] and are facilitated via high channel surface coils to improve SNR. However, acceleration methods incur SNR penalties due to under-sampling and noise amplification, [6] which may obscure relevant image details when high acceleration rates (beyond ~2x) are used. A high nerve-to-muscle contrast-to-noise ratio is also required, as nerves often course adjacent to or within muscles with similar contrast.
Another approach to obtaining diagnostic quality images within reasonable scan times is to leverage artificial intelligence (AI) for denoising, [8] super-resolution, [9] artifact reduction, and/or reconstruction of under-sampled data. [10] Broadly, AI-enhanced image reconstruction algorithms can operate alongside conventional algorithms, including parallel imaging or partial Fourier, to compensate for noise amplification or blurring. Alternatively, AI has been used to reconstruct images directly from fully sampled or under-sampled k-space data. One such AI method, the deep learning-based AIR Recon DL (DLRecon), [11] operates on raw image data alongside conventional parallel imaging algorithms during image reconstruction. DLRecon is specifically designed to perform both image denoising and Gibbs ringing removal, ultimately producing images with high SNR and sharp edges. [11] The DLRecon AI was previously trained on a curated database of over 4 million training iterations and applies to two-dimensional (2D) imaging of any anatomy, contrast weighting, or coil configuration. [11] This technique was recently applied to thin-slice pituitary gland MRI [12] and late gadolinium enhancement in myocardial scar quantification, [13] where DLRecon demonstrated similar or better diagnostic performance relative to the conventional MR image reconstruction method.
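The truncation (Gibbs) artifact mentioned above can be illustrated with a one-dimensional analogy. The sketch below is not the DLRecon algorithm; it only assumes that a sharp tissue boundary behaves like a square wave and that finite k-space sampling behaves like keeping a limited number of Fourier harmonics. The function names are hypothetical. It suggests why acquiring more k-space narrows the ringing but does not reduce its peak amplitude, which is one reason a dedicated ringing-removal step can be attractive.

```python
import math

def square_wave_partial_sum(x, n_terms):
    """Fourier partial sum of a unit square wave using the first n_terms odd harmonics."""
    return (4 / math.pi) * sum(
        math.sin((2 * j - 1) * x) / (2 * j - 1) for j in range(1, n_terms + 1)
    )

def peak_value(n_terms, n_points=4000):
    """Largest value of the partial sum on (0, pi); the true signal there equals 1."""
    xs = [math.pi * i / n_points for i in range(1, n_points)]
    return max(square_wave_partial_sum(x, n_terms) for x in xs)

# Keeping more harmonics (analogous to sampling more of k-space) moves the
# overshoot closer to the edge, but its height stays roughly constant.
for n_terms in (10, 40):
    print(n_terms, round(peak_value(n_terms), 3))
```

In both cases the peak stays well above the true value of 1, by roughly 9% of the edge jump, which is the classical Gibbs overshoot.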
Given limitations of conventional image reconstruction methods for MR neurography and previously reported concerns about compromise of image fidelity with AI, [14] it was desirable to evaluate DLRecon's ability to improve image quality and its agreement with the existing, standard of care (SOC) image reconstruction method. We hypothesized the following: (i) DLRecon would enhance clinically relevant imaging features of peripheral nerves without increasing artifact; and (ii) DLRecon would improve interrater agreement of these measures compared to SOC-reconstruction.

Approval for Research in Human Subjects
This study was approved by the institutional review board of Hospital for Special Surgery and conducted in accordance with the Health Insurance Portability and Accountability Act. Informed consent was obtained from all individual participants.

Study Design and Study Population
In total, 29 subjects were prospectively enrolled, and 60 peripheral nerves were evaluated in these subjects. These nerves were chosen for evaluation as they are commonly evaluated in clinical practice and were the largest in diameter within the imaged field-of-view. Written informed consent was obtained from all patients prior to imaging, which was performed between February 2019 and November 2019.

Inclusion Criteria
All patients who presented to our institution for standard-of-care MR neurography evaluation for clinically suspected neuropathy were considered for study inclusion.

Exclusion Criteria
Exclusion criteria were standard MRI safety contraindications. No patients were excluded from the study.

Image Reconstruction
In addition to the untouched SOC reconstruction (i.e. images immediately derived from the scanner), the raw data were retrospectively reconstructed with a deep convolutional neural network, i.e. "DLRecon", [11] a vendor-provided software installed on the scanner. This neural network accepts raw unfiltered complex-valued image inputs and outputs images with higher SNR and reduced truncation artifacts by utilizing a feed-forward approach. [11] Gibbs ringing that occurs near sharp edges is also removed by the neural network, resulting in increased image sharpness. [11] DLRecon was previously trained using pairs of images containing conventional MR images and 'near-perfect' images, defined as those with high resolution, minimal ringing, and very low noise levels. [11] Four million unique image pairs were employed in this supervised learning approach, and image augmentations such as rotations, flips, intensity gradients, phase manipulations, and Gaussian noise were used to increase the robustness of the training set. [11] The training images were diverse, allowing generalizability of DLRecon's application across anatomical locations. [11] Additionally, the network was trained using gradient backpropagation and the ADAM optimizer. [11,15]

Image Analysis
Anonymized images were evaluated on a picture archiving and communication system (PACS) (Sectra V18.1, Sectra AB) by two board-certified radiologists: reader 1 (DBS) with 6 years of dedicated MR neurography experience, and reader 2 (AJB) with 10 years of general musculoskeletal MRI experience. Both readers underwent a training session to establish grading consensus by reviewing both SOC- and DLRecon-MRI images from 10 separate datasets not included in the analysis. Study images were randomized prior to evaluation by the 2 readers, who remained blinded with respect to the postprocessing method.
Each reader independently scored images for outer epineurium conspicuity and visualization of fascicular architecture by reviewing the entire volume for slices that best demonstrated the nerve in an orthogonal plane, as this is the best plane to visualize fascicular architecture. For evaluation of pulsation artifact, ghosting artifact, and bulk motion, readers considered all slices. Outer epineurium conspicuity was defined by the number of distinct borders visualized between the nerve and immediately surrounding perineural fat (maximum 4: anterior, posterior, medial, and lateral). Fascicular architecture was graded using the following Likert scale: 1-poor visualization, 2-average visualization, 3-good visualization, 4-excellent visualization. The presence of pulsation artifact, ghosting artifact, and bulk motion was graded using the following scale: 0-none, 1-mild, 2-moderate, 3-severe. Additionally, each radiologist was asked to 'guess' whether each image dataset was processed with SOC-MRI or DLRecon-MRI to determine the extent of potential bias from perceiving image texture differences.

Statistical Analysis
Statistical analyses were performed by a biostatistician (BL) with 5 years of experience. Odds ratios (OR) and 95% confidence intervals (CI) obtained from marginal ordinal logistic regression models estimated with generalized estimating equations were used to evaluate differences in grades between DLRecon- and SOC-MRIs. Patients were treated as the repeated factor to account for within-patient correlations between image types as well as for patients from whom more than one nerve was evaluated in their exam.
Given that ORs were calculated as a comparison of DLRecon-MR and SOC-MR images, an OR of 1 was interpreted as no difference between DLRecon- and SOC-MRI; an OR >1 was interpreted as the DLRecon-MRI being more likely to have a higher grade than the SOC-MRI; and an OR <1 was interpreted as the SOC-MRI being more likely to have a higher grade than the DLRecon-MRI. Statistical significance was set a priori to p<0.05.
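The interpretation rule above can be written as a short helper. This is a minimal sketch rather than the analysis code used in the study; the function name and return format are assumptions made for illustration, and the example values are the outer epineurium and ghosting results reported in the abstract.

```python
def interpret_odds_ratio(odds_ratio, p_value, alpha=0.05):
    """Map an odds ratio (DLRecon vs. SOC) to the interpretation described in the text.

    Returns (direction, significant), where significant is True if p_value < alpha.
    """
    if odds_ratio <= 0:
        raise ValueError("An odds ratio must be positive.")
    if odds_ratio > 1:
        direction = "DLRecon-MRI more likely to have a higher grade"
    elif odds_ratio < 1:
        direction = "SOC-MRI more likely to have a higher grade"
    else:
        direction = "no difference between DLRecon- and SOC-MRI"
    return direction, p_value < alpha

# Values reported in the abstract.
print(interpret_odds_ratio(1.9, 0.007))  # outer epineurium conspicuity
print(interpret_odds_ratio(2.8, 0.004))  # ghosting artifact
```

Whether a higher grade is favorable depends on the scale: for outer epineurium conspicuity and fascicular architecture a higher grade means better visualization, whereas for the artifact scales (0-none to 3-severe) a higher grade means more artifact.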
Agreement between DLRecon- and SOC-MRI grades for each reader (inter-reconstruction agreement) was analyzed using ordinal-weighted Gwet's agreement coefficients (AC). Clustered bootstrap confidence intervals were used to account for patients who had more than 1 nerve examined. Strength of agreement was determined using the following scale: <0 = poor, 0.00-0.20 = slight, 0.21-0.40 = fair, 0.41-0.60 = moderate, 0.61-0.80 = substantial, and 0.81-1.00 = almost perfect. [16]
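For readers unfamiliar with these statistics, the following sketch shows one common formulation of the ordinal-weighted Gwet agreement coefficient for two sets of ratings, a percentile bootstrap that resamples whole patients, and the strength-of-agreement scale used in this study. It is an illustration under stated assumptions, not the software used for the analysis: the function names are hypothetical, grades are assumed to be coded as integers 0 to q-1, and the example records are invented.

```python
import random
from math import comb

def ordinal_weights(q):
    """Ordinal weights: 1 on the diagonal, decreasing as categories lie further apart."""
    m_max = comb(q, 2)
    return [[1.0 - comb(abs(k - l) + 1, 2) / m_max for l in range(q)] for k in range(q)]

def gwet_ac2(ratings_a, ratings_b, q):
    """Ordinal-weighted Gwet agreement coefficient for paired grades coded 0..q-1."""
    n = len(ratings_a)
    w = ordinal_weights(q)
    # Weighted observed agreement across all rated nerves.
    p_a = sum(w[a][b] for a, b in zip(ratings_a, ratings_b)) / n
    # Average probability of each category over both sets of ratings.
    pi = [(ratings_a.count(k) + ratings_b.count(k)) / (2 * n) for k in range(q)]
    t_w = sum(sum(row) for row in w)
    # Chance agreement.
    p_e = t_w / (q * (q - 1)) * sum(p * (1 - p) for p in pi)
    return (p_a - p_e) / (1 - p_e)

def clustered_bootstrap_ci(records, q, n_boot=2000, alpha=0.05, seed=0):
    """Approximate percentile CI, resampling patients (not nerves) with replacement.

    records: list of (patient_id, grade_a, grade_b) tuples.
    """
    rng = random.Random(seed)
    by_patient = {}
    for pid, a, b in records:
        by_patient.setdefault(pid, []).append((a, b))
    patients = list(by_patient)
    stats = []
    for _ in range(n_boot):
        drawn = rng.choices(patients, k=len(patients))
        pairs = [pair for pid in drawn for pair in by_patient[pid]]
        a, b = zip(*pairs)
        stats.append(gwet_ac2(a, b, q))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

def strength_of_agreement(ac):
    """Label an agreement coefficient using the scale described in the text."""
    if ac < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")):
        if ac <= upper:
            return label
    return "almost perfect"

# Invented example: (patient, SOC grade, DLRecon grade) on a 4-level scale coded 0-3.
records = [("p1", 3, 3), ("p1", 2, 3), ("p2", 1, 1), ("p3", 0, 1),
           ("p3", 2, 2), ("p4", 3, 2), ("p5", 1, 1)]
ac = gwet_ac2([r[1] for r in records], [r[2] for r in records], q=4)
print(round(ac, 2), strength_of_agreement(ac), clustered_bootstrap_ci(records, q=4))
```

Resampling at the patient level keeps nerves from the same patient together in each bootstrap replicate, which is the purpose of the clustering described above.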

Inter-reconstruction and Interrater Agreement
For inter-reconstruction agreement, each reader had substantial to almost-perfect agreement for all imaging features and artifacts evaluated (AC=0.73-1.00).

Discussion
This study demonstrated the efficacy of an AI-reconstruction algorithm (DLRecon) in improving peripheral nerve evaluation on MRI. Specifically, two morphologic features (the outer epineurium and fascicular architecture) that are critical for determining the presence and extent of peripheral nerve injury were shown to be more conspicuous with DLRecon than with SOC reconstruction. Study findings were concordant with the expected outcomes of the algorithm, namely denoising and improved sharpness to enhance edge definition.
One potential concern of AI-based reconstruction is reduced image fidelity, resulting in over-smoothing of the images and loss of image details. This was partially addressed by evaluating both improvement (based on the OR) and inter-reconstruction agreement to determine whether there was significant bias from either the training dataset or algorithm. Conspicuity of nerve features was improved with DLRecon and yet inter-reconstruction agreement was high, suggesting improved image quality with no substantial change in interpretation. Apparent changes in image smoothness with DLRecon could bias reader interpretation. In fact, despite blinding, readers were mostly able to correctly identify data sets as either SOC- or DLRecon-MRI. This suggests the presence of noticeable image texture differences between the two methods; anecdotally, readers observed visible noise reduction and image sharpening in DLRecon compared to SOC images in the muscle and bone. Nonetheless, any potential bias was qualified by the substantial agreement between the two reconstruction methods.
DLRecon increased pulsation artifacts and ghosting artifacts. As a by-product of denoising and increased sharpness, artifacts were likely better delineated as these were inherent to the acquired data. However, these artifacts likely did not impede interpretation: 1) ghosting artifacts appeared in the air surrounding the anatomy and 2) pulsation artifacts were largely offset from nerves due to the anterior-to-posterior phase-encoding direction being orthogonal to the predominantly axial course of the nerves. The relative absence of artifact impact on image interpretation is also supported by the fact that nerve assessment was improved with DLRecon.
DLRecon's effect on improvements in interrater agreement was greater for outer epineurium conspicuity than for fascicular architecture. For this study, both readers had dedicated musculoskeletal MRI but variable MR neurography experience, which could possibly explain the improved interrater agreement in outer epineurium conspicuity. We speculate that these differences were attributable to DLRecon's effects on variable image textures: the fascicular architecture being more point-like and the outer epineurium being more edge-like.
Study limitations include a moderate sample size, and varying anatomy and imaging parameters that increased variability in the acquired data. MR neurography, as performed at our institution, frequently involves imaging protocols tailored to each case in order to maximize diagnostic yield (in particular spatial resolution), which invariably results in different sampling matrices even for the same anatomical region. Anecdotally, the effect of DLRecon is even greater at matrices smaller than those used in this study; as such, greater improvements could be realized with uniform, albeit low-resolution, parameters. Another limitation was that a single contrast (intermediate-weighted) was evaluated, as it was among the most commonly acquired contrasts across the different MRI protocols. MR neurography also uses heavily T2-weighted fat-suppressed sequences, which were not evaluated in this study. Specifically, the majority of MR neurography scans at our institution employ either 2D multi-echo, Dixon-based FSE or 3D short-tau inversion recovery FSE sequences, both currently incompatible with the DLRecon software. A limitation related to interpretation was that the two readers analyzed the entire image stack rather than scoring single images per nerve. While this likely increased scoring variability, this approach was chosen to better reflect standard clinical practice.
One advantageous feature of DLRecon is that it can be applied alongside standard reconstruction and acquisition schemes as it neither alters k-space coverage nor requires a different acquisition type. As edges and details are typically time-sensitive to obtain, due to encoding for image details residing at the outer edges of k-space, we believe MR neurography to be a desirable application for DLRecon. However, we believe DLRecon can be applied to other common musculoskeletal MRI exams, particularly for the detection of chondral and labral abnormalities in the hip and shoulder. In our study, AI improved overall image quality for the same acquired data, but it could instead be used to increase acquired spatial resolution (in-plane and/or through-plane) for the same scan time (e.g., via parallel imaging) or to reduce scan time (fewer acquisitions, higher bandwidth) while maintaining the same resolution. In the near future, we envision DLRecon's application to 3D acquisitions, which may further increase the possibilities of improving image quality as 3D data generally provides higher SNR than 2D. Finally, as DLRecon is applied to denoise image space dimensions, denoising methods that operate in other dimensions such as echo time [17] and diffusion [18] may be used in combination with DLRecon to further improve image quality.
In conclusion, the results of this study suggest that MR images reconstructed with the novel DLRecon method demonstrate improved outer epineurium and fascicular architecture conspicuity compared to MR images reconstructed with the SOC method. Given that outer epineurium and fascicular architecture conspicuity are two key morphological features that are critical to evaluating a nerve injury, these improvements could add diagnostic value to the assessment of clinically suspected peripheral neuropathy. Although DLRecon images had greater pulsation and ghosting artifacts compared to SOC images, these artifacts did not affect image interpretation. Given the results of this study, future work will focus on assessing the added clinical value of the DLRecon method in detection of chondral and labral abnormalities in the hip and shoulder and in other common musculoskeletal exams.

Declarations
Data Availability: The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Table notes: Strength of agreement was interpreted using the following scale: less than 0, poor agreement; 0.00-0.20, slight; 0.21-0.40, fair; 0.41-0.60, moderate; 0.61-0.80, substantial; and 0.81-1.0, almost-perfect agreement. [16] AC = ordinal-weighted Gwet's agreement coefficient. cbCI = patient-clustered bootstrap confidence intervals to account for patients who had more than 1 nerve examined.

Figures
Figure 1. 23-year-old female presenting for non-specific median nerve symptoms. Axial FSE intermediate-weighted images processed with DLRecon (B) demonstrated more prominent ghosting artifact (score=2, moderate) and pulsation artifact (score=2, moderate) as compared to the same images processed with the SOC reconstruction method (A, score=1, mild for both ghosting and pulsation artifact), as seen in the magnified insets (arrows, red inset for ghosting; arrows, orange inset for pulsation). However, the increased ghosting and pulsation artifacts did not interfere with nerve conspicuity (yellow inset). The median nerve appeared normal on the exam.