A vendor-agnostic, PACS-integrated, and DICOM-compatible software-server pipeline for testing segmentation algorithms within the clinical radiology workflow

Background Reproducible approaches are needed to bring AI/ML for medical image analysis closer to the bedside. Investigators wishing to shadow test cross-sectional medical imaging segmentation algorithms on new studies in real time will benefit from simple tools that integrate PACS with on-premises image processing, allowing visualization of DICOM-compatible segmentation results and volumetric data at the radiology workstation. Purpose In this work, we develop and release a simple containerized and easily deployable pipeline for shadow testing of segmentation algorithms within the clinical workflow. Methods Our end-to-end automated pipeline has two major components: (1) a router/listener and anonymizer, together with an OHIF web viewer backstopped by a DCM4CHEE DICOM query/retrieve archive, deployed in the virtual infrastructure of our secure hospital intranet; and (2) an on-premises single-GPU workstation host for DICOM/NIfTI conversion steps and image processing. DICOM images are visualized in OHIF along with their segmentation masks and associated volumetry measurements (in mL) using DICOM SEG and structured report (SR) elements. Feasibility is demonstrated by recording clock times for a traumatic pelvic hematoma cascaded nnU-Net model. Results Mean total clock time from PACS send by the user to completion of transfer to the DCM4CHEE query/retrieve archive was 5 minutes 32 seconds (SD: 1 min 26 sec). This compares favorably with report turnaround times for whole-body CT exams, which often exceed 30 minutes. Inference times accounted for most of the total clock time, ranging from 2 minutes 41 seconds to 8 minutes 27 seconds. All other virtual and on-premises host steps combined ranged from a minimum of 34 seconds to a maximum of 48 seconds. Conclusion The software worked seamlessly with an existing PACS and can be used for deployment of DL models within the radiology workflow for prospective testing on newly scanned patients.
Once configured, the pipeline is executed through one command using a single shell script. The code is made publicly available under an open-source license at https://github.com/vastc/ and includes a readme file providing pipeline configuration instructions for host names, the series filter, and other parameters, as well as citation instructions for this work.


Introduction
Unmet need for open-source software integrating DL models into quantitative visualization clinical workflows. -Simple, reproducible approaches are needed to bring AI/ML for medical image analysis closer to the bedside. During the RSNA 2018 Artificial Intelligence Summit, researchers, opinion leaders, and early adopters of AI CAD radiology innovations emphasized that development of machine learning (ML) algorithms that are integrated into clinical practice is an essential step in improving radiology performance and ML algorithm quality (1), but few open-source solutions to this problem have been reported.
The Radiological Society of North America (RSNA) recently released a special report on clinical AI implementation and presented a road map for governance, including a framework for required infrastructure (2). Clinical AI/ML integration is necessary for deployment of commercial vendor-specific CAD tools. But imaging departments may want to deploy and test locally developed algorithms as well, and this can be facilitated with open-source vendor-agnostic methods (3). Such algorithms commonly employ research-grade code and models, are at the stage of preliminary testing and validation, and are not ready for clinical use. However, researchers may want to evaluate generalizability on new cases as they arise in the clinical workflow or conduct prospective studies of diagnostic performance, prognostic utility, or user acceptance.
In radiology, criteria that need to be met for pre-clinical "shadow-mode" testing of AI/ML CAD tools include cross-platform and cross-domain integration, as well as data security and access. A vendor-agnostic platform should be integrated with hospital imaging archival systems. In (4), Jansen et al. developed the vendor-agnostic EMPAIA (EcosysteM for Pathology Diagnostics with AI Assistance) platform for integrating AI applications into digital pathology infrastructures.
Within the radiology clinical workflow, imaging data are stored and visualized using commercial Picture Archiving and Communications Systems (PACS) based on the Digital Imaging and Communications in Medicine (DICOM) standard, which includes a large library of metadata to facilitate PACS interoperability.
Interoperability with the DICOM format is required for radiology AI/ML tasks. In (5), Sohn et al. released a vendor-agnostic PACS-compatible solution for integrating AI into the radiology workflow. They showed feasibility of their pipeline for 2D classification of breast density on mammograms.
As AI/ML methods improve along the technology readiness pipeline, clinical-translational teams working on precision imaging solutions will be increasingly interested in deploying trained cross-sectional imaging-based models that segment and volumetrically quantify pathology for pre-clinical evaluation on new cases in "real world" settings as they arise (2,6). Granular quantitative volumetric information can provide objective metrics for personalized decision-making and treatment planning in clinical workflows (6). Such quantitative visualization (QV) tools fall under FDA computer-aided diagnosis (CADx) or image processing and quantification (IPQ) Software as Medical Device (SaMD) designations (7).
Simple, modular, and open-source PACS-integrated pipelines are needed that are tailored specifically for segmentation and quantitative visualization tasks applied to cross-sectional imaging modalities. DICOM lacks the elegant design features of the NIfTI format for cross-sectional medical image processing and analysis (8,9). Conversely, PACS systems do not support data handling of NIfTI image volumes used as model input and output.
Each slice of a DICOM CT series is represented by a .dcm file, whereas the series is represented as a volume in a single .nii.gz file, and the NIfTI format is employed by segmentation algorithms. DICOM-to-NIfTI and NIfTI-to-DICOM conversion bridges the gap between clinical PACS and visualization for quantitative imaging (9). A listener and router are needed to handle the flow of data for file conversion, image processing, and viewing of coregistered segmentation masks and quantitative results. JSON files are needed to specify relevant DICOM metadata, such as pathology type and display color, for a given DICOM SEG (segmentation) object, and the DICOM SEG object needs to be associated with its original DICOM image series through a DICOM unique identifier (UID). The precise volume of pathology should be available to the end-user for a QV task, such as in the form of a DICOM structured report (SR) element.
An nnU-Net backbone for multiscale tasks. -Automated precision diagnostics in cross-sectional imaging of the torso typically require multiscale DL solutions to address complex and heterogeneous pathology with highly variable volumes. DL has demonstrated promising performance on a large variety of medical image analysis tasks (10); however, computer vision solutions for torso imaging have been latecomers due to challenges including small target-to-volume ratios and the highly variable size, appearance, and distribution of pathology (11). A variety of bespoke solutions have been employed for multiscale problems, including coarse-to-fine approaches and dilated convolutional neural networks with attention modules (12)(13)(14)(15). In 2021, Isensee et al. introduced nnU-Net (16), which uses a simple U-Net backbone and systematizes design choices (including pre-processing, hyperparameter selection, and post-processing) based on the "data fingerprint" of the task at hand, a representation that considers voxel spacing, image size, class ratios, and other dataset-specific features derived from 53 different segmentation tasks. The premise of nnU-Net is that such design choices are a more important condition of high performance than architectural modifications, hence the name "no new U-Net". The method achieved state-of-the-art performance on 23 public datasets in the Isensee et al. paper and, given its ease of implementation and robust performance for a wide variety of tasks, represents a watershed for out-of-the-box automated medical image segmentation. Given the ease of training nnU-Net and its high performance, it is now widely used by many investigators. We wished to create an application for nnU-Net with low complexity and easy out-of-the-box deployment, while giving investigators the agency to swap in any segmentation algorithm code using the NIfTI format for input and output.
In (8), Doran et al. integrated the Open Health Imaging Foundation (OHIF) viewer with the XNAT informatics software platform for quantitative imaging research based on the DICOMweb protocol, with advanced features including paintbrush editing tools and integration with NVIDIA's AI-assisted annotation (AIAA). Similarly, MONAI Label provides active learning and AIAA functionality and can be combined with robust quantification tools as a 3D Slicer plug-in (17). These pipelines can be configured with a variety of segmentation algorithms on the back end, including 3D U-Net, DynUNet, and UNETR. A sliding-window patch-based method is typically employed to address local GPU memory limitations in training and inference, but this is not easily integrated with nnU-Net default settings. To our knowledge, implementations of these tools with nnU-Net on the back end are not currently publicly available. Further, while new modules may be developed, the lack of a DICOM structured reporting element containing segmentation volumes in XNAT-OHIF, and the lack of out-of-the-box DICOM compatibility in 3D Slicer (18), represented barriers for quantitative visualization in the clinical environment that motivated this work.
Traumatic pelvic hematoma use case. -Whole-body CT has become the routine diagnostic workhorse for admissions with major trauma (19,20), with potential associated survival benefit (21), but long interpretation times, ranging from 30 to 87 minutes, remain a major bottleneck that limits rapid surgical decision-making (22,23). Volumetric measurements of hemorrhage are not feasible at the point of care without automation (24,25), and a recent scoping review found no commercial CAD tools for this purpose (6). A cross-sectional survey of practitioners in the Emergency/Trauma subspecialty reported a desire on the part of most respondents for automated quantitative visualization tools (26). Bleeding pelvic fractures are a leading cause of morbidity and mortality in trauma patients. Once we achieved high-saliency visual results that correlated with patient outcomes (12) and further improved DSC using nnU-Net, shadow testing in the clinical environment became desirable for this task. CT volumetry has myriad applications beyond our use case, including objective assessment of malignancy progression with a higher level of precision compared with two-dimensional RECIST criteria (27)(28)(29); measuring organ volumes and body composition parameters (30,31); and a wide variety of other applications.
Purpose. -To meet the needs of the community of researchers in this domain, following FAIR (findable, accessible, interoperable, and reusable) principles, we aimed to construct and disseminate a simple, secure, modular, open-source, and vendor-agnostic PACS-integrated and DICOM-compatible pipeline for end-to-end automated CT quantitative visualization suitable for nnU-Net or any other segmentation algorithm. The feasibility of our approach for real-time shadow evaluation in the clinical setting was assessed using clock times for a cascaded nnU-Net traumatic pelvic hematoma use case.

Materials And Methods
Software architecture. -In the proposed Python-based client-server architecture, the study is pushed by an end user from a picture archiving and communications system (PACS) to a DICOM listener/router host where the DICOM series of interest is filtered, anonymized, and sent to 1) a DCM4CHEE query/retrieve archive associated with a zero-footprint Open Health Imaging Foundation (OHIF) DICOM web viewer running on a radiologist workstation, and separately to 2) a deep learning workstation host, where the DICOM series is converted to a NIfTI volume and processed by the DL segmentation algorithm. On this host, the output NIfTI segmentation mask is converted to a DICOM SEG object with a unique identifier (UID) linking to the original DICOM and sent back to the listener/router, where the pixel data are used to create a DICOM structured report (SR) element with volumetric information. The DICOM SEG and SR are then routed to the DCM4CHEE query/retrieve archive for secure permission-based quantitative visualization using the OHIF viewer on a radiologist workstation or other secure Windows-based desktop or laptop. nnU-Net is used in our publicly available containerized software. The overall workflow is shown in Fig. 1.
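For illustration, the site-specific settings exposed by the pipeline's config files (host names, AE Titles, ports, series filter, time-out) might take a shape like the following; the keys and values shown here are hypothetical placeholders, not the repository's actual schema:

```json
{
  "router":  { "ae_title": "ROUTER",   "host": "10.0.0.10", "port": 11112 },
  "archive": { "ae_title": "DCM4CHEE", "host": "10.0.0.20", "port": 11113 },
  "dl_host": { "ae_title": "DLHOST",   "host": "10.0.0.30", "port": 11114 },
  "series_filter": ["ABD/PEL", "PORTAL VENOUS"],
  "timeout_seconds": 30
}
```

Keeping all site-specific values in one file is what allows the same containers to be redeployed at another institution without code changes.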
Building blocks for data transfer and handling: listener/router, archive, and viewer

1. When an asynchronous request is made in PACS to send a study for processing and viewing (by selecting the listener/router node from a dropdown menu), PACS performs a C-STORE operation, which queues the study for transfer inbound to the router Application Entity (AE) Title. The listener/router includes a facility to filter for config-file-specified series. The study is anonymized according to the DICOM standard and queued up through its handler using C-STORE for transfer to two AE Titles: a DCM4CHEE archive (DCM4CHE library: https://www.dcm4che.org/) and an on-premises deep learning (DL) workstation host. The bespoke listener/router script utilizes the pynetdicom library and is constructed using our own high-level logic to fit the required task.
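The filter-and-anonymize step can be sketched as follows. This is a minimal, library-free illustration operating on plain dictionaries; the actual router works on full pydicom datasets received through pynetdicom C-STORE event handlers, and the tag list here is only a small illustrative subset of the DICOM de-identification profile.

```python
# Illustrative subset of identifying attributes blanked during
# anonymization (the full DICOM de-identification profile covers
# many more tags than shown here).
IDENTIFYING_TAGS = {
    "PatientName", "PatientID", "PatientBirthDate",
    "PatientAddress", "ReferringPhysicianName",
}

def series_matches(series_description: str, filters: list[str]) -> bool:
    """True if the series should be routed onward, per the config-file
    series filter (case-insensitive substring match)."""
    desc = series_description.lower()
    return any(f.lower() in desc for f in filters)

def anonymize(header: dict) -> dict:
    """Return a copy of the header with identifying attributes blanked."""
    return {k: ("" if k in IDENTIFYING_TAGS else v) for k, v in header.items()}
```

Only series passing `series_matches` are anonymized and forwarded to the two downstream AE Titles.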
2. The DCM4CHEE archive provides web services to the OHIF web viewer for retrieving data using the Web Access to DICOM Objects-RESTful Services (WADO-RS) protocol. In short, the Java-based DCM4CHEE archive backs the OHIF viewer as the source of DICOM data storage and appears as a list of studies for permission-based viewing by the end-user. Use of a web viewer distinct from PACS is intended to prevent research-grade results from entering the patient's medical record. The PACS, the router, the DCM4CHEE archive, and the viewer all reside within virtual machine (VM) infrastructure running under the institution's secure intranet.
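As a concrete illustration, OHIF retrieves series from the archive with DICOMweb requests of roughly the following shape; the base path shown is typical of dcm4chee-arc-light deployments, but the exact URL and AE name depend on site configuration:

```http
GET {base}/dcm4chee-arc/aets/DCM4CHEE/rs/studies/{StudyInstanceUID}/series/{SeriesInstanceUID}
Accept: multipart/related; type="application/dicom"
```

Because WADO-RS is part of the DICOMweb standard, any conformant archive could stand in for DCM4CHEE here.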
3. The deep learning (DL) workstation host is assigned a specific IP address, AE Title, and port. The listener/router executes a C-STORE to the DL host AE Title using the pynetdicom library and has a time-out function that triggers a callback to the DL host to indicate that transfer of all DICOM images from the series of interest is complete. The time-out begins with a C-MOVE operation to queue the job and is followed by a C-STORE operation to a unique directory on the DL host workstation once the time-out (measured from the last time an object of the DICOM series was received) completes, ensuring that NIfTI conversion does not occur prematurely. The callback, which is serialized and currently processed on a single thread (contingent on receipt of the DICOM SEG object, to prevent concurrent processing), then triggers code for a series of DL host-side conversion and image processing steps described in detail in the next subsection.

DL host building blocks (Fig. 2). -A serialized model from an open-source task (spleen segmentation) is provided in our GitHub repository due to institutional restrictions on our trauma CT data.
1. DICOM to NIfTI conversion. A listener script running on Linux issues a command to convert the collected DICOM files to a NIfTI volume after an adjustable delay time, set to 30 seconds. The DICOM to NIfTI converter is implemented using the dicom2nifti library (https://github.com/icometrix/dicom2nifti). This building block converts the DICOM series to a .nii file. After the conversion is completed, the script executes model inference.
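The adjustable delay amounts to a debounce: every incoming DICOM object resets a countdown, and conversion fires only once the countdown expires with no new arrivals. A minimal standard-library sketch of that logic (class and method names are ours for illustration, not the repository's):

```python
import threading

class SeriesCompletionTimer:
    """Fires `on_complete` once no new DICOM object has arrived for
    `timeout` seconds; each arrival resets the countdown."""

    def __init__(self, timeout: float, on_complete):
        self._timeout = timeout
        self._on_complete = on_complete
        self._timer = None
        self._lock = threading.Lock()

    def object_received(self):
        # Called for every stored object; restarts the countdown so the
        # callback cannot fire while the series is still transferring.
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._timeout, self._on_complete)
            self._timer.daemon = True
            self._timer.start()
```

In the pipeline, the equivalent countdown is set to 30 seconds and the callback launches DICOM-to-NIfTI conversion followed by inference.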
2. Trained model. Prior to deployment in our pipeline, all pre-processing, training, and post-processing steps were completed by nnU-Net in five-fold cross-validation per specifications of this self-configuring method (16). We employed the cascaded low-resolution to high-resolution 3D nnU-Net architecture and model, which gives state-of-the-art performance for multiscale segmentation tasks. The pipeline can be reconfigured with other networks in place of nnU-Net to suit investigators' needs.
3. NIfTI to DICOM SEG conversion. The segmentation output is converted from NIfTI to a DICOM SEG object linked to the original DICOM series through a UID using the dcmqi (DICOM for Quantitative Imaging) library (https://github.com/qiicr/dcmqi) and a dcmqi-created JSON file (Fig. 3), which also specifies target attributes such as the pathology label (e.g., "hemorrhage" or "pelvic hematoma") and the color of the mask or contour. Once created, the script calls the DCM4CHE toolkit installed on the DL host to perform a C-STORE operation, transferring the DICOM SEG object back to the router/listener discussed under the previous subheading.
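A dcmqi metadata JSON of the kind shown in Fig. 3 has roughly the following shape; the coded concepts, colors, and series attributes below are illustrative values, not the exact file used in our pipeline:

```json
{
  "ContentCreatorName": "Research^Pipeline",
  "SeriesDescription": "Pelvic hematoma segmentation",
  "SeriesNumber": "300",
  "InstanceNumber": "1",
  "segmentAttributes": [[
    {
      "labelID": 1,
      "SegmentDescription": "Pelvic hematoma",
      "SegmentAlgorithmType": "AUTOMATIC",
      "SegmentAlgorithmName": "nnU-Net",
      "SegmentedPropertyCategoryCodeSequence": {
        "CodeValue": "M-01000",
        "CodingSchemeDesignator": "SRT",
        "CodeMeaning": "Morphologically Altered Structure"
      },
      "SegmentedPropertyTypeCodeSequence": {
        "CodeValue": "M-35300",
        "CodingSchemeDesignator": "SRT",
        "CodeMeaning": "Hemorrhage"
      },
      "recommendedDisplayRGBValue": [255, 0, 0]
    }
  ]]
}
```

dcmqi consumes this JSON together with the NIfTI label map and the reference DICOM series to produce a DICOM SEG object that viewers such as OHIF can overlay on the source images.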

Patient Dataset
Performance of pelvic hematoma segmentation in five-fold cross-validation on 253 training cross-sectional CT studies comprising over 100,000 2D images was previously reported (13).

Clock times
To test our software pipeline, 21 consecutive cases were pushed from PACS. Total clock times from the beginning of the PACS C-STORE operation to completion of transfer to the viewer were recorded, along with times for the following steps: 1) model inference, 2) all data conversion and transfer steps on the DL host side, and 3) combined virtual and on-premises host steps excluding nnU-Net inference.

Results
The software is available in our GitHub repository (XXXX), with relevant links to our customizable listener/router Docker container, DL host Docker container, other components (DCM4CHEE container and OHIF viewer), and config files for modifiable site-specific configuration of the listener/router (e.g., file names for filtering, time-out delay, AE Titles, IP addresses, and ports). Readme files are provided for documentation.
Example visual results using the OHIF viewer (32), with the DICOM SEG mask overlaid on the linked anonymized DICOM series, are shown for pelvic hematoma segmentation (Fig. 4) and splenic segmentation (Fig. 5). A modular structured report element with the statement "Splenic volume: 40 mL" is shown in Fig. 6.
Returning to pelvic hematoma, mean total clock time in the 21 patients from PACS send request to completion of receipt of the DICOM SEG object and structured report in the DCM4CHEE archive was 5 min 32 sec (SD: 1 min 26 sec; min: 3 min 16 sec; max: 9 min 2 sec). nnU-Net inference times accounted for over 89% of the total time. Mean clock time for all other on-premises DL host steps totaled only 5.4 seconds (SD: 1.3 sec; min: 2.0 sec; max: 7.0 sec). Excluding inference time, mean combined total clock time for all steps on the listener/router virtual infrastructure side and DL host side of the proposed software platform was 38.5 sec (SD: 4.7 sec) (Table 1). Pelvic hematoma volumes ranged from 28.0 to 924.7 mL (median: 301.5 mL; IQR: 94.8-491.7 mL). Pearson correlations between volumes and nnU-Net inference times (r = 0.12, p = 0.61), between volumes and total clock times (r = 0.06, p = 0.8), and between the number of slices per DICOM series and clock times (r = 0.17, p = 0.5) were all poor, such that these factors had no discernible effect on processing times. DICOM SEG volumes corresponded exactly (to the nearest 1/10th of a milliliter) to those obtained from NIfTI volumes using the 3D Slicer image computing platform quantification module (www.slicer.org, version 5.0.3). Interoperability with the public spleen model is illustrated in Fig. 6.
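The agreement to 0.1 mL is expected, since segmentation volume is a deterministic function of voxel count and voxel spacing. A standard-library-only sketch of the computation (a simplified stand-in for the SR-generation code, not the pipeline's actual implementation):

```python
def mask_volume_ml(mask, spacing_mm):
    """Volume of a binary segmentation mask in millilitres.

    mask: nested lists (slices x rows x cols) of 0/1 labels
    spacing_mm: (slice_spacing, row_spacing, col_spacing) in mm
    """
    # Count foreground voxels across all slices.
    voxels = sum(v for sl in mask for row in sl for v in row)
    voxel_mm3 = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    return voxels * voxel_mm3 / 1000.0  # 1 mL = 1000 mm^3
```

Because both 3D Slicer and the pipeline reduce to this same voxel-count-times-spacing product, their reported volumes should agree up to rounding.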

Discussion
There is a need for open-source software that integrates AI/ML algorithms into clinical workflows for preclinical evaluation [6]. Jansen et al. [9] developed a vendor-agnostic platform for integrating AI applications into digital pathology infrastructures, and Sohn et al. [13] introduced a vendor-agnostic platform for integrating AI into radiology infrastructures using breast density on 2D mammography as a use case. XNAT-OHIF integrates with DICOM [4] but, to our knowledge and based on personal correspondence, does not provide quantitative volumetric information. Segmentation and quantification of pathology, such as advanced malignancy [14], lung nodule size [15], or COVID infiltrate volume [16], has generated considerable interest as a potential precision medicine tool, since manual segmentation is not feasible at the point of care and there is considerable information loss and subjectivity associated with diameter-based measurements [17].
In this work, we address an unmet need for tools that integrate automated cross-sectional imaging segmentation results into a DICOM-based quantitative visualization clinical workflow. Since its introduction in 2021 [3], nnU-Net has emerged as a widely employed, robust, and easy-to-train method for segmentation tasks in medical imaging in the NIfTI format. Our open-source vendor-agnostic software is intended for clinical-translational researchers who wish to deploy their segmentation models in inference for further testing on new cases encountered in the clinical workflow using the DICOM standard. For those wishing to use cascaded nnU-Net, the pipeline can be used out of the box with the relevant .pkl files.
On the virtual infrastructure host side, a router/listener anonymizes and handles DICOM series, which are sent to a DICOM query/retrieve archive backing an OHIF web viewer and to an on-premises single-GPU DL workstation. On the DL host side, DICOM series are converted to NIfTI and processed by the segmentation algorithm. A NIfTI segmentation mask sharing the same UID as the DICOM files is converted to a DICOM SEG object and returned to the router/listener, where a DICOM SR element containing the segmentation volume (in mL) is created. The DICOM SEG and SR objects are then sent to the DICOM archive for viewing. The segmentation and quantitative information are thereby harmonized to the same format as the original DICOM data. The building blocks were implemented using publicly available open-source libraries, which makes our software vendor-agnostic and easily deployable, in keeping with FAIR principles. By open-sourcing the proposed software, we encourage radiologists and radiology IT developers to integrate more data transfer functionality and more models into the clinical radiology workflow.
Radiologists should be able to receive verifiable quantitative results well within CT report turnaround times should they wish, for example, to include this information in their reports within the framework of a prospective research study. We tested the software using 21 consecutive patients with traumatic pelvic hematoma. Clinical interpretation of WBCT scans for polytrauma or cancer staging typically exceeds 30 minutes, and results were available within a fraction of this minimum expected turnaround time in all cases.
Using our method, we achieved a mean clock time of 5 minutes and 32 seconds using a workstation with a single NVIDIA GeForce RTX 3090 Ti graphics card. This is approximately one-fifth of a typical report turnaround time for a patient undergoing WBCT for suspected polytrauma. nnU-Net inference is responsible for over 89% of the clock time, and the time for all other on-premises DL host-side and virtual router/listener-side steps was negligible, with a mean of only 38.5 seconds (which includes the 30-second time-out). We surmise that investigators will encounter similar or shorter clock times for less complex use cases.
There are limitations to our pilot study. We describe clock times for 21 patients on a single task. However, any algorithm or model can be used. We include a publicly available nnU-Net model for spleen segmentation (pretrained nnU-Net model Task009_Spleen) in our GitHub link to initially operationalize the deployed pipeline. In the future, end-users may wish to have an "always-on" system that sends the series of interest for every patient directly from a scanner AE Title. Given the lag time associated with post-processing, study completion by the technologist, and transfer from the scanner to PACS, sending a given series from the scanner on creation could result in substantial time savings; however, this may not be desirable without an initial rapid detection or classification step to separate positive from negative studies for a given feature of interest. We are currently working on these steps for our problem and plan to release future updates. Sending a study from PACS to the listener/router node selected from a dropdown menu is currently the only manual step. To simplify the process, we are working on an integrated PACS icon. We are also exploring solutions for pop-up notifications and auto-population of quantitative results in radiology reports. Our method currently employs nnU-Net, and investigators wishing to implement other segmentation algorithms and models will need to make minor modifications to our code.
In conclusion, we have developed and released a simple open-source vendor-agnostic PACS- and DICOM-compatible software package for automated quantitative visualization with nnU-Net. The method, intended to promote "shadow evaluation" using new cases in the clinical setting, approximates FDA-designated IPQ or CADx quantitative volumetry-based CAD tools and is meant to help advance the application of precision medicine principles for cross-sectional imaging.

Declarations
Competing interests: none.

Figure 1

Diagram of the end-to-end workflow.

Figure 2
DL host building blocks. The DL workstation receives DICOM images. A 30-second time-out triggers a callback from the router indicating that the DICOM series has been completely received, and DICOM to NIfTI conversion is then initiated. The NIfTI file is fed into the DL model (i.e., nnU-Net) for inference. The output label is converted to a DICOM SEG object and returned to the router.

Figure 3

Creating a JSON file for NIfTI to DICOM conversion using dcmqi.

Figure 4
Client-side display of DICOM images, segmentation mask, and a structured report element including pelvic hematoma volumes in mL.

Figure 5

Interoperability using various models on the back end: here, the pelvic hematoma model was swapped out for a public model trained on the spleen segmentation dataset (Task009_Spleen) from the public nnU-Net repository (https://zenodo.org/record/3734294).