Multi-compartment model

The diffusion signal *S(b)* observed in the human body can generally be described as a superposition of multiple individual diffusion processes with various diffusion coefficients resulting in a multi-exponential decay function [15]:

$$S\left({b}_{i}\right)={\sum }_{i=1}^{M}{f}_{j}{e}^{-{b}_{i}{D}_{j}}, j=1, 2,\dots ,N \left[1\right]$$

where \({b}_{i}\) is the diffusion weighting b-value in mm²/s, *M* is the total number of measurements *i* with different b-values, *f**j* are the amplitudes of the exponential component with the diffusion coefficients \({D}_{j}\). In the following, *f**j* is denoted as the volume fraction, although this nomenclature is mathematically inaccurate due to the omission of correction for relaxation times and the exclusion of the prevailing proton density. *N* is the number of diffusion components. Typically, mono- or bi-exponential models are used to describe DWI data of certain organs. However, recent studies have questioned the correctness of bi-exponential fitting for the diffusion signal, particularly for renal tissue [9, 16, 8, 11]. It appears that a three-compartment model, considering the tubular volume fraction, is physiologically more appropriate in the kidney. By applying the tri-exponential fitting model the measured diffusion signal \(S\left({b}_{i}\right)\) can be defined as:

$$S\left({b}_{i}\right)={\sum }_{i=1}^{M}{f}_{j}{e}^{-{b}_{i}{D}_{j}} , j\in \left[1, 3\right] (tissue,tubule,blood) \left[2\right]$$

The diffusion coefficients \({D}_{j}\) can be assigned to diffusion processes in tissue, tubules and blood. In detail these three components relate to the restricted diffusion of water molecules in renal tissue, the pseudo-diffusion occurring inside tubules and the pseudo-diffusion component present in blood vessels. They are often referred to as the slow, intermediate and fast diffusion component, respectively. The same classification accounts for the three different volume fractions *f**j* accordingly.

The sum of all volume fractions in multi-compartment models adds up to \({\sum }_{j}^{N}{f}_{j}=1\). Therefore, the diffusion signal *S(b**i**)* for a three compartment model can be defined as:

$$S\left({b}_{i}\right)=\sum _{i=1}^{M}{f}_{slow}{e}^{-{b}_{i}{D}_{slow}}+{f}_{inter}{e}^{-{b}_{i}{D}_{inter}}+\left(1-{f}_{slow}-{f}_{inter}\right){e}^{-{b}_{i}{D}_{fast}} \left[3\right]$$

Non-linear least square fitting

The established tri-exponential model described in Eq. [3] has been incorporated into a rigid NLLS algorithm constructed around the *lsqnonlin* function of MATLAB (The Mathworks Inc., Natick, USA). To fit the synthetic signal (construction details provided below), the standard trust-region algorithm [17] was employed. Initial parameter values for D, f and boundary conditions of the non-linear fit need to be declared a priori. Hence, standard starting values and parameter ranges were chosen in accordance with literature [13], as summarized in Table 1. Following the fitting process, the NLLS algorithm produces discrete optimal values for Dj and fj with respect to the corresponding signal input.

Table 1

– Ground truth values required for the generation of synthetic multi-exponential diffusion decay data and standard simulation parameters.

Ground truth values |

Diff. coefficient blood (\({d}_{fast}\)) | 165 x 10− 3 mm²/s |

Diff. coefficient tubule (\({d}_{inter}\)) | 5.8 x 10− 3 mm²/s |

Diff. coefficient tissue (\({d}_{slow}\)) | 1 x 10− 3 mm²/s |

Vol. fraction blood (\({f}_{fast}\)) | 0.1 |

Vol fraction tubule (\({f}_{inter}\)) | 0.3 |

Vol. fraction tissue (\({f}_{slow}\)) | 0.6 |

**Standard simulation parameters** |

b-values [16] | [0, 5, 10, 20, 30, 40, 50, 75, 100, 150, 200, 250, 300, 400, 525, 750] |

SNR | 140 |

Iterations | 1000 |

**Standard NNLS parameters** |

M | 350 |

\({D}_{min}\) | 0.7 x 10− 3 mm²/s |

\({D}_{max}\) | 300 x 10− 3 mm²/s |

**Standard starting values NLLS** |

Diff. coefficients | [1.5, 30, 100] x 10− 3 mm²/s |

Volume fractions | [0.50, 0.25, 0.20] |

Non-negative least square algorithm

To analyse the diffusion signal with the NNLS approach, an implementation of the algorithm of Lawson and Hanson [14] was employed. A fitting of the synthetic renal DWI data was accomplished with an in-house software based on the *lsqnonneg* MATLAB function. The advanced regularization method of NNLS was implemented based on the open-source multi-exponential decay image analysis software AnalyzeNNLS from Bjarnason [18].

In the NNLS algorithm, the signal decay is expressed as a superposition of exponentials, similar to Eq. [1][15]:

$${y}_{i}={\sum }_{j=1}^{M}{A}_{ij}{s}_{j}, i=1, 2,\dots ,N \left[4\right]$$

with the constraint matrix \({\text{A}}_{\text{i}\text{j}}\) representing the exponentials and \({s}_{j}\) as the corresponding amplitudes for M logarithmically spaced diffusion coefficients at N diffusion components. An inverse of \({\text{A}}_{\text{i}\text{j}}\) cannot be derived due to noise contained in the signal \({y}_{i}\), resulting in an ill-posed problem. The NNLS algorithm is then used to minimize the minimal least squares \({{\chi }}^{2}\) misfit between the measured (or simulated) and modelled data

$$\chi ²=min\left[{\sum }_{i=1}^{N}{\left|{\sum }_{j=1}^{M}{A}_{ij}{s}_{j}-{y}_{i}\right|}^{2}\right], \left[5\right]$$

while all amplitudes are implicitly defined as non-negative, stating \({s}_{j}\ge 0\) [15].

Unlike non-linear optimization methods, the NNLS algorithm does not require a priori information or an initial guess of variables to solve Eq. [4]. As an output, NNLS yields amplitudes for the M exponential functions for each diffusion coefficient \({D}_{j}\). To obtain a more physiologically realistic representation, the least-square algorithm can be adapted to construct a continuous spectrum. By incorporating extra constraints into the matrix \({A}_{ij}\) one is able to alter the discrete character of the basic NNLS solution. By introducing the regularization term \(\mu\), Eq. [5] can be adjusted accordingly[18]:

$${\chi }^{2}=min\left[{\sum }_{i=1}^{N}{\left|{\sum }_{j=1}^{M}{A}_{ij}{s}_{j}-{y}_{i}\right|}^{2}+\mu {\sum }_{j=1}^{M}{\left|{s}_{j+2}-2{s}_{j+1}+{s}_{j}\right|}^{2}\right] \left[6\right]$$

The weighting factor \(\mu\) serves as a smoothing constraint that affects the curvature of the NNLS solution spectrum and ensures a robust fit. Larger \(\mu\) values result in smoother distributions satisfying the constraints at the expense of increasing misfit. For \(\mu =0\) this formula yields the least square solution \({{\chi }}_{\text{m}\text{i}\text{n}}^{2}\) from Eq. [5].[18]

The outcome of the regularized NNLS fitting entails various exponential terms, which correspond to the diffusion components identified in the signal decay curve. By plotting the associated diffusion coefficients with respect to the amplitudes, distinct peaks become evident. Individual peaks can be characterized by assessing their maximum and area under curve. This allows for the derivation of the geometric mean D and volume fraction f of the contributing exponential constituents.

Combined non-linear and non-negative LS algorithms

A unique approach was employed by combining both NNLS and NLLS methods to create a two-level analysis of the diffusion signal to overcome the starting value limitation of NLLS and thereby increase the accuracy of the fitting results. For this purpose the NLLS algorithm utilises the fitting results obtained by the NNLS algorithm as initial parameters, resulting in an advanced approach referred to as NLLS*.

Advanced NNLS algorithm with AUC constraint

In addition to the standard NNLS algorithm, we implemented an advanced fitting algorithm called NNLSAUC. It is based on the same fitting results as standard NNLS and incorporates an AUC constraint post-fitting. This modification aims to minimise the influence of inaccurately identified peaks and noise interferences. To achieve that, the NNLSAUC technique applies adaptable interval boundaries based on estimated physiological compartment ranges for the three diffusion regimes. In cases where multiple peaks are encompassed by these intervals, a weighting factor based on their respective volume fraction is applied to combine them into a single representative peak.

Simulation and reconstruction

To simulate the underlying synthetic renal diffusion data, we followed the methodology outlined in Eq. [3] and used the ground truth (gT) values presented in Table 1. The initial parameters used for the diffusion coefficients d and volume fractions f are based on the physiological conditions in the human kidney, considering the presence of three diffusion compartments [9, 13, 11]. Subsequently, the multi-exponential diffusion signals were superimposed with Gaussian noise for each b-value on a random basis, ensuring an authentic artificial signal decay with variable signal-to-noise ratio (SNR). The SNR was defined by dividing the signal at the first b-value with b = 0 s/mm² by the standard deviation of the added noise [13, 19]. For better comparison, the same simulated data was then analysed using the different algorithms mentioned above, namely NNLS and NLLS (Fig. 1). In the context of the simulation, the diffusion coefficient parameter is denoted as d, in order to eliminate any potential confusion with the limits of the NNLS fitting range Dmin or Dmax.

Values of previous works utilizing NLLS fitting were used as starting parameter [9, 20, 13, 11, 21]. The NNLS diffusion fitting range was set accordingly, to encompass the entire physiologically relevant spectrum [13, 11]. The total number of fitting iterations for all simulations was n = 1000.

Parameter variation

For parameter variations, main emphasis lay on altering values within a range relevant for routine examinations of the kidneys and feasible for research imaging experiments.

The variation in SNR ranged from 50 to 600, with primary focus on the interval between 100 and 140, reflecting the most commonly observed SNR values of routinely acquired DWI [22, 23].

In addition to the standard logarithmic distribution, other b-value compositions have been evaluated to find a suitable distribution for NNLS fitting. This work evaluates two of these other b-value compositions: an equidistant distribution and an interval distribution. The latter applies a dense concentration of b-values inside the three diffusion regimes. All b-value distributions span the same range, with bmax = 750.

To investigate the impact of the number of logarithmically spaced diffusion coefficients M on the NNLS fitting process, M was varied in increments of 50, resulting in a range of 100 to 600 possible exponential components for NNLS. Particular attention laid on the interval around M = 300, covering commonly utilized values of prior studies [13, 11].

The fitting results obtained from NNLS should be self-contained from any variations made regarding the range of possible diffusion coefficients, as defined by the choice of Dmin and Dmax. However, altering the discrete fitting range for non-negative approaches exert a significant influence, posing a common problem. Therefore, we compared our standard range (Table 1) to a shortened and an extended version. The shortened fitting range spans from Dmin = 0.9 x 10− 3 mm²/s to Dmax = 200 x 10− 3 mm²/s and the extended range encompasses values between Dmin = 0.5 x 10− 3 mm²/s and Dmax = 500 x 10− 3 mm²/s [13, 11]. A comprehensive summary of the complete parameter variations can be found in Table 2.

Table 2

– Values and sets of varied fitting parameters for NNLS.

Parameter | Variation |

SNR | 50, 100, 110, 120, 130, 140, 600 |

Equidistant b-value distribution | 0, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750 |

Interval b-value distribution | 0, 5, 10, 15, 20, 30, 50, 100, 150, 200, 250, 350, 450, 550, 650, 750 |

M | 100, 200, 250, 300, 350, 400, 600 |

Shortened Dmin and Dmax | [0.8–200] x 10− 3 mm²/s |

Extended Dmin and Dmax | [0.5–500] x 10− 3 mm²/s |

Statistics

In order to compare the quality of the simulation results and depict the deviation from the gT, the Median Absolute Percentage Deviation (MAPD) was computed. The MAPD is determined by calculating the absolute difference between the parameter estimates d and f and the gT values for all n = 1000 iterations, expressed in percent:

$$MAPD\left({x}_{i}\right)=\frac{100}{g{T}_{i}} median({|x}_{i}-{gT}_{i}|),$$

Here *x**i* represents one diffusion parameter estimate and *gT*i the corresponding ground truth value.

To ensure more robust results, the utilization of the median was preferred over traditional mean values. This approach bypasses the strong influence of outliers, themselves being heavily biased by the choice of constraint boundaries, the latter only being applicable to NNLS algorithms [24, 21].

Moreover, statistical analysis were carried out using appropriate MATLAB implementations. The visualisation of the data and additional statistical measures were executed using in-house developed software in R (version 4.1.3, R Foundation for Statistical Computing, Vienna, Austria).