3.2 Problem Description
The goal of low-light image enhancement is to learn, with a neural network, the mapping from a low-light image to its normal-light counterpart, given paired training images (low-light image, normal-light image). This can be expressed by the following equation:
$${I}^{\prime }=F\left({I}_{low},{I}_{normal}\right)\tag{3-1}$$
where \({I}^{{\prime }}\) denotes the enhanced image, \(F\left(\cdot \right)\) denotes the low-light enhancement neural network, \({I}_{low}\) denotes the low-light image, and \({I}_{normal}\) denotes the normal-light image. The enhancement function \(F\left(\cdot \right)\) can be realized in various ways, such as the traditional histogram equalization (HE) method from digital image processing [15] or methods based on the Retinex model [2, 5–9]. In this paper, enhancement follows the Retinex model and is divided into three main stages: image decomposition (Decomposition), illumination map enhancement (Illumination), and reflectance map recovery (Reflectance), after which the outputs are fused into the restored image. In the Decomposition stage, the input image is decomposed into an illumination map and a reflectance map for subsequent processing:
$$I,R={F}_{decom}\left({I}_{in}\right)\tag{3-2}$$
where \(I\) denotes the illumination map, which captures the brightness of the different regions of the image, and \(R\) denotes the reflectance map, which captures the contour details of the image content once the brightness component has been removed. \({F}_{decom}\left(\cdot \right)\) denotes the decomposition network, and \({I}_{in}\) denotes the input image. After decomposition, the illumination map and the reflectance map are processed separately. The decomposition network is trained by minimizing a loss function, typically the mean squared error (MSE) loss:
$${L}_{mse}={\left\Vert \widehat{I}-{I}_{p}\right\Vert }_{2}^{2}\tag{3-3}$$
where \(\widehat{I}\) denotes the normal-light image and \({I}_{p}\) denotes the predicted image. Assuming that the parameters of the k-th layer are \({W}^{\left(k\right)}\) and \({b}^{\left(k\right)}\), the final optimization objective is
$$Loss=\underset{{W}^{\left(k\right)},{b}^{\left(k\right)}}{\min}\;{L}_{mse}\tag{3-4}$$
where \({W}^{\left(k\right)}\) denotes the weights and \({b}^{\left(k\right)}\) denotes the biases of the k-th layer of the model.
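As a minimal sketch of this optimization, the following Python snippet minimizes the MSE loss of Eq. (3-3) by gradient descent. A single scalar weight `w` and bias `b` stand in for \({W}^{\left(k\right)}\) and \({b}^{\left(k\right)}\); the affine model, the synthetic data, and the names `mse_loss` and `fit_affine` are illustrative assumptions and not the network used in this paper.

```python
import numpy as np


def mse_loss(target, pred):
    """Squared L2 distance between target and prediction, as in Eq. (3-3)."""
    return float(np.sum((target - pred) ** 2))


def fit_affine(x, target, lr=0.5, steps=5000):
    """Minimize the MSE loss of pred = w * x + b by plain gradient descent.

    The scalars w and b are a toy stand-in (assumption) for the layer
    parameters W^(k) and b^(k) of a real network.
    """
    w, b = 1.0, 0.0
    n = x.size
    for _ in range(steps):
        residual = (w * x + b) - target
        # Gradients of the squared error, divided by n for a stable step size.
        grad_w = 2.0 * np.sum(residual * x) / n
        grad_b = 2.0 * np.sum(residual) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic paired data (assumption): the "normal-light" patch is an
# affine brightening of the "low-light" patch.
rng = np.random.default_rng(0)
low = rng.uniform(0.0, 0.3, size=(8, 8))
normal = 3.0 * low + 0.05

initial_loss = mse_loss(normal, low)
w, b = fit_affine(low, normal)
final_loss = mse_loss(normal, w * low + b)
```

In this toy setting the fitted parameters approach the values used to generate the data, and the loss after training is far below its initial value. A real enhancement network has many layers, and its gradients are obtained by backpropagation instead of the closed-form expressions above.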
In the Illumination stage, the illumination maps decomposed from the low-light image and from the normal-light image are both fed to the network, which learns the luminance mapping from the low-light illumination map to the normal-light illumination map and outputs the enhanced illumination map. This is expressed as:
$${I}_{illum}={F}_{illum}\left({I}_{low},{I}_{normal}\right)\tag{3-5}$$
where \({I}_{illum}\) denotes the enhanced illumination map, \({I}_{low}\) denotes the illumination map obtained by decomposing the low-light image, and \({I}_{normal}\) denotes the illumination map obtained by decomposing the normal-light image. \({F}_{illum}\left(\cdot \right)\) denotes the illumination enhancement network, which takes the illumination maps obtained under low-light and normal-light conditions and produces the enhanced illumination map for the low-light image. This network is likewise optimized by minimizing a loss function.
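To illustrate what learning a luminance mapping from paired illumination maps can look like, the sketch below fits a single gamma exponent so that \({I}_{low}^{\gamma }\) approximates \({I}_{normal}\). The gamma curve is a deliberately simple, hand-picked stand-in for \({F}_{illum}\left(\cdot \right)\), the fit is a least-squares fit in log space instead of a pixel-space loss, and the function names and synthetic data are assumptions made for this example only.

```python
import numpy as np


def fit_gamma(illum_low, illum_normal, eps=1e-6):
    """Least-squares fit of gamma in log space: log(normal) ~ gamma * log(low)."""
    log_low = np.log(np.clip(illum_low, eps, 1.0))
    log_normal = np.log(np.clip(illum_normal, eps, 1.0))
    return float(np.sum(log_low * log_normal) / np.sum(log_low ** 2))


def enhance_illumination(illum_low, gamma, eps=1e-6):
    """Apply the fitted gamma curve to a low-light illumination map."""
    return np.clip(illum_low, eps, 1.0) ** gamma


# Synthetic pair (assumption): the normal-light illumination map is a
# gamma-brightened copy of the low-light illumination map.
rng = np.random.default_rng(1)
illum_low = rng.uniform(0.02, 0.3, size=(16, 16))
illum_normal = illum_low ** 0.4

gamma = fit_gamma(illum_low, illum_normal)
illum_enhanced = enhance_illumination(illum_low, gamma)
```

Because the synthetic target was generated with a gamma curve, the fit recovers that exponent and the enhanced map matches the target. A learned network can represent spatially varying mappings that a single global exponent cannot.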
Meanwhile, in the Reflectance stage, the reflectance maps decomposed from the low-light image and from the normal-light image are both fed to the network, which recovers the information in the low-light reflectance map, including details and edges. This is expressed as:
$${R}_{reflect}={F}_{reflect}\left({R}_{low},{R}_{normal}\right)\tag{3-6}$$
where \({R}_{reflect}\) denotes the recovered reflectance map, \({R}_{low}\) denotes the reflectance map obtained by decomposing the low-light image, and \({R}_{normal}\) denotes the reflectance map obtained by decomposing the normal-light image. \({F}_{reflect}\left(\cdot \right)\) denotes the reflectance recovery network, which takes the reflectance maps obtained under low-light and normal-light conditions and produces the recovered reflectance map for the low-light image. This network is likewise optimized by minimizing a loss function. Finally, the outputs of the Illumination and Reflectance stages are fused to obtain the enhanced image:
$${I}_{out}={I}_{illum}\cdot {R}_{reflect}\tag{3-7}$$
In summary, low-light image enhancement here decomposes the image into an illumination map and a reflectance map in order to decouple brightness from image content. Enhancing the illumination map restores the brightness of the original image, while recovering the reflectance map avoids introducing excess noise.
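The decompose-then-fuse structure of Eqs. (3-2) and (3-7) can be sketched in a few lines of Python. Here the per-pixel maximum over the color channels serves as a hand-crafted illumination estimate in place of the learned \({F}_{decom}\left(\cdot \right)\), a fixed square-root curve replaces \({F}_{illum}\left(\cdot \right)\), and the reflectance map is passed through unchanged; all of these choices, and the names `decompose` and `fuse`, are assumptions for illustration and not the method of this paper.

```python
import numpy as np


def decompose(image, eps=1e-4):
    """Split an RGB image of shape (H, W, 3) in [0, 1] into illumination and reflectance.

    Hand-crafted stand-in (assumption) for F_decom: the illumination is the
    per-pixel maximum over channels, and the reflectance is the image
    divided by that illumination.
    """
    illum = np.maximum(image.max(axis=2, keepdims=True), eps)
    reflect = image / illum
    return illum, reflect


def fuse(illum, reflect):
    """Element-wise product of illumination and reflectance, as in Eq. (3-7)."""
    return np.clip(illum * reflect, 0.0, 1.0)


# Synthetic dark image (assumption).
rng = np.random.default_rng(2)
image_low = rng.uniform(0.01, 0.2, size=(4, 4, 3))

illum, reflect = decompose(image_low)

# Fusing the untouched maps reproduces the input image.
reconstructed = fuse(illum, reflect)

# Brightening only the illumination map (fixed square-root curve, assumption)
# raises the brightness while the reflectance map is left unchanged.
brightened = fuse(illum ** 0.5, reflect)
```

The example shows the decoupling described above: brightness is changed only through the illumination map, and the product with the reflectance map carries the image content into the output.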