Smart and real-time image dehazing on mobile devices

Haze is one of the common factors that degrade the visual quality of images and videos; it diminishes contrast and reduces visibility. The atmospheric light scattering (ALS) model, which has two unknowns to be estimated from the scene (the atmospheric light and the transmission map), is commonly used for dehazing. Modelling atmospheric light scattering is complex, and estimating the scattering is time consuming, which makes real-time dehazing difficult. In this work, a new approach to real-time dehazing is employed: it reads the orientation sensor of the mobile device and compares the amount of rotation with a pre-specified threshold to decide whether the atmospheric light must be recalculated. When the amount of rotation is small, meaning that there are only subtle changes in the scene, the previously estimated atmospheric light is reused. The system therefore does not need to recalculate it at each time instant, which accelerates the overall dehazing process. A processing time of 0.07 s per frame (~ 15 fps) is achieved for 360p imagery. The frame processing times show that our approach is superior to the state-of-the-art real-time dehazing implementations on mobile operating systems.


Introduction
Image and video dehazing are crucial for offline and online computer vision applications in security, transportation, video surveillance and the military. Consequently, the number of studies related to image enhancement has steadily increased in recent years [1]. Image dehazing is a kind of image enhancement; however, it differs from other kinds because the image deterioration depends on the distance of the scene from the observation point and on the amount of haze, globally and locally. In other words, as the distance between the sensor and the scene increases, the thickness of the haze also increases and the transmission of the medium decreases. Likewise, when the haze is dense and varies locally, the complexity of the dehazing process increases. To illustrate, Fig. 1 displays two pairs of hazy and haze-free (dehazed) images. Image (a) is a hazy image, and (b) is the result of the haze removal process applied to (a). Similarly, (c) is a hazy image and (d) is its haze-free counterpart. Since the haze is thicker in the second image pair, the haze removal operation is less effective there, and the visual quality of the dehazed image is poor.
There are many image dehazing methods, and they can be grouped into three categories: contrast enhancement [2][3][4][5], restoration [6][7][8][9][10] and fusion-based [11][12][13][14][15] methods. Contrast enhancement approaches improve the visual quality of hazy images to some degree; however, they cannot eliminate the haze efficiently. Their subcategories are histogram enhancement [16][17][18], which can be applied locally and/or globally; frequency transform methods such as the wavelet transform and homomorphic filtering; and the Retinex method, in single-scale and multi-scale form [19]. Restoration-based methods focus on recovering the lost information by modelling the image degradation and applying inverse filtering.
Since this study concerns the real-time application of image dehazing, the specifics of dehazing methods are not covered. The atmospheric light scattering (ALS) model, shown in Fig. 2, is used as the basis of our method. Equations 1-3, adopted from the study in [20], express the ALS model, where $I(x,\lambda)$ is the hazy image, $D(x,\lambda)$ is the light transmitted through the haze (after reflection from the scene) and $A(x,\lambda)$ is the air light, i.e. the atmospheric light reflected from the haze. The sensor integrates the incoming light, and the resulting imagery is the hazy image. In Eq. 2, $t(x,\lambda)$ is the transmission map of the hazy scene, $R(x,\lambda)$ is the light reflected from the scene and $L_{\infty}$ is the atmospheric light. The transmission term is expressed as $e^{-\beta(\lambda)d(x)}$, where $d(x)$ is the depth map of the scene and $\beta(\lambda)$ is the atmospheric scattering coefficient with respect to the wavelength $\lambda$. It follows from Eq. 3 that the transmission decreases as the depth from the sensor increases, and vice versa.
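Written out from the definitions in the preceding paragraph, the three relations of the ALS model take the following standard form. This is a reconstruction from the surrounding text rather than a copy of the typeset equations; the symbols $\lambda$ for the wavelength and $\beta$ for the scattering coefficient are conventional choices assumed here.

```latex
\begin{align}
I(x,\lambda) &= D(x,\lambda) + A(x,\lambda) \tag{1}\\
I(x,\lambda) &= t(x,\lambda)\,R(x,\lambda) + L_{\infty}\,\bigl(1 - t(x,\lambda)\bigr) \tag{2}\\
t(x,\lambda) &= e^{-\beta(\lambda)\,d(x)} \tag{3}
\end{align}
```

In this form, Eq. 2 splits the hazy image into an attenuated scene term and an air-light term, and Eq. 3 makes the transmission decay exponentially with depth.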
The key point of the ALS model is the accurate estimation of the transmission and the atmospheric light. The Dark Channel Prior (DCP) method [21] is one of the most commonly used methods; it uses the per-pixel dark channel prior for haze estimation and applies quadtree decomposition to estimate the atmospheric light. Another study that uses the DCP as its basis is [22], in which both per-pixel and spatial-block calculations of the dark channel are used.
Recent approaches to image dehazing are mostly based on artificial intelligence, typically deep learning models [23][24][25]. In [26], a deep architecture is developed using a Convolutional Neural Network (CNN), and a new unit called the "bilateral rectified linear unit" is added to the network. The study reports superior results compared to previous dehazing studies. The study in [27] employs an end-to-end encoder-decoder CNN architecture to obtain the haze-free images.
There are many successful image dehazing studies in the literature. However, when the focus is real-time implementation, bottlenecks such as the complexity of the algorithms, hardware constraints and high financial costs must be considered. Nonetheless, several successful studies have been reported. The study in [28] estimates the atmospheric light using super-pixel segmentation and applies a guidance filter to refine the transmission map. It reports more accurate results than other state-of-the-art models. The study in [29] proposes a parallel-processing dehazing method for mobile devices and achieves a processing time of 1.12 s per frame for HD imagery on a Windows Phone using the Central Processing Unit (CPU) and GPU together. The study in [30] uses DCP but substitutes a mean filter for the guided filter to increase the processing speed. It reports 25 fps on a C6748 pure Digital Signal Processing (DSP) device [31].
The study in [32] converts the hazy Red-Green-Blue (RGB) image to the Hue-Saturation-Value (HSV) colour space, applies global histogram flattening to the value component, modifies the saturation component to be consistent with the reduced value, and applies contrast enhancement to the value component. It achieves a dehazing time of 90 ms for HD imagery on a Graphics Processing Unit (GPU). The study in [33] processes the image in two levels. It first applies histogram enhancement, and if the resulting image meets the system requirements, no further action is taken. If it does not, DCP is used to remove the haze. This conditional strategy saves considerable time and achieves real-time processing.
The study in [34] uses a locally adaptive neighbourhood and calculates order statistics. Using this information, it produces the transmission map and obtains the haze-free image. The study in [35] parallelizes the base Retinex model and decomposes the image into brightness and contrast components. To restore the image, it applies gamma correction and non-parametric mapping, and it reports a processing time of 1.12 ms for a 1024 × 2048 high-resolution image on a parallel GPU system. The study in [36] constructs a transmission function estimator using genetic programming. This function is then used to compute the transmission map, and the transmission map and the hazy image are used to obtain the haze-free image. The system runs at a high processing rate on synthetic and real-world imagery.
Another successful real-time dehazing method is implemented in [37]. A novel pixel-level optimal dehazing criterion is proposed to merge a series of virtual haze-free candidate images into a single haze-free result. This series of images is calculated from the input hazy image by exhausting all possible values of the discretely sampled scene depth. The advantage of this method is that each pixel position is computed independently of the others. It is therefore easy to implement on a fully parallel GPU system.
The literature on dehazing single images and video is very rich, and real-time implementations are also of great interest. However, real-time processing is very rare on mobile platforms such as Android and iOS. The study in [29] implements real-time dehazing on a Windows Phone and is one of the benchmark studies against which the results of the proposed study are compared. In this paper, a DCP-based algorithm is implemented on the Android mobile operating system and reads data from the device's orientation sensor. A smart strategy is created to decide when the atmospheric light is re-estimated. If the measured movement of the device is minor, meaning that the scene does not shift much, the previously estimated atmospheric light is used to dehaze the imagery. If the movement exceeds a predetermined threshold, the estimation is done again once. This smart strategy yields a promising gain in processing time. The transmission, on the other hand, is based on the depth map, and even minor changes of orientation lead to major changes in the depth map and hence in the transmission map. Therefore, the transmission map is always calculated for each time instant.
The rest of the paper is structured as follows. The details of the proposed approach and its real-time implementation are presented in Sect. 2. The average real-time processing results and the benchmark table with other real-time studies are given in Sect. 3. Section 4 concludes the paper and gives some guidelines for potential future studies on real-time dehazing.

Proposed method
In this study, we improve the algorithm introduced in [22] by adding a smart decision method for the atmospheric light calculation. The DCP approach, information fidelity and image entropy are used to estimate the atmospheric light and the transmission map. The steps are the estimation of the dark channel prior image, estimation of the atmospheric light, estimation of the transmission, refinement of the transmission with a guided filter, and reconstruction of the haze-free image by applying Eq. 2.
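The sequence of steps can be illustrated with a minimal pure-Python sketch. It is not the algorithm of [22]: the atmospheric light is simply taken from the pixel with the brightest dark channel instead of using information fidelity and entropy, the dark channel is computed per pixel only, and the guided-filter refinement is omitted. The function names and the constants `omega = 0.95` and `t0 = 0.1` are illustrative assumptions (values commonly used with DCP), and images are nested lists of RGB tuples in the range [0, 1].

```python
def dark_channel(img):
    """Per-pixel dark channel: the minimum of the R, G, B values of each pixel."""
    return [[min(px) for px in row] for row in img]


def estimate_atmospheric_light(img):
    """Simplified stand-in: colour of the pixel with the brightest dark channel."""
    dark = dark_channel(img)
    _, y, x = max(
        (dark[y][x], y, x) for y in range(len(img)) for x in range(len(img[0]))
    )
    return img[y][x]


def estimate_transmission(img, light, omega=0.95):
    """Usual DCP estimate: t = 1 - omega * min_c(I_c / L_c) (omega is assumed)."""
    return [
        [1.0 - omega * min(c / max(l, 1e-6) for c, l in zip(px, light)) for px in row]
        for row in img
    ]


def recover(img, light, trans, t0=0.1):
    """Invert Eq. 2: R = (I - L) / max(t, t0) + L, clipped to [0, 1]."""
    out = []
    for row, trow in zip(img, trans):
        out.append([
            tuple(min(1.0, max(0.0, (c - l) / max(t, t0) + l)) for c, l in zip(px, light))
            for px, t in zip(row, trow)
        ])
    return out
```

The lower bound `t0` keeps the division stable where the estimated transmission is close to zero, which is where the haze is thickest.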
The study in [22] provides very promising accuracy results. The benchmark scores for two different hazy images are given in Tables 1 and 2, and the images and the visual results of the different methods are given in Fig. 3. In Tables 1 and 2, the comparisons are based on colorfulness, the Global Contrast Factor (GCF) and the visible edge gradient. The visible edge gradient measures visibility using the restored and hazy images. It has three indicators, $e$, $r$ and $\sigma$, where $e$ is the amount of new visible edges after dehazing, $r$ is the average ratio of the gradient norm values at visible edges, and $\sigma$ is the percentage of pixels that become black or white after processing.
The quality of dehazed images improves as $\sigma$ gets smaller and the other indicators get bigger. Although Kopf's method [39] performs well in close-range regions, it is not successful in far-range regions because it cannot remove the haze effectively. Kopf's algorithm provides promising GCF and $r$ scores; however, its colorfulness and $\sigma$ scores are not satisfactory. In addition, He's method [40] has limited performance, since it has good scores only for GCF and $\sigma$. Park's study [22] provides better results in the overall evaluation.
Park's method is therefore a successful and effective basis to improve for real-time implementation.
In this study, the time spent on atmospheric light estimation and on the other steps of the dehazing algorithm is first measured using 50 hazy images with various amounts of haze and various resolutions. The atmospheric light estimation step accounts for most of the processing time, with a mean share of 78%. Therefore, by measuring the orientation and recalculating the atmospheric light only when needed, the proposed approach removes most of this cost from frames in which the scene has barely changed.
The overall system diagram of the proposed method is shown in Fig. 4. Note that, to prevent possible synchronization problems, the dehazing operation is executed only once the atmospheric light, the transmission map and the camera data are all available. The term AOO in Fig. 4 stands for 'Amount of Orientation'. Since the device can rotate in 3D space, the pitch, yaw and roll angles are all checked in the data controller. If any of them is above a predetermined threshold, both the atmospheric light and the transmission map are recalculated. If not, the atmospheric light of the previous time instant is reused and only the transmission map is calculated. Finally, the dehazing module reconstructs the dehazed image using the camera data, the atmospheric light and the transmission map. The dehazed image is displayed on the device screen in real time.
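The decision made in the data controller can be sketched as follows. This is an illustrative sketch under stated assumptions, not the deployed Simulink logic: the class and function names are hypothetical, angles are assumed to be in degrees, and the rotation is measured against the orientation at which the atmospheric light was last estimated (the text does not state whether the reference is that pose or simply the preceding time instant). The default threshold of 12° is the value reported in this section.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (handles wrap-around)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def needs_recalculation(reference, current, threshold_deg=12.0):
    """True if pitch, yaw or roll changed by more than the threshold."""
    return any(angle_diff(r, c) > threshold_deg for r, c in zip(reference, current))


class AtmosphericLightCache:
    """Reuses the last atmospheric light until the device rotates past the threshold."""

    def __init__(self, estimator, threshold_deg=12.0):
        self.estimator = estimator          # any callable: frame -> atmospheric light
        self.threshold_deg = threshold_deg
        self.reference = None               # (pitch, yaw, roll) at the last estimation
        self.light = None
        self.recalculations = 0

    def get(self, frame, orientation):
        """Return the atmospheric light for this frame, re-estimating only if needed."""
        if self.light is None or needs_recalculation(
            self.reference, orientation, self.threshold_deg
        ):
            self.light = self.estimator(frame)
            self.reference = orientation
            self.recalculations += 1
        return self.light
```

The transmission map is deliberately left out of the cache, since it is recalculated for every frame regardless of the rotation.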
The optimal AOO threshold is determined empirically. Determining the optimal threshold is the core of the proposed study, because the atmospheric light estimation is the most important step for a high-quality reconstruction. To determine the optimal AOO for each axis, the following steps are applied:
1. Clear imagery of the scene is captured from a distance of 2 m with the Android device fixed in place.
2. The device is rotated up to 20°, in one direction only, in steps of 2° about the pitch, yaw and roll axes, and the imagery is captured at each step.
3. Haze is produced using dry ice and hot water, and step 2 is repeated. One example of a clear and hazy imagery pair is shown in Fig. 5.
4. For each hazy image, the haze-free counterpart is reconstructed twice using [22]: once with the atmospheric light calculated anew for each step, and once with the same atmospheric light that was calculated at the beginning. This gives 11 threesomes (TS), each consisting of the clear image, the haze-free image obtained with [22], and the haze-free image obtained with [22] using the same atmospheric light. The threesomes are named TS-1 ($TS_1$), TS-2 ($TS_2$), …, TS-11 ($TS_{11}$), and their members are $TS_{(x)1}$, $TS_{(x)2}$ and $TS_{(x)3}$, respectively, where $x$ denotes the threesome index. The Peak Signal to Noise Ratio (PSNR), which is based on the mean squared error, is one of the most widely used metrics for measuring the similarity of a restored image to the ground truth [41,42]. Therefore, PSNR is used in this study to measure the similarity between the clear image and the dehazed images and thereby determine an orientation threshold.
5. The PSNR between the clear image and each haze-free image is calculated and denoted $\mathrm{PSNR}(TS_{(1)1}, TS_{(1)2})$ and $\mathrm{PSNR}(TS_{(1)1}, TS_{(1)3})$. For instance, if $\mathrm{PSNR}(TS_{(1)1}, TS_{(1)3})$ has not dropped by 20% relative to $\mathrm{PSNR}(TS_{(1)1}, TS_{(1)2})$, the next threesome is processed and the same calculation is done for $\mathrm{PSNR}(TS_{(2)1}, TS_{(2)2})$ and $\mathrm{PSNR}(TS_{(2)1}, TS_{(2)3})$, and so on. Note that for each following threesome, the atmospheric light that was calculated in the dehazing process of $TS_{(1)3}$ is used for the reconstruction of $TS_{(x)3}$.
6. When $\mathrm{PSNR}(TS_{(x)1}, TS_{(x)3})$ drops 20% below $\mathrm{PSNR}(TS_{(x)1}, TS_{(x)2})$, the optimal rotation value is chosen as the rotation value of the image $TS_{(x-1)1}$.
An example threesome is given in Fig. 6; it shows dehazing results for which $\mathrm{PSNR}(TS_{(x)1}, TS_{(x)3})$ drops 20% below $\mathrm{PSNR}(TS_{(x)1}, TS_{(x)2})$. The change of the PSNR values for the yaw, roll and pitch axes with respect to the threesome index is given in Tables 3, 4 and 5, which show the PSNR as the rotation of the device changes. Starting from zero, for every 2° change of orientation a new dehazed image is reconstructed and the PSNR between the image pairs is recalculated. Since the PSNR tolerance is chosen as 20%, this continues until $\mathrm{PSNR}(TS_{(x)1}, TS_{(x)3})$ drops 20% below $\mathrm{PSNR}(TS_{(x)1}, TS_{(x)2})$. The same procedure is repeated for each of the pitch, roll and yaw axes. The optimal rotation angles found for three different scenes are given in Table 6.
According to Table 6, the optimal rotation angle is 12°.
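The selection rule of the threshold procedure can be sketched in a few lines. This is a minimal illustration, assuming images are given as flat sequences of 8-bit pixel values and that the two PSNR series (fresh atmospheric light and reused atmospheric light) have already been computed for each rotation step; the function names and the `tolerance` parameter are hypothetical, and no values from Tables 3 to 6 are reproduced.

```python
import math


def psnr(reference, test, peak=255.0):
    """PSNR in dB between two equally sized images given as flat pixel sequences."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


def optimal_rotation(angles, psnr_fresh, psnr_reused, tolerance=0.20):
    """Last angle before the reused-light PSNR falls more than `tolerance` below
    the fresh-light PSNR; None if it already fails at the first angle."""
    for i, (fresh, reused) in enumerate(zip(psnr_fresh, psnr_reused)):
        if reused < (1.0 - tolerance) * fresh:
            return angles[i - 1] if i > 0 else None
    return angles[-1]  # the tolerance was never exceeded in the tested range
```

Called with `angles = list(range(0, 22, 2))`, the function walks through the 11 threesomes of one axis in order and stops at the first one that violates the 20% tolerance.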

Design in MATLAB and deployment on Android OS
To date, there is no complete work in the literature on real-time dehazing on the Android operating system. In this study, MATLAB Simulink is used to implement the proposed method. MATLAB Simulink has Android device support for developing and deploying MATLAB code and Simulink models [43]. The Simulink model developed is given in Fig. 7. A Simulink block called "FromAppMethod" is used; it is named 'readOrientation' and coded in Android Studio. This function reads and outputs the Android device's current and preceding orientation data in real time. Since the 'size' output is not needed, it is terminated. The 'Android Camera' block reads live video from the device's camera; the camera resolution can also be set using this block. The real-time video and orientation data are fed to the 'Image dehazing' function, which compares the previous and current orientation data and runs the proposed dehazing algorithm. The next block in Simulink is the image type conversion block, which converts the type of its input to double. The 'Image splitting' block splits the RGB image into its color components R, G and B, which are then displayed on the device screen using the 'Android Video Display' block. The project is deployed on an Android device using Android Studio [44]. The MATLAB code is converted to C++ code, and Java code is produced for user updates and for declaring new functions. The Android device used has a Qualcomm® Snapdragon™ 665 octa-core processor with a frequency of up to 2 GHz and 3 GB of RAM. The camera's video resolution is up to 4K at 30 fps.
The pseudo code of the proposed method is:

Results
Theoretically, the most time-consuming part of a dehazing algorithm is the estimation of the atmospheric light. Therefore, this study focuses on finding a logical way to avoid estimating the atmospheric light at every time instant, and proposes a novel approach based on reading the orientation sensor of a mobile device and measuring the amount of orientation change prior to the dehazing operation. In many scenes, the atmospheric light does not have a very high variance. It may change from time to time according to the weather conditions; however, within a specific time interval and within specific rotation limits it does not change much. This assumption is validated in this study by applying it to three different scenes; the detailed procedure is introduced in the 'Proposed method' section. Measuring the rotation of the device therefore makes it possible to reduce the processing time, because the atmospheric light is estimated only when required instead of at each time instant. In this way, this study provides a promising real-time processing speed. The mean results are shown in Table 7 for different imagery resolutions. According to Table 7, the mean processing time for High Definition (HD) imagery ranges from 0.42 to 1.95 s per frame.
This mean time depends on the amount of movement of the device. If the amount of movement is high, meaning that within a specific time interval the device is frequently rotated beyond the optimal rotation angle, the atmospheric light is recalculated often and the processing time per frame increases towards 1.95 s. If the rotation is small, the mean processing time goes down to 0.42 s.
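The dependence on device movement can be made concrete with a simple model. The sketch below assumes that 0.42 s and 1.95 s are the per-frame costs for HD imagery without and with atmospheric light re-estimation, and that the mean time interpolates linearly with the fraction of frames that trigger a recalculation; both are modelling assumptions rather than measurements, and the function names are illustrative.

```python
def mean_frame_time(recalc_fraction, t_reuse=0.42, t_recalc=1.95):
    """Mean seconds per frame when a fraction of frames re-estimate the atmospheric light."""
    if not 0.0 <= recalc_fraction <= 1.0:
        raise ValueError("recalc_fraction must lie in [0, 1]")
    return (1.0 - recalc_fraction) * t_reuse + recalc_fraction * t_recalc


def max_speedup(light_share):
    """Upper bound on the speed-up when a step taking `light_share` of the time is skipped."""
    return 1.0 / (1.0 - light_share)
```

Under these assumptions, `max_speedup(0.78)` is about 4.5, which is close to the ratio 1.95 / 0.42 ≈ 4.6 between the two reported extremes and is therefore consistent with atmospheric light estimation taking about 78% of the processing time.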
Secondly, as the resolution of the imagery increases, the number of pixels increases, and the processing time per frame increases accordingly.
Another important point is the threshold for atmospheric light re-estimation. The optimal value of the threshold depends on the day of the year, the cloud cover and other environmental effects. In our tests, the threshold is set to 12°, which preserves the visual quality and the PSNR of the reconstructed image. Note that this threshold was determined empirically and should be set depending on the conditions at the time of dehazing.
The processing time results of this study are compared with the results of the studies in [29,30,32,35]. The benchmark results are given in Table 8.
Note that the studies in [30,32] and [35] are not implemented on mobile operating systems or devices, so their processing times are generally better owing to the more powerful hardware they use. Although they are not directly comparable with the proposed method, they are useful references for real-time implementation, and their results are included in the benchmark table to show the current level of real-time dehazing on mobile operating systems relative to non-mobile systems.
As explained in detail in the previous sections, the proposed method exploits the fact that the atmospheric light does not vary much within a specific time interval. The novelty of our method is therefore to gauge its current variation at the time of dehazing through the device rotation and to specify a threshold rotation angle. The proposed method reads the orientation sensor of the mobile device and decides whether to recalculate the atmospheric light. Once the rotation angle exceeds the optimal threshold rotation angle, the atmospheric light is recalculated; otherwise, the previously calculated atmospheric light is used for dehazing. Since the most time-consuming part of dehazing is the estimation of the atmospheric light, the proposed approach succeeds in accelerating the dehazing process. Table 8 shows that the proposed method is faster than the other studies that implement dehazing on mobile operating systems: it outperforms the study in [29] both in the CPU-only case and in the combined CPU and GPU case.

Conclusion and future work
In this study, a new image dehazing method is implemented. It measures the change of the scene by reading the device orientation sensor in real time and uses this measurement to decide when to recalculate the atmospheric light. Since the scene does not change with high variance in many real-time dehazing applications, this study exploits a defined tolerance on the reconstruction error of the dehazed image. The study shows that it is possible to obtain high-quality dehazed images while skipping the calculation of the atmospheric light for some time instants, until a pre-defined orientation threshold is exceeded. Since the most time-consuming part of image dehazing is the atmospheric light calculation, the proposed approach accelerates the overall process and reduces the processing time for each frame, which enables dehazing in real time. While the visual quality of the reconstructed image is preserved, promising processing times are achieved despite the limited hardware power and the use of the CPU only. The results are superior to or on par with the other state-of-the-art real-time dehazing applications. The processing time results show that the proposed method can be applied in real time on devices running the Android operating system. If the hardware specifications of the system are improved, the processing time will decrease considerably. Future work should use the GPU, alone or together with the CPU, and more powerful hardware. Furthermore, a similar implementation should be done on iOS devices. Another important point is that the transmission maps may be estimated using stereo imaging, which enables more accurate estimation of the depth maps.
The next work will be based on using deep learning models and deploying the model on Android devices. This will most probably increase the visual quality as well as the processing speed.