Vehicular Electronic Image Stabilization System Based on a Gasoline Model Car Platform

Noise, vibration and harshness (NVH) problems are a persistent challenge in vehicle engineering, for both traditional and intelligent vehicles. Although high-accuracy manufacturing, modern structural roads and advanced suspension technology have already significantly reduced NVH problems and their impacts, off-road conditions, obstacles and extreme operating conditions can still trigger NVH problems unexpectedly. This paper proposes a vehicular electronic image stabilization (EIS) system to solve the vibration problem of the camera and preserve the environment perception function of vehicles. Firstly, feature point detection and matching based on the oriented FAST and rotated BRIEF (ORB) algorithm are implemented to match images in the process of EIS. Furthermore, a novel improved random sample consensus algorithm (i-RANSAC) is proposed to eliminate mismatched feature points and significantly increase the matching accuracy. An adaptive Kalman filter (AKF) is then applied to improve the adaptability of the vehicular EIS. Finally, an experimental platform based on a gasoline model car was established to validate its performance. The experimental results show that the proposed EIS system can satisfy vehicular performance requirements even under off-road conditions with obvious obstacles.


Introduction
NVH problems are increasingly important issues in the automobile industry because of their implications for environmental noise pollution, passenger comfort and vehicle performance. Although high-accuracy manufacturing, modern structural roads and advanced suspension technology have already significantly reduced NVH problems and their impacts, off-road conditions, obstacles and extreme operating conditions can still trigger NVH problems unexpectedly. For the visual environment perception function of a running vehicle specifically, the inevitable bumping and shaking induce jitter in the image sequences captured by the vehicular camera, which hinders the subsequent observation and interpretation of information in the images. The induced jitter can be mitigated by introducing a mechanical damping structure or EIS. Because it relies on image processing, which costs less than additional mechanical structures, EIS has become a common solution. Dating back to the 1980s, Jean et al. [1] developed an EIS system for a reconnaissance vehicle at a resolution of 640×480. At present, EIS is widely used in industry. The real-time EIS technology launched by AMD in 2017 can process online video in real time and is compatible with any rendering mode [2]. Huawei combined an artificial intelligence algorithm with EIS and named it AIS [3], which first appeared on its P20 series mobile phones. With this technology, the viewfinder frame can remain static when the vibration amplitude is small, allowing for multiframe synthesis. Most achievements in industry focus on improving the whole system, while most attention in academia is on enhancing the performance of a certain part of the EIS system.
Ignoring preprocessing such as grayscale conversion, EIS mainly consists of three parts: ① estimating the image transformation matrix of the current frame with respect to the reference frame; ② filtering the state variables derived from the transformation matrix; ③ inverse compensation and output of the current frame [4].
Estimation of the image transformation matrix is the most important step. When calculating the transformation matrix of the current frame with respect to the reference frame, the relevant parameters are often used to derive a vector known as the global motion vector. The process of calculating the global motion vector is motion estimation. Motion estimation methods mainly include the block matching method, gray projection method, phase correlation matching method, bit plane matching method, feature matching method, optical flow method and so on [5]. The block matching method and the feature matching method are considered to have higher matching accuracy. However, the block matching method needs a large search area for the matching block to prevent the search results from falling into a local optimum, which sacrifices image processing speed. Therefore, this paper focuses on EIS based on the feature matching method. Image features include point features, line features, edge features and so on [6]. The point feature has become a widely used feature description method because its subsequent matching process is easy. The Harris feature point detection algorithm was first proposed by Harris et al. [7] in 1988. It performs a convolution calculation on the image through the derivative of the Gaussian function. The Harris algorithm is relatively stable in dealing with rotation and brightness changes, but it does not have scale invariance. Lowe [8] proposed the scale-invariant feature transform (SIFT), which has excellent scale invariance and has been widely used in related fields. Based on SIFT, Bay et al. [9] proposed the speeded up robust features (SURF) algorithm. This algorithm uses a fast Hessian matrix to detect feature points and uses an integral image method to reduce the calculation time, which greatly improves the efficiency of the algorithm. The features from accelerated segment test (FAST) algorithm was proposed by Edward et al. [10] in 2006. FAST determines feature points by examining the pixel values around a candidate point. These four feature point extraction algorithms are the most widely used methods. Based on these four methods, various EIS algorithms have been developed [11][12][13][14].
The process of filtering the state variables derived from the transformation matrix is motion filtering. Its purpose is to distinguish the subjective motion from the non-subjective jitter, so as to compensate for the non-subjective jitter in the subsequent image processing. Commonly used filter algorithms in EIS include the mean filter, least-squares fitting filter, B-spline curve fitting filter and Kalman filter (KF). Mean filtering and least-squares filtering cannot be operated in real time. The method based on the B-spline curve relies on a kinematic model [15]. The method based on KF has become the mainstream method in various pieces of research [16][17][18]. However, the effect of KF is sensitive to the noise parameter settings of the system, the frequency and amplitude of random motion, etc. [19]. Researchers have done a lot of work on these problems. Park et al. [20] proposed a new image stabilization method based on a finite impulse response filter, which is more robust against mistuning of the model parameters than the KF. Choi et al. [21] used an extended Kalman filter in airborne imaging to remove the jitter of the camera and retain scanning motion. Yang et al. [22] proposed a novel stabilization algorithm based on a particle filter in EIS. Zhu et al. [23] made further improvements based on that work. Besides, Ioannidis et al. [24] used the basic features of the Hilbert-Huang transform to separate the local motion signals, which is a novel method in the field of EIS.
The last step of EIS is to compensate and output the current frame inversely. After the image is inversely compensated according to the filtered global motion vector, there is a blank part in the image. The blank part needs to be cropped before output of the current frame. Although this step is indispensable in the EIS system, it is of relatively low importance, so this paper does not review it in detail.

From the above studies, the following research trends and shortcomings can be summarized: ① the existing EIS systems are mainly used for handheld recording devices and are rarely constructed from the perspective of vehicles; ② the filtering has a great impact on the performance of EIS. Irregular road excitation easily leads to filter divergence, especially for the KF with fixed system noise, which is rarely mentioned in the existing studies; ③ the real-time performance of EIS has become a concern of many researchers, especially in feature detection and filtering. However, the real-time performance of the matching process has not received close attention.
The image sequence captured by the vehicle camera vibrates due to vehicle vibration or harsh road conditions. This paper mainly studies the use of EIS technology to solve inter-frame blur, a form of video blur. In Section 2, the image stabilization technology based on feature point detection and matching is selected, and the ORB algorithm is used to meet the real-time requirements. Furthermore, in order to enhance the speed and accuracy of the elimination of mismatched point pairs, the random sample consensus (RANSAC) algorithm is improved in Section 3. In Section 4, aiming to adapt to the various jitter conditions of the in-vehicle scenario, this paper adopts the AKF algorithm to solve the problem that the classical KF is sensitive to initial values. To verify the performance of EIS, a gasoline model vehicle with remarkable vibration characteristics is refitted for experiments in Section 5. After processing by the EIS proposed in this paper, the average peak signal-to-noise ratio (PSNR) of the video is improved by 1.26 dB as shown in Section 6, which shows that the proposed EIS system can satisfy vehicular performance requirements even under extreme conditions. Finally, conclusions are provided in Section 7.

Selection of Image Matching Algorithm
Image matching is the first and the most important step in estimating the image transformation matrix, because it determines the accuracy and speed of the derivation of global motion vectors. At present, a large number of algorithms for image matching have been proposed, including the block matching algorithm, gray projection algorithm, optical flow algorithm and feature matching algorithm. The optical flow method and the gray projection method cannot adapt to complex scene changes. The block matching algorithm and the feature matching method are better choices for vehicle demands. For the block matching algorithm, a large search area for the matching blocks is required to satisfy the needs of the vehicle. Otherwise, the search results may easily fall into a local optimum. However, the search time grows with the size of the search area for the matching blocks. Under the real-time requirements of the vehicle, the accuracy of the block matching algorithm is therefore limited. Based on the above discussion, the feature matching algorithm is the most suitable for vehicle demands. Section 1 identified Harris, SIFT, SURF and FAST as the most widely used feature point extraction algorithms, and ORB builds on FAST. The processing times of one frame at a resolution of 480×480 pixels using Harris, SIFT, SURF and ORB are shown in Table 1.
ORB algorithm has significant advantages in processing time. Besides, the number of extracted points is about the same as that of other algorithms.

ORB Algorithm Principle
ORB algorithm, used to detect and describe the feature points, is the combination of FAST and improved binary robust independent elementary features algorithm (BRIEF) [25].
FAST algorithm finds key points in the image, such as corner points. Generally, feature points possess the characteristic of sharply varying pixel values among the surrounding pixels. As shown in Figure 1, by comparing the gray value of point P with the gray values of 16 surrounding points, whether P is a corner point is determined.
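The comparison against the 16 surrounding points can be sketched as a segment test. This is a minimal illustration only: the threshold of 20 gray levels and the required run of 9 contiguous pixels are assumptions chosen for the example, not values stated in this paper, and `is_fast_corner` is a hypothetical helper name.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 circle around P, in ring order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def is_fast_corner(img, x, y, threshold=20, min_run=9):
    """Segment test: P is a corner if at least `min_run` contiguous ring pixels
    are all brighter than I(P) + threshold or all darker than I(P) - threshold.
    `threshold` and `min_run` are illustrative assumptions."""
    center = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):  # 1 looks for a brighter run, -1 for a darker run
        flags = [sign * (value - center) > threshold for value in ring]
        run = 0
        # Walk the ring twice so that runs wrapping past the last index count.
        for flag in flags + flags:
            run = run + 1 if flag else 0
            if run >= min_run:
                return True
    return False
```

On a synthetic 7×7 patch, a right-angle intensity corner at P passes the test, while a flat patch or a straight edge through P does not.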
The output of the FAST corner detection algorithm is the coordinates of the corner points. To match the corners detected in the current frame and the reference frame, a descriptor is needed to describe the nature of the corner points. ORB uses the BRIEF algorithm to describe feature points, and BRIEF uses a binary-string feature descriptor. $n$ pairs of pixels $p_i$, $q_i$ $(i = 1, 2, \ldots, n)$ are selected in the neighborhood of a feature point $P$. Generally, $n$ is 128, 256 or 512; it is set to 256 in this paper. The size of the neighborhood is $S \times S$, and $p_i$ and $q_i$ obey the Gaussian distribution $N(0, S^2/25)$. Then the gray values of each point pair are compared. If $I(p_i) > I(q_i)$, the $i$th bit in the binary string is 1, otherwise it is 0, i.e. [25]

$$\tau_i(X, Y) = \begin{cases} 1, & I(X) > I(Y) \\ 0, & \text{otherwise} \end{cases}$$

where $X$ denotes the feature point detected; $Y$ denotes the points to be compared; $I$ denotes the gray value of the point; and $i$ represents the $i$th bit in the BRIEF descriptor. By connecting the bits of the $n$ pixel pairs, a bit string is obtained. Because BRIEF does not define a main direction, it is improved by the gray centroid method. The moments of the neighborhood are defined as

$$m_{ab} = \sum_{x, y} x^a y^b I(x, y), \quad a, b \in \{0, 1\}$$

where $x$, $y$ are the pixel coordinates in the neighborhood around the feature point. The gray centroid and the resulting main direction can be determined by

$$C = \left( \frac{m_{10}}{m_{00}}, \frac{m_{01}}{m_{00}} \right), \qquad \theta = \arctan\left( \frac{m_{01}}{m_{10}} \right)$$

Therefore, ORB has rotation invariance, which is essential for on-board working conditions.
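The binary test and the gray centroid idea described above can be sketched in plain Python. This is an illustrative sketch rather than the implementation used in the paper: the patch size of 31, the clipping of sampled offsets to the patch, and the helper names `sample_pairs`, `brief_descriptor`, `centroid_orientation` and `hamming` are all assumptions. The step in which ORB rotates the test pairs by the estimated angle is not shown.

```python
import math
import random


def sample_pairs(n=256, patch_size=31, seed=0):
    """Draw n test pairs (p_i, q_i) from N(0, S^2/25), clipped to the S x S patch."""
    rng = random.Random(seed)
    half = patch_size // 2
    sigma = patch_size / 5.0  # standard deviation of N(0, S^2/25)

    def draw():
        return max(-half, min(half, round(rng.gauss(0.0, sigma))))

    return [((draw(), draw()), (draw(), draw())) for _ in range(n)]


def brief_descriptor(img, cx, cy, pairs):
    """Bit i is 1 if I(p_i) > I(q_i), otherwise 0."""
    return [1 if img[cy + py][cx + px] > img[cy + qy][cx + qx] else 0
            for (px, py), (qx, qy) in pairs]


def centroid_orientation(img, cx, cy, radius):
    """Angle from the patch centre to its gray centroid, in radians."""
    m10 = m01 = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            value = img[cy + dy][cx + dx]
            m10 += dx * value  # first-order moment along x
            m01 += dy * value  # first-order moment along y
    return math.atan2(m01, m10)


def hamming(a, b):
    """Number of differing bits between two descriptors of equal length."""
    return sum(x != y for x, y in zip(a, b))
```

Descriptors built this way are compared with `hamming`, and a smaller distance means a better match. `atan2` is used instead of a plain arctangent so that the angle is defined in all four quadrants.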

Results of Feature Points Detection and Matching
It is found that the average processing time of each frame is 6.01 ms, and the average numbers of detected points and matched feature point pairs in each frame are 483.7 and 223.4, respectively.

Image Transformation Matrix
The essence of judging whether a point pair is mismatched is to judge whether this point pair obeys the image transformation matrix $H$. Therefore, $H$ needs to be determined before the mismatched points are eliminated. An appropriate motion model, from which the global motion vector is calculated, is essential for a good image stabilization effect. Although a more complex motion model demands more computation, a relatively complex model is still necessary to describe the motion captured by the image. The motion model adopted should be able to adapt to working conditions in which the vehicle body shakes with large amplitude. The 4-parameter similarity transformation model is used to describe the rotation and translation motion with good prediction accuracy and takes the form

$$\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = s \begin{bmatrix} \cos\Delta\theta & -\sin\Delta\theta \\ \sin\Delta\theta & \cos\Delta\theta \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} + \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}$$

where $(x_1, y_1)$ and $(x_2, y_2)$ represent the coordinates in the reference frame and the current frame, respectively. $\Delta\theta$ represents the roll angle between the two frames. $\Delta x$ and $\Delta y$ denote the lateral displacement and the vertical displacement of the current frame with respect to the reference frame, respectively. $s$ represents the scaling factor.
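To make the 4-parameter model concrete, the sketch below applies a similarity transformation to a point and recovers the four parameters from matched point pairs by closed-form least squares. It assumes the common convention in which the scaled rotation is applied first and the translation is added afterwards; the function names are hypothetical and the fitting method is an illustration, not necessarily the estimator used in this paper.

```python
import math


def apply_similarity(point, s, theta, tx, ty):
    """Map a reference-frame point into the current frame."""
    x, y = point
    c, sn = math.cos(theta), math.sin(theta)
    return (s * (c * x - sn * y) + tx, s * (sn * x + c * y) + ty)


def fit_similarity(ref_pts, cur_pts):
    """Least-squares estimate of (s, theta, tx, ty) mapping ref_pts onto cur_pts."""
    n = len(ref_pts)
    mx1 = sum(p[0] for p in ref_pts) / n
    my1 = sum(p[1] for p in ref_pts) / n
    mx2 = sum(p[0] for p in cur_pts) / n
    my2 = sum(p[1] for p in cur_pts) / n
    num_a = num_b = den = 0.0
    for (x1, y1), (x2, y2) in zip(ref_pts, cur_pts):
        dx1, dy1, dx2, dy2 = x1 - mx1, y1 - my1, x2 - mx2, y2 - my2
        num_a += dx1 * dx2 + dy1 * dy2
        num_b += dx1 * dy2 - dy1 * dx2
        den += dx1 * dx1 + dy1 * dy1
    a, b = num_a / den, num_b / den  # a = s*cos(theta), b = s*sin(theta)
    s = math.hypot(a, b)
    theta = math.atan2(b, a)
    tx = mx2 - (a * mx1 - b * my1)
    ty = my2 - (b * mx1 + a * my1)
    return s, theta, tx, ty
```

At least two distinct point pairs are needed, since each pair contributes two equations for the four unknowns.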

Improved RANSAC Algorithm
The image transformation matrix of the current frame with respect to the reference frame can be derived from the matched feature points by a fitting method such as least squares. However, the accuracy of the transformation matrix is affected by the mismatched feature points. The RANSAC algorithm divides all points into two types: inliers and outliers. Inliers are points that satisfy the model, while outliers are the interference points. In this way, it prevents the calculation results from being affected by outliers.
The specific implementation process of RANSAC algorithm is shown as follows [26].
Step 1: Set the minimum number of point pairs s which can be used to derive H between two frames. Select s pairs of points without repeating to form a point set S r .
Step 2: Set the number of iterations $k$. Suppose the number of point pairs is $n$ and the number of inliers is $m$. The probability of all points in set $S_r$ being inliers is

$$\left( \frac{m}{n} \right)^s$$

Within $k$ iterations, the probability of at least one $S_r$ only containing inliers is $p$. $k$ and $p$ satisfy

$$1 - \left[ 1 - \left( \frac{m}{n} \right)^s \right]^k \ge p$$

Then $k$ can be derived from the inequality above

$$k \ge \frac{\ln(1 - p)}{\ln\left[ 1 - (m/n)^s \right]}$$

Step 3: Determine the number of inliers that satisfy $H$, with the judgment criterion given by

$$\left\| p_i' - H p_i \right\| < e$$

where $p_i'$ and $p_i$ represent the coordinates in the current frame and the reference frame, respectively. $e$ refers to the error threshold used to distinguish inliers from outliers. The total number of inliers is counted as $M$.
Step 4: The transformation matrix H corresponding to the maximum M is the best matrix to be found.
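The four steps can be sketched as follows, assuming the usual bound $1 - (1 - (m/n)^s)^k \ge p$ for the iteration count. To keep the example short, the model is reduced to a pure translation, so the minimal sample is a single point pair. This simplification, the 0.99 default confidence and the function names are assumptions for illustration, not the configuration used in this paper.

```python
import math
import random


def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Smallest k such that 1 - (1 - w^s)^k >= p, with w = m/n."""
    good_sample = inlier_ratio ** sample_size
    if good_sample <= 0.0:
        raise ValueError("inlier ratio must be positive")
    if good_sample >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - good_sample))


def ransac_translation(pairs, error_threshold, iterations, seed=0):
    """RANSAC with a translation-only model (a deliberate simplification of H).
    pairs: list of ((x_ref, y_ref), (x_cur, y_cur))."""
    rng = random.Random(seed)
    best_shift, best_inliers = None, []
    for _ in range(iterations):
        (xr, yr), (xc, yc) = rng.choice(pairs)  # minimal sample: one pair
        shift = (xc - xr, yc - yr)
        # A pair is an inlier if its own shift is within e of the candidate.
        inliers = [pair for pair in pairs
                   if math.hypot(pair[1][0] - pair[0][0] - shift[0],
                                 pair[1][1] - pair[0][1] - shift[1]) < error_threshold]
        if len(inliers) > len(best_inliers):
            best_shift, best_inliers = shift, inliers
    return best_shift, best_inliers
```

For example, with half of the pairs being inliers and a sample size of 2, `ransac_iterations(0.5, 2)` gives 17 iterations for a 99% chance of drawing at least one clean sample.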
Although the RANSAC algorithm has a certain robustness, a few defects exist in practical engineering: ① If the matching accuracy is not high enough, a large number of outliers leads to an increase in the number of iterations. ② If the random points are too concentrated, the accuracy of the transformation matrix is seriously affected. ③ If the selected feature points contain outliers, the entire iteration is still carried out, which significantly wastes calculation time.
The RANSAC algorithm is improved in this paper, in consideration of the shortcomings above. The specific implementation steps of i-RANSAC are as follows.
Step 1: Rank the point pairs by Hamming distance $D_H$. Remove a pair of points if

$$D_H > E + K\sigma$$

where $E$ denotes the mean Hamming distance of the point pairs and $\sigma$ denotes their variance. $K$ can be used to adjust the number of removed points.
Step 2: Set the minimum number of point pairs s which can be used to derive H . Select s pairs of points without repeating in different grids to form a set S r .
Step 3: Calculate the number of iterations.
Step 4: Select 3 pairs of points and determine how many of them are inliers. If fewer than 2 are inliers, abandon this iteration and proceed to the next one.
Step 5: The transformation matrix H corresponding to the maximum M is the best matrix.
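Steps 1 and 2 of i-RANSAC can be sketched as a pre-filter on Hamming distance followed by sampling from distinct grid cells. The sketch makes several assumptions that are not stated in the paper: matches are stored as dictionaries with hypothetical `distance` and `ref` keys, the spread $\sigma$ is computed as a standard deviation so that it has the same unit as the distance, and the grid is a uniform square grid over the reference frame.

```python
import random
import statistics


def filter_by_hamming(matches, k_factor=1.0):
    """Step 1: drop matches whose Hamming distance exceeds E + K * sigma."""
    distances = [m["distance"] for m in matches]
    mean = statistics.mean(distances)
    spread = statistics.pstdev(distances)  # assumption: sigma as standard deviation
    return [m for m in matches if m["distance"] <= mean + k_factor * spread]


def sample_from_distinct_cells(matches, sample_size, cell_size, rng):
    """Step 2: pick sample_size matches whose reference points fall in different cells."""
    by_cell = {}
    for m in matches:
        x, y = m["ref"]
        cell = (int(x // cell_size), int(y // cell_size))
        by_cell.setdefault(cell, []).append(m)
    if len(by_cell) < sample_size:
        raise ValueError("not enough occupied grid cells")
    cells = rng.sample(sorted(by_cell), sample_size)
    return [rng.choice(by_cell[cell]) for cell in cells]
```

Sampling from different cells spreads the minimal sample over the image, which addresses defect ② of the classical algorithm; the early exit of Step 4 is not shown here.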

Results of Eliminating Mismatched Feature Points
The image transformation matrix between two frames should be an identity matrix when the vehicle is static. However, moving cars, pedestrians or even swinging leaves introduce some mismatched points, which make the transformation matrix deviate from the identity matrix. Exploiting this feature, the performance of i-RANSAC can be evaluated (Figure 3).
We propose a novel definition, feature points matching accuracy (FPMA) of a specific segment of a processed video, to assess the performance of the mismatched points elimination algorithms. Denoted by A m , FPMA is defined as Eq. (10).
where k max denotes the total number of frames of the processed video segment.
Obviously, y should be 0 theoretically. An $A_m$ closer to 0 therefore corresponds to an algorithm with better performance. Since the averages of FPMA after processing with RANSAC and i-RANSAC are 0.068 and 0.020, respectively, the improved RANSAC proposed in this paper performs better. In addition, the average number of frames processed per second by i-RANSAC is 32.4% higher than that of RANSAC, which demonstrates the better real-time performance of i-RANSAC.

Kalman Filter
The purpose of filtering the image transformation matrix is to distinguish the subjective motion from the non-subjective jitter, so as to compensate the non-subjective jitter in the subsequent image processing. Commonly used filters in EIS include mean filtering, least-square fitting filtering, B-spline curve fitting filtering, and KF. Mean filtering and least-squares filtering require image observation states from multiple frames. Therefore, their implementation has a lag, making it difficult to meet vehicle requirements. The method based on the B-spline curve relies on the kinematics model. The method based on KF has become the mainstream method in this research field. Both process noise variance Q and observation noise variance R need to be set in advance in classical KF. As shown in Figure 4, the filter effect is completely different when the noise variance settings are Q = 0.01, R = 0.1 and Q = 0.1, R = 0.1 , respectively. Therefore, considering vehicle conditions, the fixed noise variance matrix cannot adapt to various vibration conditions, especially concerning extreme dynamics.
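The sensitivity to the fixed noise settings can be illustrated with a scalar Kalman filter. The sketch assumes a random-walk state model with a unit state transition and a direct observation, which is simpler than the model used in this paper, and `kalman_smooth` is a hypothetical name.

```python
def kalman_smooth(observations, q, r, x0=0.0, p0=1.0):
    """Scalar random-walk Kalman filter with fixed Q and R.
    Returns the filtered estimates and the gain used at each step."""
    x, p = x0, p0
    estimates, gains = [], []
    for z in observations:
        p = p + q                # predict: state transition is 1
        gain = p / (p + r)       # Kalman gain
        x = x + gain * (z - x)   # update with the innovation
        p = (1.0 - gain) * p     # posterior variance
        estimates.append(x)
        gains.append(gain)
    return estimates, gains
```

With R fixed at 0.1, raising Q from 0.01 to 0.1 raises the steady-state gain, so the output follows the observations, including the jitter, much more closely. This is the kind of difference that a fixed noise setting cannot resolve across changing road conditions.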

Adaptive Kalman Filter
AKF is applied to improve the adaptability of vehicular EIS. Correcting the parameters of the model and the noise covariance in real-time, AKF reduces the influence of model error during the prediction of state variables. This paper mainly introduces the Sage-Husa AKF algorithm [27] into the proposed EIS and makes certain improvements.
Generally, the process noise variance Q and the observation noise variance R vary widely in operation and cannot be determined accurately. If the pre-defined Q and R are smaller than the actual noise variance, the resulting small uncertainty range of the true value leads to biased estimation and filtering divergence. Conversely, if the pre-defined Q and R are larger than the actual noise variance, the state estimation error increases and the filter may diverge in a statistical sense. Therefore, the construction of AKF and online adjustment of Q and R are of great significance in improving the accuracy and stability of the filter. On this basis, the forgetting factor is introduced to give the Sage-Husa AKF the ability to estimate unknown time-varying noise in real time. Using the measurement data for recursive filtering, the adaptive filtering algorithm estimates and corrects the statistical characteristics of the process noise and measurement noise in real time. The Sage-Husa AKF method is simple in principle and has good real-time performance, so it has been widely used in engineering fields.
In KF, the state equation and observation equation are given by

$$X_k = \Phi_{k,k-1} X_{k-1} + W_{k-1} \tag{11}$$

$$Z_k = H_k X_k + V_k \tag{12}$$

where $X_k$ is the state vector, and $\Phi_{k,k-1}$ is the state transition matrix. $Z_k$ is the observation vector. $H_k$ is the observation matrix. $W_{k-1}$ and $V_k$ denote the system noise and the observation noise, respectively. $W_{k-1}$ obeys the $N(0, Q_{k-1})$ distribution and $V_k$ obeys $N(0, R_k)$.
With the state vector $X_k = [x, \dot{x}, y, \dot{y}, \theta, \dot{\theta}]^T$, $x$ and $y$ denote the lateral and vertical displacements of the current frame with respect to the reference frame, respectively. $\theta$ denotes the roll angle of the current frame with respect to the reference frame. $\Phi_{k,k-1}$ and $W_{k-1}$ are given by

where $x$, $y$ and $\theta$ are predicted states.
The observation state vector is $Z_k = [x, y, \theta]^T$. $H_k$ and $V_k$ are given by

In AKF, the means of the prediction noise and the observation noise are not considered as 0, but as $q_k$ and $r_k$, respectively. The superscript is adopted to distinguish the prediction state from the observation state. Then, the state Eq. (11) and observation Eq. (12) are given by

where $E(q_{k-1}) = q_k$, $E(r_k) = r_k$. The other parameters are the same as in Eqs. (11) and (12). Then, the processes of AKF are given as follows:

Step 1: Calculate the one-step prediction state $X_{k,k-1}$ and the noise covariance matrix $P_{k,k-1}$.

Step 2: Update the filter gain $K_k$.

Step 3: Calculate the residual $\varepsilon_k$.

Step 4: Update the state vector and the noise covariance matrix.

Step 5: Calculate the weighting factor $d_k$.

In Eq. (23), $b$ is the relaxation factor, $b \in (0, 1)$.
Considering that the measurement accuracy has a positive correlation with the number of points, R k is updated as where r is a parameter to indicate the influence of the number of points on the measurement accuracy.
A larger $r$ indicates a greater impact of the number of points on measurement accuracy. Through this method, this paper improves AKF for the vehicular EIS scenario. As shown in Figure 5, AKF retains the subjective motion vector and filters out vibration. More importantly, the initial noise setting has almost no effect on the filtering results, which means that AKF has the ability to adapt to various types of system noise.
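The five steps of the adaptive filter can be sketched in scalar form. This follows the textbook Sage-Husa recursion with the weighting factor $d_k = (1 - b)/(1 - b^{k+1})$, under several simplifying assumptions that are not taken from this paper: a random-walk state with direct observation, zero noise means, a fixed Q, and a small floor that keeps the estimated R positive. The paper's own modification, which scales R by the number of feature points, is not reproduced, and the function names are hypothetical.

```python
def weighting_factor(k, b):
    """d_k = (1 - b) / (1 - b^(k+1)); starts at 1 and tends to 1 - b."""
    return (1.0 - b) / (1.0 - b ** (k + 1))


def sage_husa_filter(observations, q0=0.01, r0=0.1, b=0.95,
                     x0=0.0, p0=1.0, r_floor=1e-6):
    """Scalar filter whose observation-noise variance R is re-estimated online."""
    x, p, q, r = x0, p0, q0, r0
    estimates, r_history = [], []
    for k, z in enumerate(observations):
        x_pred = x                      # Step 1: one-step prediction
        p_pred = p + q
        gain = p_pred / (p_pred + r)    # Step 2: filter gain
        residual = z - x_pred           # Step 3: residual
        x = x_pred + gain * residual    # Step 4: state and covariance update
        p = (1.0 - gain) * p_pred
        d = weighting_factor(k, b)      # Step 5: weighting factor
        # Blend the previous R with the residual-based estimate; the floor
        # (an assumption of this sketch) prevents a non-positive variance.
        r = max(r_floor, (1.0 - d) * r + d * (residual ** 2 - p_pred))
        estimates.append(x)
        r_history.append(r)
    return estimates, r_history
```

Because $d_0 = 1$, the initial value of R is replaced by a data-driven estimate after the first sample, which is one way to see why the initial noise setting has little influence on the result.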

Framework of the Model Car
In order to verify the adaptability of the developed EIS system to various conditions, an experimental platform able to provide extreme scenarios is desired. Accordingly, a gasoline model car with abundant NVH characteristics was built, with which off-road conditions, obstacles and extreme operating conditions can be easily implemented. The framework of the gasoline model car is shown in Figure 6. The model car is controlled by two control servos, namely the steering servo and the throttle servo. The control algorithm is programmed in the STM32 microcomputer placed at the front of the model car. An encoder is installed under the chassis to collect the speed signals. Figure 7 shows the EE architecture of the platform. Concerning the computing platform, the upper computer uses a Raspberry Pi 3B+ for capturing video. The photosensitive chip of the camera used in this experiment is the Sony IMX219, which is a CMOS chip. The camera captures video at a resolution of 480×480. The STM32 microcomputer controls the gasoline model car through two PWM signals. Due to the load limitation of the model car, an offline calculation method is adopted for EIS. The calculation platform is a quad-core CPU with a base frequency of 3.2 GHz.

Experimental Road Conditions
The platform includes an encoder to collect speed signals, and it is difficult for the encoder to work stably under harsh road conditions. Therefore, we installed bumps on both front wheels during the experiment to simulate a highly uneven road, as shown in Figure 8.

Figure 9 shows the comparison of PSNR between the original video and the processed video. The average PSNR increases by 1.26 dB, which demonstrates the positive image stabilization effect of the proposed EIS system. It is worth noting that a 1.26 dB increment may seem smaller than the results in other papers. However, it is not meaningful to compare the PSNR of different EIS systems, because PSNR can be significantly improved simply by setting the filter parameters so that the filter results are smooth enough. Doing so would defeat the purpose of preserving the subjective motion vector of the car.
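For reference, PSNR for 8-bit grayscale frames can be computed as sketched below. The sketch assumes that the video-level figure is the mean PSNR between consecutive frames, which is a common convention in EIS evaluation but is not spelled out in this paper. Frames are plain nested lists and the function names are hypothetical.

```python
import math


def psnr(frame_a, frame_b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale frames."""
    total, count = 0.0, 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    mse = total / count
    if mse == 0.0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


def mean_interframe_psnr(frames):
    """Average PSNR over all pairs of consecutive frames."""
    values = [psnr(prev, cur) for prev, cur in zip(frames, frames[1:])]
    return sum(values) / len(values)
```

A steadier video has more similar consecutive frames and therefore a higher mean inter-frame PSNR, which is also why heavy smoothing inflates the metric.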

Conclusions
NVH-related problems induced by off-road conditions, obstacles and extreme operating conditions still deserve more attention in modern intelligent vehicles. They degrade not only the driving performance but also the function of the advanced driver assistance system (ADAS) in vehicles. For instance, different levels of jitter may occur in the image sequences captured by a vehicular camera and affect the subsequent observation and interpretation of information in the images. To address this problem, a vehicular EIS system based on a gasoline model car platform is proposed. The gasoline model car exhibits abundant NVH characteristics and is consequently a convenient and appropriate experimental platform with which off-road conditions, obstacles and extreme operating conditions can be easily implemented. The conclusions on the proposed vehicular EIS and the corresponding experimental results are summarized as follows.
(1) Feature point detection and matching based on the ORB algorithm are implemented to match images in the process of EIS. The average processing time of each frame is 6.0 ms. The average numbers of detected points and matched feature point pairs in each frame are 483.7 and 223.4, respectively. The ORB algorithm therefore satisfies the real-time processing requirements of vehicular applications.

(2) A new i-RANSAC algorithm is proposed to eliminate the mismatched feature points and to improve speed and accuracy under certain circumstances. A novel definition, FPMA, is proposed to quantify the performance of the algorithm on a video. The i-RANSAC algorithm shows significantly improved performance in terms of FPMA. Besides, the average number of frames processed per second by the i-RANSAC algorithm is 32.4% higher than that of the traditional RANSAC algorithm.

(3) A Sage-Husa-based AKF is applied to improve the adaptability of the vehicular EIS. Considering that the measurement accuracy has a positive correlation with the number of feature points, the update of the observation noise variance is improved. The average PSNR of the video processed with the EIS system is 1.26 dB higher than that of the original video. The results show that the proposed EIS system can satisfy vehicular performance requirements even under off-road conditions with obvious obstacles.