Leaf surface roughness measure based on shape from focus

The SFF imaging system (Figure 1) was built at the Agricultural Information Technology Institute of Zhejiang University (Hangzhou, PRC) to capture focused and defocused images. It comprised an optical experimental platform (Fenghua Technology Co., Ltd, Shenzhen, PRC), an industrial RGB camera (Tuoriweiye Technology Co., Ltd, Shenzhen, PRC), a portable microscope (Meijing Electronics Co., Ltd, Shanghai, PRC), and a micromotion objective table (Hengyang Electronic Technology Co., Ltd, Guangzhou, PRC). The resolution of the RGB camera was 3072 × 2048 pixels. The portable microscope provided ×10 magnification and functioned as the camera lens. The micromotion objective table, screwed onto the optical experimental platform, moved the leaf vertically over a range of 12.5 mm with an accuracy of 0.5 µm. Manually turning the ratchet knob lifted the table precisely to create defocus. The camera was mounted on the portable microscope, facing down towards the table, and images were transmitted to a computer via Ethernet.
A Wyko NT9100 (Veeco, NY, US) optical profiler was used to measure leaf roughness for comparison. This instrument had a vertical scanning range of 0.1 nm to 10 mm and is well suited to measuring 3D surface topography. The measurement was made in the College of Optical Science and Engineering, Zhejiang University (Hangzhou, PRC), where the optical profiler was located. Immediately after the defocused images were captured, the leaves were moved to the optical profiler for observation. With the supporting software, the three-dimensional structure was readily reconstructed and the surface roughness obtained.

Shape from focus
Shape from focus (SFF), also known as depth from focus (DFF), is not a new notion but a classic problem in computer vision. SFF is widely studied as a passive method of estimating depth or shape from monocular focal cues [16,17,18,19].
At a given focus setting, objects at different positions along the z-axis show different degrees of blur. This is one of the most obvious cues by which human observers perceive depth in a two-dimensional image. At the microscopic level, the light radiated from each point on the object spreads onto a fuzzy circle, usually described by a point spread function (PSF), when projected onto the sensor plane (Figure 2). In particular, if the point is focused on the sensor plane, the diameter of the fuzzy circle approaches zero; the projection of point M is then a point, since the sensor plane and the focal plane coincide. For a constant focal length f and fixed positions of the lens and sensor plane, as the object moves away from the lens, the focus of point M moves towards the lens and a fuzzy circle of diameter c is detected on the sensor plane.
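The geometry described above can be summarized with the thin-lens model. This is a sketch under assumed notation: the object distance u, image distance v, lens-to-sensor distance s and aperture diameter D are symbols introduced here for illustration; only f and c appear in the text.

```latex
\frac{1}{f} = \frac{1}{u} + \frac{1}{v},
\qquad
c = D \, \frac{\lvert s - v \rvert}{v}
```

When v = s the point is in focus and c = 0. As u increases, v decreases towards f, so the focus moves towards the lens and c grows.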
Some methods, known as depth from defocus (DFD) or shape from defocus (SFD) methods, aim to compute the distance between the defocused object points and the sensor plane by estimating the diameter of the fuzzy circles. These methods are fast but inaccurate, and they also require the intrinsic and extrinsic parameters to be known in advance. According to SFF theory, a focused image surface (FIS) is defined as the surface formed by the set of points at which the object points are focused by the lens [20]. It is determined on the basis of the sharpness at each pixel across a sequence of images with continuously varied focus level, where a focus measure function (operator) computes the sharpness. In general, the FIS assigns to each pixel the frame that gives the maximal sharpness among the images, although refinements such as quadratic or Gaussian interpolation exist. With the FIS available, all the corresponding points on the object surface can be retrieved.

Image registration
Image registration aims to align two or more images of the same scene by warping and overlaying them geometrically. Most SFF workflows ignore the changes in field of view that occur while the focus level is adjusted, because image registration is computationally costly. However, we propose to correct the pixel offset, weighing the accuracy requirement against the computational complexity of measuring leaf surface roughness. The registration process is as follows.
First, features are detected using Speeded-Up Robust Features (SURF) [21]. SURF is widely used in image registration for its speed and robustness. Its scale-space construction and Haar wavelet responses make the SURF descriptor scale- and rotation-invariant. SURF performs well in a wide range of image registration tasks compared with other contemporary methods. Thus, we propose to use SURF as the feature detector in this scheme. For each image, the SURF algorithm yields dozens of features, each expressed as a 128-dimensional vector.
Second, features are matched between pairs of images according to their correspondence. The correspondence is a function that characterizes the relationship between two features; here the commonly used Euclidean distance is adopted. By calculating the distance between every pair of features in neighboring frames, we obtain the brute-force matches. The number of matches indicates, to some extent, the sharpness of the images. Images with few matches are largely defocused and contribute little to SFF. Therefore, to simplify the subsequent computation, images with fewer matches than a threshold η are removed from the dataset, where η is set according to the texture richness of the leaf surface.
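The matching step can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: descriptors are assumed to be plain lists of floats, and the distance cut-off `max_dist` is a hypothetical parameter added here so that only close matches are counted against η.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_matches(desc_a, desc_b):
    """For every descriptor in frame A, find its nearest descriptor in frame B.

    Returns a list of (index_in_a, index_in_b, distance) tuples.
    """
    if not desc_b:
        return []
    matches = []
    for i, da in enumerate(desc_a):
        j, dist = min(
            ((j, euclidean(da, db)) for j, db in enumerate(desc_b)),
            key=lambda pair: pair[1],
        )
        matches.append((i, j, dist))
    return matches


def keep_frame(matches, eta, max_dist):
    """Keep a frame only if at least eta matches are closer than max_dist.

    max_dist is an assumption of this sketch; the text only specifies
    a threshold eta on the number of matches.
    """
    good = [m for m in matches if m[2] <= max_dist]
    return len(good) >= eta
```

A frame pair with too few close matches would be dropped before homography estimation, which mirrors the role of η in the text.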
Then, homography matrices are estimated from the matches. In theory, adjusting the focus level distorts the images only by scaling, but in practice various errors introduce other forms of distortion. The homography matrix describes the mapping between two image coordinate systems; it is a 3 × 3 matrix defined only up to a scalar factor. Thus, 8 parameters must be estimated for each homography matrix, i.e. at least 4 matches are needed for an image to be transformed. In addition, Random Sample Consensus (RANSAC) is used to filter the matches, because homography estimation is highly sensitive to noise.
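The count of 8 unknowns and 4 matches can be made concrete with a small direct solver. The sketch below assumes the normalization h33 = 1 (which fails for the rare homographies with h33 = 0) and omits RANSAC; the function names are illustrative and not taken from any library.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def homography_from_4(src, dst):
    """Estimate a 3x3 homography (h33 fixed to 1) from exactly 4 matches."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each match contributes two linear equations in the 8 unknowns.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_h(H, pt):
    """Map a point through H, dividing by the homogeneous coordinate."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (
        (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
        (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
    )
```

In a RANSAC loop, a solver of this kind would be run on random 4-match subsets and the candidate with the most inliers retained.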
Finally, the homography matrices are applied to warp the images. In this phase, all of the remaining images (those meeting the aforementioned threshold η) are transformed into the same image coordinate system, e.g. that of the first image. Linear interpolation is used to handle non-integer pixel indices arising from the spatial discreteness of digital images. After the transformation, the pixel offsets among the image frames are eliminated and an overlapping region emerges. For each image frame, the border outside the overlapping region is trimmed off so that no pixel is missing.

Focus measure
Focus measure plays a dominant role in SFF since it provides the fundamental information for the FIS. A great number of focus measure operators have been proposed. Pertuz [17] summarized these operators by grouping them into 6 families. A noteworthy conclusion is that operators in the Laplacian family have the best overall performance under normal imaging conditions, although no single family performs best under all imaging conditions. Some operators working in the frequency domain also perform well and are widely used in autofocus; nevertheless, the time-consuming Fourier transform is ill-suited to SFF, where the focus measure is computed in local windows.
In this paper, we compare the performance of four spatial-domain focus measure operators.

Energy of Laplacian (EL)
Energy of Laplacian considers the second derivative of the image. It is calculated by summing the response of the image convolved with a Laplacian mask. The sum is taken within a local window to suppress noise, where L(·) denotes the Laplacian transformation, f(·) denotes the pixel intensity of the image, and Ω(·) denotes the neighborhood of a pixel (the same below).
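For reference, the commonly cited form of this operator can be written with the symbols defined above. The squaring of the Laplacian response follows the usual definition implied by the word "energy" and is an assumption here, since the displayed equation is not available.

```latex
EL(x, y) = \sum_{(i, j) \in \Omega(x, y)} \bigl[ L(f(i, j)) \bigr]^{2},
\qquad
L(f) = \frac{\partial^{2} f}{\partial x^{2}} + \frac{\partial^{2} f}{\partial y^{2}}
```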

Sum-modified Laplacian (SML)
The Laplacian operator computes the second partial derivatives of the image with respect to x and y, which can be either positive or negative. Therefore, Energy of Laplacian may give a small response at pixels where the two partial derivatives in orthogonal directions cancel out. In contrast, Sum-modified Laplacian sums, within a window, the response of the image to a modified Laplacian (ML) operator, defined as the sum of the absolute values of the second partial derivatives, where L_m(·) denotes the modified Laplacian transformation.

Tenenbaum gradient (TG)
Tenenbaum [22] proposed a focus measure operator based on the Sobel operator, named the Tenenbaum gradient. The Sobel operator is well known as an edge detector. It has two forms, which produce the first derivative of the image with respect to x and y, respectively. The Tenenbaum gradient is obtained by summing the length of the gradient within a window.

Gray level variance (GLV)
Gray level variance is a statistical operator, based on the assumption that a point on the FIS has the maximal gray level variance. It is likewise calculated in local windows, where µ denotes the mean pixel intensity of the window.
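The three operators described above can be written as follows. This is a reconstruction from the verbal definitions, not the original equations: G_x and G_y (the Sobel responses in x and y) and the normalization by the window size N are notational assumptions. Note that the widely cited Tenengrad form sums the squared gradient magnitude, whereas the text describes summing the gradient length, which is what is shown here.

```latex
\begin{align*}
L_m(f) &= \left| \frac{\partial^{2} f}{\partial x^{2}} \right|
        + \left| \frac{\partial^{2} f}{\partial y^{2}} \right|,
&
SML(x, y) &= \sum_{(i, j) \in \Omega(x, y)} L_m(f(i, j)) \\
TG(x, y) &= \sum_{(i, j) \in \Omega(x, y)} \sqrt{G_x(i, j)^{2} + G_y(i, j)^{2}} \\
GLV(x, y) &= \frac{1}{N} \sum_{(i, j) \in \Omega(x, y)} \bigl[ f(i, j) - \mu \bigr]^{2}
\end{align*}
```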

FIS searching
In principle, the most straightforward way to determine the FIS is to search for the maximum of the focus measure at each pixel [23]. It is fast and simple to implement, but the resolution of the FIS depends on the step by which the object moves relative to the image detector, so the reconstructed surface looks discrete. Many SFF studies interpolate the points near the maximum, where quadratic [24] and Gaussian [25] interpolation are commonly adopted. Interpolation methods aim to smooth the FIS but, like the former approach, can be distorted by a maximum caused by noise. Besides, interpolation methods are time-consuming, although closed-form solutions exist for particular cases. Some studies optimized the FIS using tensor voting [26,27], but that algorithm iterates over every token of the depth cloud and computes eigenvalues, which does not suit the problem at hand. In this paper, we propose a simple yet efficient and robust method for FIS searching using a weighted average of the stereo-blurred focus measure.
Applying any of the aforementioned focus measure operators yields a focus measure value for each pixel of each frame. To suppress abnormal values caused by noise, we blur the original focus measure values with a stereo mean filter.
In the definition of the blurred focus measure, N denotes the number of pixels in the window, k denotes the image frame of the focus measure, and FM_z(x, y) denotes the focus measure at pixel (x, y) for image frame z.
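One plausible form of the stereo mean filter, written with the symbols defined above, is a plain average over a window that extends in x, y and across neighboring frames. The half-depth w of the window along the frame axis and the overbar notation are assumptions of this sketch.

```latex
\overline{FM}_k(x, y) = \frac{1}{N}
\sum_{z = k - w}^{k + w} \;
\sum_{(i, j) \in \Omega(x, y)} FM_z(i, j)
```

Here N counts all pixels in the three-dimensional window, i.e. the spatial window size multiplied by 2w + 1.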
To search the FIS quickly and precisely, the focus measure information should be fully exploited without increasing the computational difficulty. For pixels at the same coordinate in different image frames, the focus measure values are sorted from largest to smallest. The FIS is then calculated by Formula (9), in which ϕ_m denotes the frame numbers of the first m sorted focus measure values, d(·) maps a frame number to the corresponding displacement of the micromotion table, σ_d denotes the standard deviation of d(α), ε is a manually set threshold, and NaN (Not a Number) represents an invalid value. Formula (9) determines the FIS as a weighted average of d, with the blurred focus measure as the weight. The proposed method takes into account the scores of the image frames where the FIS probably lies and yields a combined estimate of its location. A large standard deviation of d(α) implies that the focus measure is unreliable. Assigning invalid values to the FIS at such pixels therefore keeps them out of the subsequent computation. Figure 3 shows the performance of the proposed FIS searching method on a cotton leaf surface. The method is evidently less sensitive to spurious maxima than interpolation methods.
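The per-pixel rule described above can be sketched as follows. This is a reading of the verbal description rather than the authors' code: the default values of `m` and `eps`, and the use of the population standard deviation, are assumptions made for illustration.

```python
import math
import statistics


def fis_at_pixel(fm, d, m=3, eps=10.0):
    """Estimate the FIS depth at one pixel.

    fm  : blurred focus measure of this pixel in each frame
    d   : table displacement of each frame (same length as fm)
    m   : number of top-ranked frames to combine (assumed value)
    eps : threshold on the spread of the selected depths (assumed value)
    """
    # Frame indices sorted by focus measure, largest first; keep the top m.
    order = sorted(range(len(fm)), key=lambda z: fm[z], reverse=True)
    top = order[:m]
    depths = [d[z] for z in top]
    # Widely scattered candidate depths indicate an unreliable pixel.
    if statistics.pstdev(depths) > eps:
        return math.nan
    total = sum(fm[z] for z in top)
    if total == 0:
        return math.nan
    # Weighted average of depth, with the focus measure as the weight.
    return sum(fm[z] * d[z] for z in top) / total
```

For a clean focus curve the top frames are adjacent and the estimate falls between frame positions; when noise produces maxima far apart in depth, the pixel is marked NaN and excluded from later statistics.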
In most applications the SFF procedure ends once the FIS is found, and the subsequent measurement of surface feature parameters is taken for granted. In practice, however, an abnormal bulge emerged in the central region of the reconstructed surface. Drift of the FIS is potentially ubiquitous in SFF because of camera fabrication defects, limited assembly accuracy of the microscope, and differences in sensitivity among sensing units. The effect is usually too slight to notice, but it becomes prominent in microscopic applications. To eliminate the distortion, the primary FIS is replaced by the difference between the primary FIS and a drift field, where the drift field is the strongly blurred FIS of a flat surface. This step does not have to be repeated: measuring the drift field, i.e. system calibration, is done once and for all when the system is set up. Figure 4 demonstrates the correction on a flat aluminum alloy surface with scratches. The apparent abnormal uplift of the FIS is suppressed in the corrected image.

Results and discussion
Surface render comparisons on leaves
Crop plant leaves of various species were observed with both the optical profiler and the SFF system. Figure 5 shows the surface renderings from the two methods, where surface height increases as the color goes from cool to warm tones and the SML operator is used as the focus measure in the SFF renderings. Note that it is hard to image the identical region of interest with both methods because their fields of view differ, so we make a qualitative evaluation based on their overall performance.
Plant leaves ordinarily present fractal structure. As shown in Figure 5 (a, b), the rice leaf surface comprises complex microstructures including parallel wavy veins, hairy trichomes, quasi-1D arranged micropapillae and patchy protuberances. An inherent drawback of SFF is that it performs only moderately on fine details and abrupt changes, because the focus measure is calculated in local windows. Those details are comparatively suppressed in the SFF rendering; nevertheless, some detailed information is preserved. The SFF rendering shows distinctly greater height where the trichomes and protuberances visually appear to be. By comparison, the veins are well defined in the SFF reconstruction, since larger-scale structures are less sensitive to local windows.
The field of view of the optical profiler is evidently much smaller than that of the SFF system. The optical profiler does not render more than one vein per observation, while the SFF system renders several veins thanks to its wider field of view. Since the wavy veins on rice leaves differ considerably in size, the accuracy of the roughness measure from the optical profiler depends largely on which single vein is selected. By contrast, this dependence is weaker for the SFF system. Figure 6 shows the computer renderings of rape, tobacco, cotton and rice leaf surfaces. Dicotyledons are characterized by reticular veins. The leaf surface tends to be flat where the veins are sparse, which is well demonstrated in the renderings. Table 1 shows the areal surface roughness of the leaf surfaces displayed in Figure 6, calculated as the arithmetical mean height.
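The arithmetical mean height is conventionally defined as below (the S_a parameter of areal surface texture standards). The symbols z(x, y) for the height relative to the mean plane and A for the evaluation area are assumed notation, and the discrete form presumes an M × N height map; NaN pixels of the FIS would be left out of the sums.

```latex
S_a = \frac{1}{A} \iint_{A} \lvert z(x, y) \rvert \, dx \, dy
\;\approx\;
\frac{1}{MN} \sum_{i = 1}^{M} \sum_{j = 1}^{N} \lvert z(i, j) - \bar{z} \rvert
```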

Quantitative analysis on gauge blocks
It is difficult to measure an identical micro-surface profile on different instruments. To quantify the performance of the SFF system for each FIS method, experiments were conducted by combining the gauge blocks pairwise along their edges, creating a step at the shared border. Figure 7 shows the diagram of the experimental setup, in which the microscope was aimed at the seam between the blocks. The pairwise combination of gauge blocks yields 10 measurements at different step heights: 10-100 µm at intervals of 10 µm. For each measurement, we took 5 approximately 10,000 samples from the FIS on each of the two step faces and calculated the height of the step. The number of samples depends on the alignment and on the NaN values, since we sampled in columns. The RMSE (Root Mean Square Error) and the Pearson correlation were computed from the ground truth and the reconstructed depth map. A smaller RMSE and a larger correlation indicate better performance. Here f(m, n) and h(m, n) denote the computed and actual step heights of the m-th sample of the n-th measurement, respectively, from which the RMSE and the Pearson correlation r are computed over all samples and measurements.
Figure 8 compares the RMSE and Pearson correlation of our proposed SFF method and the traditional SFF method with different focus measure operators. The proposed SFF method evidently performs better, with a smaller RMSE and a larger correlation. We believe that the distortion correction contributes to the significant improvement in performance. Moreover, system robustness benefits from the introduction of NaN values and the weighted-average design applied to the focus measure.
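Both metrics follow their textbook definitions and can be sketched as below. The flat lists `computed` and `actual` stand for f(m, n) and h(m, n) pooled over all samples and measurements; this flattening and the function names are assumptions of the sketch.

```python
import math


def rmse(computed, actual):
    """Root mean square error between computed and actual step heights."""
    n = len(computed)
    return math.sqrt(sum((f - h) ** 2 for f, h in zip(computed, actual)) / n)


def pearson(computed, actual):
    """Pearson correlation coefficient r between two equal-length sequences."""
    n = len(computed)
    mean_f = sum(computed) / n
    mean_h = sum(actual) / n
    cov = sum((f - mean_f) * (h - mean_h) for f, h in zip(computed, actual))
    var_f = sum((f - mean_f) ** 2 for f in computed)
    var_h = sum((h - mean_h) ** 2 for h in actual)
    return cov / math.sqrt(var_f * var_h)
```

NaN entries of the FIS would need to be filtered out before calling either function, since any NaN propagates through the sums.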
For our proposed SFF method, the SML operator performs best among the operators, with an RMSE below 4.44 µm. The correlations of all four operators are close to 1, indicating that the estimated depth correlates well with the ground truth, though the GLV operator shows a relatively smaller correlation.
The Friedman test and the post-hoc Nemenyi test [28] were adopted to further compare the focus measure operators. The Friedman test is a non-parametric hypothesis test. If r_i is the average rank of the i-th algorithm, k is the number of algorithms, and N is the number of datasets, the test statistic follows an F distribution. Here, the test ranks the performance of the four operators on each dataset and calculates the average rank of each operator. Table 2 shows the rank table of the four operators; unsurprisingly, the SML operator ranks first on most datasets. According to Formulas (13) and (14), the statistic is τ_F = 26.714. At the significance level α = 0.1, the critical value of the F distribution is t_α = 2.490 < τ_F. The null hypothesis is therefore rejected, and we conclude that the performances of the four operators are significantly different.
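The F-distributed Friedman statistic is usually computed in two steps, first a chi-square-type statistic and then its F correction (often attributed to Iman and Davenport). The sketch below uses that standard form with the symbols r_i, k and N from the text; whether it matches Formulas (13) and (14) exactly is an assumption, and the function name is illustrative.

```python
def friedman_f(avg_ranks, n_datasets):
    """F-distributed Friedman statistic from average ranks.

    avg_ranks  : average rank r_i of each of the k algorithms
    n_datasets : number of datasets N
    The result is compared against an F distribution with
    (k - 1) and (k - 1)(N - 1) degrees of freedom.
    """
    k = len(avg_ranks)
    n = n_datasets
    # Chi-square-type Friedman statistic.
    chi2 = 12.0 * n / (k * (k + 1)) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4.0
    )
    # F correction; undefined when every dataset ranks the algorithms
    # identically, because the denominator becomes zero.
    return (n - 1) * chi2 / (n * (k - 1) - chi2)
```

With the four operators and the ten step-height measurements as datasets, the call would take the four average ranks from Table 2 and `n_datasets=10`.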

Abstract
Background: Surface roughness has a significant effect on leaf wettability and consequently influences the efficiency and effectiveness of pesticide spraying. Measuring the surface roughness of plant leaves therefore benefits related research. Present methods for characterizing surface roughness rely on large apparatus, which is generally costly and not portable enough for field measurement. Some of these instruments also have inherent drawbacks, such as the absence of a relation between pixel intensity and the corresponding height in scanning electron microscopy (SEM).
Results: An imaging system with variable object distance was set up to capture images of plant leaves, and a shape from focus (SFF) based method is proposed. The space-variantly blurred images are processed with the proposed algorithm to yield the surface roughness of plant leaves. The algorithm improves on current SFF methods mainly in image alignment, focus distortion correction, and the introduction of NaN values, making it applicable to precise 3D reconstruction and surface roughness measurement at small scale.
Conclusion: Compared with measurement by an optical three-dimensional interference microscope, the proposed method preserves the overall topography of the leaf surface while offering a better cost-performance ratio. Experiments on standard gauge blocks gave an RMSE of the step height of approximately 4.44 µm. Furthermore, the SML focus measure operator performed best according to the Friedman/Nemenyi test.
Keywords: Shape from focus; 3D reconstruction; surface roughness

Background
Improving the efficiency of pesticide use is a classical issue in agricultural engineering owing to concerns about the environment, resources and costs. Notwithstanding the divergent definitions [1], leaf wettability is the manifestation of submicron physicochemical interactions between the leaf surface and the droplet solution [2], i.e. the affinity of leaves for water or pesticide solution. Extensive studies on leaf wettability aim to enhance the adhesion of droplets to the leaf surface and to avert off-target deposition (e.g. rebound, roll-off, sliding) resulting from the adhesive characteristics of the leaf surface. Leaf wettability may differ significantly among species and varieties, and even within a life cycle [3]. Plant leaves are rarely perfectly flat when observed at high resolution. Previous research found that the chemical composition and microstructure of epicuticular wax formations on leaves determine leaf wettability and that, in general, increasing the surface roughness of a hydrophobic surface increases its hydrophobicity [4]. Contact angle is a common metric of leaf wettability; nevertheless, it is of limited benefit for in-depth research on wettability. To quantify the separate factors of leaf wettability, surface roughness is a viable index in place of contact angle, since wettability is a function of roughness. Surface roughness characterization is generally standardized in the manufacturing industry, whereas there is still no widely adopted method for leaves [5].
Early work attempted to model contact angle as a roughness coefficient [6,7] by contrasting the contact angle on a rough surface with that on a smooth surface, so the roughness coefficient was merely a relative and indirect assessment of roughness. Capable of providing high-resolution micrographs of the leaf surface, the scanning electron microscope (SEM) is a viable option for obtaining leaf surface roughness with the help of visual analysis techniques (e.g. fractal dimension analysis [1,8,9], Fourier descriptors [10,11], visual classification [12,13]). However, apart from the cumbersome sample preparation before inspection, visual analysis based on two-dimensional SEM micrographs requires a significant correlation between surface texture and roughness. The atomic force microscope (AFM) can probe the surface profile of leaves with very high accuracy using a stylus [4,14]. AFM, like most stylus methods for characterizing surface roughness in the manufacturing industry, may damage the surface, although non-contact AFM was developed to alleviate the problem at the cost of accuracy. Besides, it accesses only a one-dimensional profile per scan cycle and has an unacceptably narrow Z-range for the leaves of many plant species. The optical profiler, based on optical principles such as phase-shifting interferometry, has been used to render the surface of plant leaves [5,15]. With interferometry, the patterns formed by white-light interference compose optical sections, from which a leaf surface can be reconstructed. The procedure is commonly automatic and comparatively fast, but the most valuable information about the spatial organization of a surface can be lost [11]. Present methods of characterizing leaf surface roughness thus depend on expensive instruments, and each has its own shortcomings.
To address this issue, this study proposes a new approach to measuring leaf surface roughness based on shape from focus (SFF). As a textured rough surface moves relative to a fixed imaging system, the image sharpness varies, so the shape can be recovered from the textured images. In this study, micrograph sequences of leaves, captured by a high-resolution digital camera at various degrees of defocus, are processed by an SFF algorithm to reconstruct the surface topography, from which the surface roughness is derived. The proposed method is notably low-cost compared with previous methods; at the same time, it uses three-dimensional information and is non-contact. To evaluate the performance of the method, results from an optical profiler are given as a reference.

Materials
Steel gauge blocks with precisely customized thicknesses of 1000 µm to 1100 µm (± 0.2 µm) at intervals of 10 µm served as reference standards. The pairwise combination of 4 gauge blocks yielded 10 measurements at different step heights, from 10 µm to 100 µm in steps of 10 µm. The hybrid indica rice (Oryza sativa L.) was grown from seed in an incubator, and its leaves were sampled at the seedling stage. The rape (Brassica chinensis L.) was Zheda 618, bred by Zhejiang University. The cotton (Gossypium hirsutum L.) was upland TM-1. The tobacco (Nicotiana tabacum L.) was benthamiana.
SFF imaging system (Figure 1) built in Agricultural Information Technology Institute of Zhejiang University (Hangzhou, PRC) to capture focused and defocused images. It comprised an optical experimental platform (Fenghua Technology Co., Ltd, Shenzhen, PRC), an industrial RGB camera (Tuoriweiye Technology Co., Ltd, Shenzhen, PRC), a portable microscope (Meijing Electronics Co., Ltd, Shanghai, PRC), and a micromotion objective table (Hengyang Electronic Technology Co., Ltd, Guangzhou, PRC). The resolution of the RGB camera was 3072 pixels × 2048 pixels. The portable microscope was capable of magnification of ×10, functioning as a camera lens. The micromotion objective table shifted leaf vertically in motion range of 12.5 mm and accuracy of 0.5 µm, screwed on the optical experimental platform. Manual adjustment to ratchet knob would lift the table in precision to create defocus. The camera was mounted on the portable microscope, upside down towards the table and images were transmitted to computer via Ethernet.
Wyko NT9100 (Veeco, NY, US) optical profiler utilized to measure of leaf roughness as a comparison. This instrument had a vertical scanning range of 0.1-10 nm, skilled in measuring 3d surface topography. The measurement was made in College of Optical Science and Engineering, Zhejiang University (Hangzhou, PRC), where the optical profiler was located. Immediately the defocused images were captured, leaves were move to optical profiler for observation. Benefited from the supporting software, we could facilely reconstruct the three-dimensional structure and obtain the surface roughness.
Shape from focus Shape from focus, an identical name of depth from focus (DFF), is hardly a spannew notion, but a classic involved problem in computer vision. SFF is widely studied as a passive method of estimating depth or shape from monocular focal cues [16,17,18,19].
With a setting of focus, difference in blurriness for objects along z-axis can be observed. It is one of the most obvious cues for human observers to understand depth in a two-dimensional image. As to the microscopic level, the radiation of each point on the object spread onto a fuzzy circle, which is usually described by point spread functions (PSFs), when projecting to the sensor plane ( Figure 2). In particular, if the focus locates on the sensor plane, the diameter of the fuzzy circle is infinitely close to zero. In this case, the projection of point M is a point since the sensor plane and the focal plane are coincident. For a constant focal length f and the fixed location of lens and sensor plane, with the objects moving away from the lens, the focus of the point M moves towards the lens and a fuzzy circle with a diameter of c is synchronously detected on the sensor plane.
Some methods, known as depth from defocus (DFD) or shape from defocus (SFD) methods, aim to compute the distance between the defocused object points and the sensor plane by estimating the diameter of fuzzy circles. These methods are fast 1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63  64  65 but inaccurate, with necessity of prior known intrinsic and extrinsic parameters as well. According to SFF theories, a focused image surface (FIS) is defined as the surface formed by a set of points at which the object points are focused by the lens [20]. It is determined in basis of sharpness at each pixel in a sequence of images with continuously varied focus level, where a focus measure function or operator is applied to compute the sharpness. Generally, the FIS comprises pixels where the corresponding frame gives the maximal sharpness among the images, in spite of various optimized techniques such as quadratic or Gaussian interpolation. With the FIS available, we can retrieve all the corresponding points on the object surface.

Image registration
Image registration aims to align two or more images of the same scene by warping and overlapping them geometrically. Most SFF studies ignore the differences in field of view that arise while the focus level is adjusted, since image registration incurs a considerable computational cost. However, we propose to correct the pixel offset, taking into account the trade-off between the accuracy requirement and the computational complexity in the roughness measurement of leaf surfaces. The process of image registration is described as follows.
First, features are detected using Speeded-Up Robust Features (SURF) [21]. SURF is extensively applied in image registration because of its speed and robustness. The strategies based on the Gaussian pyramid and the Haar wavelet make the SURF descriptor scale- and rotation-invariant. SURF performs well in a wide range of image registration tasks compared with other present-day methods. Thus, we propose to use SURF as the feature detector in this scheme. For each image, the SURF algorithm yields dozens of features expressed as 128-dimensional vectors.
Second, features are matched between pairs of images based on their correspondence. The correspondence is a function of two features that characterizes their spatial relationship. As a frequently used distance, the Euclidean distance is adopted in our scheme. By calculating the distance between every pair of features in neighboring frames, we obtain the brute-force matches. The number of matches indicates the clarity of the images to some extent. Images with few matches are largely defocused and contribute little to SFF. Therefore, to simplify the subsequent computation, the images without enough matches are removed from the dataset, where a threshold η is set according to the texture richness of the leaf surface.
Then, homography matrices are estimated from the matches. Theoretically, the distortion of images resulting from the adjustment of focus level is nothing but scaling, but in practice miscellaneous errors may introduce other forms of distortion. The homography matrix is a 3 × 3 matrix that denotes the mapping between two image coordinate systems and is defined only up to a scalar factor. Thus, we have 8 parameters to estimate for each homography matrix; since each match provides two equations, at least 4 matches are needed for an image to be transformed. In addition, Random Sample Consensus (RANSAC) is used to filter the matches, given the high sensitivity of homography estimation to noise.
Finally, the homography matrices are applied to warp the images. In this phase, all the remaining images (those meeting the aforementioned threshold η) are transformed to the same image coordinate system, e.g. that of the first image. Linear interpolation is used to handle the non-integer pixel indices that arise from the spatial discreteness of digital images. After the transformation, the pixel offsets among the image frames are eliminated so that an overlapping region emerges. For each image frame, the margin outside the overlapping region is trimmed off, ensuring that no pixel is absent.
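The matching and homography steps above can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work: SURF detection and RANSAC filtering are omitted, the homography is estimated with a plain direct linear transform, and the function names and the `max_dist` acceptance distance are hypothetical (the text only specifies the Euclidean distance and the threshold η on the number of matches).

```python
import numpy as np

def brute_force_match(desc_a, desc_b):
    """Nearest neighbour in desc_b (Euclidean distance) for every row of desc_a."""
    diff = desc_a[:, None, :] - desc_b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    nearest = dist.argmin(axis=1)
    return nearest, dist[np.arange(len(desc_a)), nearest]

def keep_frame(desc_a, desc_b, max_dist, eta):
    """Keep a frame only if it has at least eta matches closer than max_dist.

    max_dist is a hypothetical acceptance distance, not taken from the paper.
    """
    _, d = brute_force_match(desc_a, desc_b)
    return int((d < max_dist).sum()) >= eta

def estimate_homography(src, dst):
    """Direct linear transform from n >= 4 matched points, each of shape (n, 2)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each match contributes two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)      # null vector of the system, defined up to scale
    return h / h[2, 2]            # fix the scale (assumes h[2, 2] is not zero)

def apply_homography(h, pts):
    """Map (n, 2) points through H using homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ h.T
    return mapped[:, :2] / mapped[:, 2:3]
```

With exactly four matches the linear system has a one-dimensional null space, which is why four matches are the minimum; with more matches the same code returns a least-squares estimate.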

Focus measure
Focus measure plays a dominant role in SFF since it produces the fundamental information for the FIS. A great number of focus measure operators have been proposed in previous research. Pertuz [17] summarized these operators by grouping them into 6 families. A noteworthy conclusion is that operators in the Laplacian family have the best overall performance under normal imaging conditions, although it is difficult to determine a family that performs best under all imaging conditions. Some operators working in the frequency domain also perform well and are widely used in autofocus; nevertheless, the time-consuming Fourier transform is unfavorable for SFF problems, since the focus measure is determined in local windows.
In this paper, we compared the performance of four spatial focus measure operators.

Energy of Laplacian (EL)
Energy of Laplacian considers the second derivative of the image. It is calculated by summing the pixel intensities of the image convolved with a Laplacian mask, where the sum is taken within a local window for denoising purposes. In the definitions of this section, L(·) denotes the Laplacian transformation, f(·) denotes the pixel intensity of the image, and Ω(·) denotes the adjacent region (the same below).

Sum-modified Laplacian (SML)
The Laplacian operator calculates the second partial derivatives of the image with respect to x and y, which can be either positive or negative. Therefore, Energy of Laplacian may give a small response at pixels where the two partial derivatives in orthogonal directions cancel out. In contrast, Sum-modified Laplacian sums, within a window, the response of the image convolved with a modified Laplacian (ML) operator, which is defined as the sum of the absolute values of the second partial derivatives. Here L_m(·) denotes the modified Laplacian transformation.

Tenenbaum gradient (TG)
Tenenbaum [22] proposed a focus measure operator based on the Sobel operator, which is named Tenenbaum gradient. The Sobel operator is well known as an edge detector. It has two forms, which produce the first derivative of the image with respect to x and y, respectively. Tenenbaum gradient is obtained by summing the length of the gradient within a window.

Gray level variance (GLV)
Gray level variance is a statistical operator, based on the assumption that a point on the FIS has the maximal gray level variance. It is likewise calculated in local windows, with µ denoting the mean pixel intensity of the window.
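The four operators can be sketched in NumPy as follows. This is an illustrative reading of the descriptions above rather than the exact formulas of this work: EL is assumed to sum the squared Laplacian response, TG the squared Sobel gradient magnitude (the common Tenengrad form), and GLV is computed as the window variance. The 3 × 3 kernels, the window radius `r` and all function names are assumptions.

```python
import numpy as np

def conv3(img, k):
    """3x3 correlation with edge replication (kernels used here are symmetric up to sign)."""
    p = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    return sum(k[i][j] * p[i:i + h, j:j + w] for i in range(3) for j in range(3))

def window_sum(a, r):
    """Sum over the (2r+1) x (2r+1) neighbourhood Omega(x, y) of every pixel."""
    p = np.pad(a, r, mode="edge")
    h, w = a.shape
    k = 2 * r + 1
    return sum(p[i:i + h, j:j + w] for i in range(k) for j in range(k))

LAP = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]        # Laplacian mask
DXX = [[0, 0, 0], [1, -2, 1], [0, 0, 0]]        # second derivative in x
DYY = [[0, 1, 0], [0, -2, 0], [0, 1, 0]]        # second derivative in y
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def energy_of_laplacian(img, r=2):
    # Assumption: "energy" is read as the squared Laplacian response.
    return window_sum(conv3(img, LAP) ** 2, r)

def sum_modified_laplacian(img, r=2):
    # Absolute values keep the x and y second derivatives from cancelling out.
    ml = np.abs(conv3(img, DXX)) + np.abs(conv3(img, DYY))
    return window_sum(ml, r)

def tenenbaum_gradient(img, r=2):
    # Assumption: squared gradient magnitude, as in the usual Tenengrad measure.
    return window_sum(conv3(img, SOBEL_X) ** 2 + conv3(img, SOBEL_Y) ** 2, r)

def gray_level_variance(img, r=2):
    # Window variance computed as E[f^2] - mu^2, where mu is the window mean.
    f = img.astype(float)
    n = (2 * r + 1) ** 2
    mu = window_sum(f, r) / n
    return window_sum(f ** 2, r) / n - mu ** 2
```

Each function returns a map with the same shape as the input frame, so applying one of them to every registered frame gives the focus measure stack from which the FIS is searched.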

FIS searching
In principle, the most straightforward way to determine the FIS is to search for the maximum of the focus measure at each pixel [23]. It is fast and simple to implement, but the resolution of the FIS depends on the movement interval of the object with respect to the image detector, i.e. the reconstructed surface looks discrete. Many SFF studies instead interpolate the points near the maximum, with quadratic [24] and Gaussian [25] interpolation being widely adopted. Interpolation methods aim at smoothing the FIS but, like the former approach, can be distorted by a maximum caused by noise. Besides, interpolation methods are time-consuming, though closed-form solutions exist for particular situations. Some studies optimized the FIS using tensor voting [26,27], but the algorithm iterates over every token of the depth cloud and computes eigenvalues, which does not suit the problem at hand. In this paper, we propose a simple but efficient and robust method for FIS searching using a weighted average of the stereo-blurred focus measure.
After applying any of the aforementioned focus measure operators according to our workflow, we obtain the focus measure values for each pixel. Considering the potentially abnormal values caused by noise, we blur the original focus measure values with a stereo mean filter.
To search the FIS quickly and precisely, we should make full use of the focus measure information without increasing the computational difficulty. For pixels at the same pixel coordinate in different image frames, the focus measure values are sorted from largest to smallest. Then, the FIS is calculated by Formula (9), in which ϕ_m denotes the frame numbers of the first m sorted focus measure values, d(·) maps a frame number to the corresponding displacement of the micromotion table, σ_d denotes the standard deviation of d(α), ε is a manually set threshold, and NaN (Not a Number) represents an invalid value. Formula (9) indicates that the FIS is determined by the weighted average of d, where the blurred focus measure serves as the weight. The proposed method considers the scores of the image frames where the FIS probably lies and yields a comprehensively predicted location. A large standard deviation of d(α) implies that the focus measure is unreliable. Thus, assigning invalid values to the FIS at these pixels prevents them from being involved in subsequent computation. Figure 3 shows the performance of the proposed FIS searching method on a cotton leaf surface. It is evident that this method is less sensitive to a spurious maximum than interpolation methods.
In most applications, the SFF procedure ends once the FIS is found, and the subsequent measurement of surface feature parameters is taken for granted. In practice, we observed an abnormal bulge emerging in the central region of the reconstructed surface. Drift of the FIS is potentially ubiquitous in SFF because of defects in camera fabrication, the assembly accuracy of the microscope and discrepancies in the sensitivity of the sensing units. The effect is usually too slight to attract attention, but it becomes prominent in microscopic applications. To eliminate the distortion, the primary FIS is replaced by the difference between the primary FIS and a drift field, where the drift field is the strongly blurred FIS of a flat surface. This work need not be repeated, since the measurement of the drift field, i.e. system calibration, is done once and for all when the system is established. Figure 4 demonstrates the effective correction for a flat aluminum alloy surface with scratches. The apparently abnormal uplift of the FIS is suppressed in the undistorted image.
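The weighted-average search described above can be sketched as follows. The sketch assumes that the "stereo mean filter" is a three-dimensional box filter over the row, column and frame axes, that the weights are the blurred focus measure values of the m best frames, and that a pixel is invalidated when the standard deviation of the selected displacements exceeds ε. The function names and the default values of `m`, `eps` and the filter radius are hypothetical.

```python
import numpy as np

def box_blur_3d(fm, r=1):
    """Mean filter over a (2r+1)^3 neighbourhood of a (frames, rows, cols) stack."""
    p = np.pad(fm.astype(float), r, mode="edge")
    n, h, w = fm.shape
    k = 2 * r + 1
    total = sum(p[a:a + n, b:b + h, c:c + w]
                for a in range(k) for b in range(k) for c in range(k))
    return total / k ** 3

def search_fis(fm, d, m=3, eps=15.0):
    """Weighted-average FIS search.

    fm : (frames, rows, cols) focus measure stack
    d  : displacement of the micromotion table for every frame
    """
    blurred = box_blur_3d(fm)
    top = np.argsort(-blurred, axis=0)[:m]          # frame numbers phi_m, per pixel
    weights = np.take_along_axis(blurred, top, axis=0)
    depth = np.asarray(d, dtype=float)[top]         # d(alpha) for alpha in phi_m
    # Weighted average of d; a pixel whose weights are all zero would give NaN.
    fis = (weights * depth).sum(axis=0) / weights.sum(axis=0)
    fis[depth.std(axis=0) > eps] = np.nan           # unreliable focus measure
    return fis
```

Because the result averages several neighbouring frames, it is not restricted to the discrete stage positions, and a single noisy maximum is outweighed by the other selected frames.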
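The calibration step can be sketched as below, assuming that "strongly blurred" means a large mean filter and that NaN pixels of the flat-surface FIS are skipped during blurring. The function names and the window radius are hypothetical; the text does not specify the blur kernel.

```python
import numpy as np

def estimate_drift_field(flat_fis, r=8):
    """Blur the FIS of a flat reference surface with a NaN-aware mean filter."""
    valid = ~np.isnan(flat_fis)
    vals = np.pad(np.where(valid, flat_fis, 0.0), r, mode="edge")
    cnt = np.pad(valid.astype(float), r, mode="edge")
    h, w = flat_fis.shape
    k = 2 * r + 1
    s = sum(vals[i:i + h, j:j + w] for i in range(k) for j in range(k))
    c = sum(cnt[i:i + h, j:j + w] for i in range(k) for j in range(k))
    return s / np.maximum(c, 1.0)   # mean of the valid pixels in each window

def correct_fis(primary_fis, drift_field):
    """Replace the primary FIS by its difference from the drift field."""
    return primary_fis - drift_field
```

The drift field only has to be computed once per system and can then be subtracted from every subsequent reconstruction.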

Results and discussion
Surface render comparisons on leaves
Leaves of crop plants of various species were observed with both the optical profiler and the SFF system. Figure 5 shows the surface renderings of the two methods, where the surface height increases as the color ranges from cool to warm tones, and the SML operator is chosen as the focus measure in the SFF renderings. It should be noted that it is hard to image an identical region of interest with the two methods because of the difference in field of view, so we make a qualitative evaluation based on their overall performance.
Plant leaves ordinarily present fractal structure. As shown in Figure 5 (a, b), the rice leaf surface comprises complex microstructures including parallel wavy veins, hairy trichomes, quasi-1D arranged micropapillae and patchy protuberances. An inherent drawback of SFF is that it performs mediocrely on details and abrupt changes, mainly because the focus measure is calculated in local windows. Those details are comparatively suppressed in the SFF rendering; nevertheless, it still preserves some detailed information. The SFF rendering shows a distinctly higher altitude where the trichomes and protuberances are visually expected to be. By comparison, the veins are well defined in the SFF reconstruction, since larger-scale structures are less sensitive to local windows.
It is evident that the field of view of the optical profiler is much smaller than that of the SFF system. The optical profiler does not render more than one vein per observation, while the SFF system renders several veins, taking advantage of a wider field of view. Since the wavy veins on rice leaves differ considerably in size, the accuracy of the roughness measurement with the optical profiler depends largely on which single vein is selected. By contrast, this dependence is weaker for the SFF system.
Figure 6 shows the computer renderings of rape, tobacco, cotton and rice leaf surfaces. Dicotyledons are characterized by reticular veins. The leaf surface tends to be flat where the veins are sparse, which is well demonstrated in the renderings. Table 1 shows the areal surface roughness of the plant leaf surfaces displayed in Figure 6, where the areal surface roughness based on the arithmetical mean height is calculated by:
\[ S_a = \frac{1}{A} \iint_A \left| z(x, y) \right| \, dx \, dy \tag{10} \]
where z(x, y) is the surface height relative to the mean plane and A is the evaluated area.
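On a discrete FIS, the arithmetical mean height reduces to the mean absolute deviation of the valid pixels. The sketch below assumes uniform pixel spacing, skips NaN pixels, and takes the mean height as the reference without any further levelling or form removal; the function name is hypothetical.

```python
import numpy as np

def areal_roughness_sa(fis):
    """Arithmetical mean height of a height map, ignoring NaN pixels."""
    z = fis - np.nanmean(fis)          # heights relative to the mean level
    return float(np.nanmean(np.abs(z)))
```

For a tilted sample, a plane fit would normally be subtracted first; that step is left out here because the text does not describe it.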

Quantitative analysis on gauge blocks
It is a knotty problem to measure an identical micro-surface profile on different instruments. To quantify the performance of the SFF system for each FIS method, experiments were conducted by combining gauge blocks pairwise along their edges, creating a step at the shared border. Figure 7 shows the diagram of the experimental setup, where the microscope is aimed at the seam between the blocks. The pairwise combination of gauge blocks yields 10 measurements at different step heights: 10-100 µm at intervals of 10 µm. For each measurement, we took 5 approximately 10,000 samples from the FIS on each of the two step faces and calculated the height of the step. The number of samples depends on the alignment and on the NaN values, since we sampled in columns. The RMSE (Root Mean Square Error) and the Pearson correlation were computed from the ground truth and the reconstructed depth map. A smaller RMSE and a larger correlation indicate better performance. If f(m, n) and h(m, n) are the computed and actual step heights of the m-th sample of the n-th measurement, respectively, the RMSE and the Pearson correlation r are given by
\[ \mathrm{RMSE} = \sqrt{\frac{1}{K} \sum_{n} \sum_{m} \bigl( f(m, n) - h(m, n) \bigr)^{2}} \tag{11} \]
\[ r = \frac{\sum_{n} \sum_{m} \bigl( f(m, n) - \bar{f} \bigr) \bigl( h(m, n) - \bar{h} \bigr)}{\sqrt{\sum_{n} \sum_{m} \bigl( f(m, n) - \bar{f} \bigr)^{2}} \, \sqrt{\sum_{n} \sum_{m} \bigl( h(m, n) - \bar{h} \bigr)^{2}}} \tag{12} \]
where K is the total number of samples, and \bar{f} and \bar{h} are the means of f and h over all samples.
Figure 8 shows the comparison of RMSE and Pearson correlation between our proposed SFF method and the traditional SFF method with different focus measure operators. It is evident that the proposed SFF method performs better, with a smaller RMSE and a larger correlation. We believe that the correction of the distortion contributes to the significant improvement in performance. Moreover, the robustness of the system benefits from the introduction of NaN values and the weighted-average design in the FIS search.
For our proposed SFF method, the SML operator shows outstanding performance compared with the other operators, with an RMSE lower than 4.44 µm. The correlations of the four operators are close to 1, indicating that the estimated depth is well correlated with the ground truth, though the GLV operator shows a relatively smaller correlation.
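Both metrics can be computed directly from paired arrays of computed and actual step heights. The sketch below pools all samples of all measurements into one vector, which is an assumption about how the samples are aggregated; the function name is hypothetical.

```python
import numpy as np

def rmse_and_pearson(f, h):
    """RMSE and Pearson correlation between computed (f) and actual (h) step heights."""
    f = np.asarray(f, dtype=float).ravel()
    h = np.asarray(h, dtype=float).ravel()
    rmse = np.sqrt(np.mean((f - h) ** 2))
    fc, hc = f - f.mean(), h - h.mean()            # centre both series
    r = (fc * hc).sum() / np.sqrt((fc ** 2).sum() * (hc ** 2).sum())
    return float(rmse), float(r)
```

A constant offset between f and h raises the RMSE but leaves r at 1, which is why the two metrics are reported together.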
The Friedman test and the post-hoc Nemenyi test [28] were adopted for a further comparative evaluation of the focus measure operators. The Friedman test is a non-parametric hypothesis test. If r_i is the average rank of the i-th algorithm, k is the number of algorithms, and N is the number of datasets, the statistic
\[ \tau_F = \frac{(N - 1) \, \tau_{\chi^2}}{N (k - 1) - \tau_{\chi^2}} \tag{13} \]
is subject to the F distribution with k − 1 and (k − 1)(N − 1) degrees of freedom, where
\[ \tau_{\chi^2} = \frac{12 N}{k (k + 1)} \left( \sum_{i=1}^{k} r_i^{2} - \frac{k (k + 1)^{2}}{4} \right) \tag{14} \]
Here, the test ranks the performance of the four operators on each dataset and calculates the average rank of each operator. Table 2 shows the rank table of the four operators, and unsurprisingly, the SML operator ranks first on most of the datasets. According to Formulas (13) and (14), the statistic is τ_F = 26.714. Taking the significance level α = 0.1, the critical value of the F distribution is t_α = 2.490 < τ_F. Hence the null hypothesis is rejected, and we can assert that the performances of the four operators are significantly different.
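As a worked illustration, the F-distributed Friedman statistic can be computed from a rank table as sketched below, assuming the commonly used form in which a chi-square-type statistic is first computed from the average ranks. The function name is hypothetical, and the rank table in the usage note is made up for illustration; it is not Table 2.

```python
import numpy as np

def friedman_tau_f(ranks):
    """Friedman statistics from a rank table of shape (N datasets, k algorithms)."""
    ranks = np.asarray(ranks, dtype=float)
    n, k = ranks.shape
    r = ranks.mean(axis=0)                               # average rank r_i
    tau_chi2 = 12.0 * n / (k * (k + 1)) * ((r ** 2).sum() - k * (k + 1) ** 2 / 4.0)
    # Undefined when every dataset gives the same ranking (denominator is zero).
    tau_f = (n - 1) * tau_chi2 / (n * (k - 1) - tau_chi2)
    return float(tau_chi2), float(tau_f)
```

For example, `friedman_tau_f([[1, 2, 3], [1, 2, 3], [1, 3, 2], [2, 1, 3]])` gives average ranks of 1.25, 2.0 and 2.75, a chi-square-type statistic of 4.5 and an F-type statistic of 13.5 / 3.5 ≈ 3.857, which would then be compared with the critical value of the F distribution.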