Plot size effects on airborne LiDAR-derived metrics and predicted model performance of subtropical planted forest stand attributes


Background

Field plot measurement is an essential task for forest inventory and monitoring and ecological applications based on airborne LiDAR. To optimize the field plot size and reduce cost, it is necessary to investigate the influence of field plot size on LiDAR-derived metrics and the accuracy of forest parameter estimation models.
Methods

A subtropical planted forest with an area of 4,770 ha was used as the study site. Each of 104 square plots of 900 m2 (30 m×30 m), subdivided into nine quadrats of 100 m2 (10 m×10 m), was divided into field plots of six different areas (100 m2, 200 m2, 300 m2, 400 m2, 600 m2 and 900 m2) by grouping quadrats. The differences in the LiDAR-derived metrics and stand attributes among plots of different sizes were investigated for four forest types (Chinese fir, pine, eucalyptus and broadleaf). Using multivariate power models with stable structures, the estimation accuracies of forest parameters (basal area, BA, and stand volume, VOL) were compared among plot sizes.
Results

(1) The mean differences in LiDAR-derived metrics related to height, density and vertical structure between the plots with different sizes and the 900 m2 plot containing all forest types were very small, and when the plot size changed, these differences changed irregularly; however, the standard deviations of the differences increased rapidly with decreasing plot size. (2) There were significant differences in the mean of the maximal height of the point cloud (Hmax), density of the 75th percentile of the point cloud (dh75) and mean leaf area density (LADmean) (except for Chinese fir and eucalyptus) between the plots with different sizes and the 900 m2 plot containing all forest types; other LiDAR-derived metrics had significant differences in only some or a certain size of plots, but there was no regularity. (3) Except for the maximal tree height of the plot (Hm), the forest stand attributes, including the mean tree height (H), diameter at breast height (DBH), basal area (BA), and stand volume (VOL), of all forest types showed either no significant differences or minimal differences between plots with different sizes and the 900 m2 plot. (4) With increasing plot size, the coefficient of determination (R2) of the estimation models for VOL and BA of all forest types increased gradually, while the relative root mean square error (rRMSE) and mean prediction error (MPE) decreased gradually, and the estimation accuracy of the models improved.
Conclusion

Due to the heterogeneity of the vertical and horizontal forest structures, some LiDAR-derived metrics and stand parameters for field plots with different sizes varied. As the plot size increased, the variations in the independent variables (LiDAR-derived metrics) and dependent variables (stand parameters) of the estimation models decreased gradually. These changes improved the robustness and accuracy of the models. In the application of airborne LiDAR in forest inventory and monitoring, both prediction accuracy and cost should be considered. For subtropical planted forests, we preliminarily suggest the following appropriate sizes for field plots: 900 m2 for Chinese fir and pine forests, 400 m2 for eucalyptus forests and 600 m2 for broadleaf forests. However, this protocol still needs to be tested in further studies.


According to the abovementioned combinations, we obtained four datasets; each forest type contained 22-29 field plots with different areas. The stand parameters of each field plot were calculated from the field data as follows. For each field plot with a given area (such as 400 m2), BA and VOL were the sums of the corresponding values in the quadrats (e.g., for protocol 1 (Fig. 1 and Table 2), the 400 m2 field plot contained quadrats 1, 2, 5 and 6; for protocol 2, the 400 m2 field plot contained quadrats 1-6 (Table 2)). DBH and H were BA-weighted averages of the corresponding values of the quadrats included in the plot, and Hm was the maximal height in all quadrats included.
[...] error of the laser point cloud height was less than 0.15 m. In the LiDAR data preprocessing, the point clouds were labeled as ground returns and nonground returns. The ground returns were used to generate the digital elevation model (DEM) at a pixel size of 2 m × 2 m using a triangulated irregular network (TIN) interpolation algorithm, and the nonground returns were used to generate the digital surface model (DSM). Using the DEM, the influence of topography was removed, and the DEM-normalized vegetation point cloud data were obtained.
According to the coordinates of the four corners of the 900 m2 field plot, we extracted the normalized vegetation point cloud data within each smaller field plot to calculate LiDAR-derived metrics, e.g., height and density statistical characteristics of the laser point cloud data and the mean leaf area density of the stand canopy and its coefficient of variation (CV) (Bouvier et al., 2015).
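The aggregation of quadrat values into plot-level stand attributes described earlier in this section can be sketched as follows. This is a minimal illustration, not the authors' code: the `Quadrat` fields, the function name and the numeric values are hypothetical, and the sketch assumes that DBH and H are averaged with quadrat basal area as the weight.

```python
from dataclasses import dataclass


@dataclass
class Quadrat:
    """Stand attributes of one 10 m x 10 m quadrat (field names are illustrative)."""
    ba: float   # basal area of the quadrat
    vol: float  # stem volume of the quadrat
    dbh: float  # mean diameter at breast height
    h: float    # mean tree height
    hm: float   # maximal tree height


def aggregate_plot(quadrats):
    """Combine the quadrats of one field plot into plot-level attributes."""
    ba = sum(q.ba for q in quadrats)
    vol = sum(q.vol for q in quadrats)
    # Assumption: DBH and H are averaged with quadrat basal area as the weight.
    dbh = sum(q.dbh * q.ba for q in quadrats) / ba
    h = sum(q.h * q.ba for q in quadrats) / ba
    hm = max(q.hm for q in quadrats)
    return {"BA": ba, "VOL": vol, "DBH": dbh, "H": h, "Hm": hm}


# Hypothetical 400 m2 plot built from four quadrats (all values are made up).
plot = aggregate_plot([
    Quadrat(ba=0.30, vol=2.0, dbh=16.0, h=12.0, hm=15.0),
    Quadrat(ba=0.10, vol=0.6, dbh=12.0, h=10.0, hm=13.5),
    Quadrat(ba=0.20, vol=1.4, dbh=14.0, h=11.0, hm=16.2),
    Quadrat(ba=0.40, vol=3.0, dbh=18.0, h=13.0, hm=15.8),
])
```

Because BA and VOL are additive while DBH and H are not, only the latter two need a weighting rule; a different weight (for example, stem count) would require changing two lines.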
Some researchers assert that the first LiDAR echoes represent the key part of the reflected signal; compared to other echoes, the first echo yields metrics that can fully satisfy the need to estimate biomass and may produce a higher estimation accuracy (Singh et al., 2016; Chen et al., 2012; Kim et al., 2016). However, this study extracted LiDAR-derived metrics from all laser echoes.
By employing the interpolation method, we obtained the coordinates of the four corners of the quadrats in each field plot. According to the quadrats contained in the six field plots with different areas corresponding to each scheme, we calculated the LiDAR-derived metrics of plots with different sizes using the same method as that utilized to calculate the metrics of the 900 m2 field plot.

Comparative analysis of plot size effects
To evaluate the effects of plot size on LiDAR-derived metrics, two-tailed paired sample t-tests were employed to compare the means of LiDAR-derived metrics between the smaller plots (100 m2, 200 m2, 300 m2, 400 m2 and 600 m2) and the 900 m2 plot for all datasets and all forest types. These metrics included mean point cloud height (Hmean); 25th, 50th and 75th percentile heights (hp25, hp50 and hp75); maximum height (Hmax); CV of point cloud height (Hcv); canopy cover (CC); 25th, 50th and 75th percentile densities (dh25, dh50 and dh75); and the mean of leaf area density (LADmean) and its CV (LADcv). Then, the number of significant differences for each metric across the four datasets was counted.
By employing a method similar to that described above, we compared the means of the stand attributes (DBH, H, Hm, BA and VOL) between plots of different sizes and the 900 m2 plot for all four datasets and all forest types.
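To make the height statistics listed above concrete, the following sketch computes Hmean, Hmax, Hcv, the percentile heights and a simple canopy cover value from normalized return heights. It is an illustration under stated assumptions rather than the processing chain used in this study: the 2 m cut-off, the linear-interpolation percentile rule, the definition of CC as the share of returns at or above the cut-off and all function names are assumptions. The density percentiles and the LAD metrics are not included because their definitions are not given here.

```python
import math


def percentile(sorted_values, p):
    """Percentile with linear interpolation between order statistics (0 <= p <= 100)."""
    k = (len(sorted_values) - 1) * p / 100.0
    lo = math.floor(k)
    hi = math.ceil(k)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)


def height_metrics(heights, min_height=2.0):
    """Height statistics of normalized returns at or above `min_height` (m).

    The 2 m cut-off is an assumption used to exclude ground and shrub returns.
    """
    veg = sorted(h for h in heights if h >= min_height)
    n = len(veg)
    mean = sum(veg) / n
    sd = math.sqrt(sum((h - mean) ** 2 for h in veg) / (n - 1))
    return {
        "Hmean": mean,
        "Hmax": veg[-1],
        "Hcv": sd / mean,
        "hp25": percentile(veg, 25),
        "hp50": percentile(veg, 50),
        "hp75": percentile(veg, 75),
        # Assumption: canopy cover as the share of all returns at or above the cut-off.
        "CC": n / len(heights),
    }


# Made-up normalized heights (m) of eight returns inside one plot.
metrics = height_metrics([0.1, 0.3, 4.0, 6.0, 8.0, 10.0, 12.0, 0.5])
```

Running the same function on the returns clipped to each plot size yields the paired values that the t-tests compare.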
To assess the effect of plot size on the performance of the stand attribute estimation models, we built VOL and BA estimation models for all forest types by using the LiDAR-derived metrics Hmean, CC, LADcv, Hcv and dh50. The structural formula is shown as follows:
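The following is only an illustrative sketch of a multivariate power model of the kind named in the abstract, not the authors' equation. It assumes the generic multiplicative form y = a0 · x1^a1 · ... · xk^ak and fits it by ordinary least squares after a log transform; the functional form, the fitting method, the function names and the synthetic data are all assumptions.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_power_model(xs, y):
    """Fit y = a0 * x1**a1 * ... * xk**ak by least squares on the log scale."""
    rows = [[1.0] + [math.log(v) for v in x] for x in xs]
    logy = [math.log(v) for v in y]
    k = len(rows[0])
    # Normal equations (X'X) beta = X'y for the log-linear model.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * ly for r, ly in zip(rows, logy)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return [math.exp(beta[0])] + beta[1:]


def predict(coef, x):
    """Evaluate the fitted power model for one vector of predictors."""
    out = coef[0]
    for a, v in zip(coef[1:], x):
        out *= v ** a
    return out


# Synthetic check: data generated from y = 2 * x1**0.5 * x2**1.5.
xs = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
y = [2.0 * a ** 0.5 * b ** 1.5 for a, b in xs]
coef = fit_power_model(xs, y)
```

Fitting on the log scale keeps the example dependency-free, but it minimizes relative rather than absolute error; a nonlinear least-squares fit on the original scale would generally give slightly different coefficients.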

Effects of plot size on LiDAR-derived metrics

Height metrics
Among the four datasets, the mean differences in the LiDAR-derived height metrics (hp25, hp50, hp75, Hmean, Hmax and Hcv) between the plots with different sizes (600 m2, 400 m2, 300 m2, 200 m2 and 100 m2) and the 900 m2 plot were very small for all forest types, and the standard deviations of the differences were approximately one order of magnitude larger than the mean differences. As the plot size decreased, the mean differences varied irregularly, whereas the standard deviation of the differences tended to increase rapidly. Fig. 3a shows the change in the mean and the standard deviation of the difference in Hmean for the Chinese fir forest between plots with different sizes and the 900 m2 plot.

Fig. 3 (d) LADcv of broadleaf forest
Paired t-tests were performed on the mean differences in the six LiDAR-derived height metrics between plots with different sizes (600 vs. 900 m2, 400 vs. 900 m2, 300 vs. 900 m2, 200 vs. 900 m2 and 100 vs. 900 m2) in each dataset. There were four datasets; thus, four tests were performed for each comparison. Then, we counted the number of significant differences for these six metrics. The results (Table 3) are described as follows. 1) For all forest types, the number of significant differences in the Hmax means between plots with different sizes and the 900 m2 plot was 4, which implied that the mean Hmax of plots of every smaller size was significantly different (α=0.05) from that of the 900 m2 plot. 2) For the remaining five height metrics, the maximum number of significant differences was 2, which indicated that there were generally no significant differences in the means of these metrics between plots with various sizes and the 900 m2 plot.
Table 3 Frequency statistics for significant differences (α≤0.05) in paired sample t-tests for the means of the LiDAR-[...]
The LiDAR-derived height metrics varied with the areas of the field plots, but the variations differed among forest types. In all four datasets for Chinese fir forests and eucalyptus forests, there were no significant differences in the means of Hmean between plots with different sizes and the 900 m2 plot. However, for the means of hp25, hp50, hp75 and Hcv, there were a few irregular significant differences. For pine forests, there were no significant differences in the means of hp75 among the field plots with different sizes, while the means of hp25, hp50, Hmean and Hcv showed one to two significant differences in the four datasets; these significant differences all appeared in different datasets and showed no obvious regularity.
In broadleaf forests, there were no significant differences in the means of Hmean among the field plots with different sizes; the results for other metrics were similar to those for pine forests and likewise lacked any obvious regularity. The variations in the point cloud height metrics among field plots with different sizes can be summarized as follows: 1) in general, there were no significant differences in the means of the LiDAR-derived height metrics between the plots with various sizes and the 900 m2 plot, except for Hmax; 2) Hmean and Hcv seldom showed a significant difference among the plots with different sizes; 3) significant differences in laser point cloud height metrics were more frequent in pine and broadleaf forests than in Chinese fir and eucalyptus forests; 4) significant differences in the metrics representing the heights of the middle and low canopy layers (hp25 and hp50) were much more frequent than those in the metric representing the height of the middle to upper canopy layer (hp75) (mainly found in pine forests).
For plots of all sizes, the means of Hmax were significantly different from that of the 900 m2 plot, which indicated that Hmax was extremely unstable and thus was not suitable to serve as an indicator for estimating forest stand parameters (Gobakken and Naesset, 2008).
Further analysis (Table 4) indicated that 1) as the plot size increased, the standard deviations of hp50 and Hmean for all forest types decreased gradually, and when the plot size was ≥400 m2, the standard deviations of these two metrics were very close, decreasing slightly with increasing plot size; and 2) for field plots of all sizes, the standard deviations of Hcv remained almost unchanged.
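The paired comparison behind the counts of significant differences reported above can be sketched as follows. This is a minimal illustration, assuming the critical t value is supplied by the user from a t table; the function names and the Hmax values are made up and do not come from the study data.

```python
import math


def paired_t(small, large):
    """Mean, SD and t statistic of paired differences (small plot minus 900 m2 plot)."""
    d = [s - l for s, l in zip(small, large)]
    n = len(d)
    mean = sum(d) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in d) / (n - 1))
    return mean, sd, mean / (sd / math.sqrt(n))


def count_significant(datasets, t_crit):
    """Number of datasets whose |t| exceeds the two-tailed critical value."""
    return sum(abs(paired_t(small, large)[2]) > t_crit for small, large in datasets)


# Made-up Hmax values (m) for five plots in two hypothetical datasets.
large = [20.0, 22.0, 19.0, 21.0, 23.0]
datasets = [
    ([18.0, 20.5, 17.0, 19.5, 21.0], large),  # consistently lower maxima
    ([20.5, 21.5, 19.5, 20.5, 23.0], large),  # differences scattered around zero
]
T_CRIT = 2.776  # two-tailed critical t for alpha = 0.05 and 4 degrees of freedom
n_sig = count_significant(datasets, T_CRIT)
```

The first dataset mimics the behaviour reported for Hmax, where a smaller plot systematically misses the tallest trees, so the mean difference is far from zero relative to its standard error. The second mimics a metric whose differences average out.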

Density metrics
Similar to the point cloud height metrics, all density metrics (CC, dh25, dh50 and dh75) showed very small mean differences between the plots with different sizes and the 900 m2 plot for all forest types. The standard deviations of the differences in all density metrics were approximately one order of magnitude larger than their mean differences. As the plot size decreased from 600 m2 to 100 m2, the mean differences changed irregularly, whereas the standard deviation tended to increase rapidly. Fig. 2b shows the change in the mean and standard deviation of the differences in CC of the pine forest between the plots with different sizes and the 900 m2 plot in the four datasets. Fig. 2c shows the same changes in the dh50 of eucalyptus forest.
For all forest types, the mean and standard deviation of the differences in CC between plots with different sizes and the 900 m2 plot were the smallest among all density metrics. There were only two significant differences in the pine forests and one significant difference in the eucalyptus forest. The dh25 values showed one to two significant differences in the 300 m2, 200 m2 and 100 m2 pine forest plots (Table 3). These results indicated that there were no significant differences in CC and dh25 among the plots with different sizes for all types of forests. In total, 1-4 significant differences were found for dh50 between the plots with various sizes and the 900 m2 plot in Chinese fir, pine and broadleaf forests, which indicated that dh50 varied widely among the plots with different sizes in these three forest types. For Chinese fir, pine and broadleaf forests, when the plot size was less than or equal to 400 m2, four significant differences were found for dh75, which indicated that for each of these three forest types, dh75 in the plots with sizes less than or equal to 400 m2 differed from that in the 900 m2 plot in every dataset. There were also 2-3 significant differences present for dh75 in the 600 m2 plot. For the eucalyptus forest, no significant difference in dh75 was present in the 600 m2 plot, but there were 1-2 significant differences found in the plots with other sizes.
The results of the paired t-tests conducted for the density metrics of the plots with different sizes can be summarized as follows: 1) there were no regular significant differences in CC and the percentile density of the lower layer (dh25) for all forest types between the plots with various sizes and the 900 m2 plot, whereas the percentile density of the upper layer (dh75) differed (except for eucalyptus forest); 2) for dh50, all forest types other than eucalyptus forest yielded some significant differences between the plots with various sizes and the 900 m2 plot, although they were irregular.
Table 4 shows that for all types of forests, the standard deviations of the main density metrics (CC and dh50) remained almost unchanged in the field plots with different sizes.

Leaf area density metrics
Unlike the height and density metrics, the vertical structure metrics (LADmean and LADcv) for all types of forests had mean differences between plots with various sizes and the 900 m2 plot that gradually decreased as the plot size decreased, and their standard deviations increased rapidly with decreasing plot size. Fig. 3d shows how the means and standard deviations of the differences in LADcv for broadleaf forests between the plots with different sizes and the 900 m2 plot changed with decreasing plot size. When the plot size increased from 100 m2 to 900 m2, the standard deviations of LADcv for all types of forest gradually decreased; in particular, the standard deviations of LADcv were very close for the plots with areas of 400 m2, 600 m2 and 900 m2 (Table 4).
In the four datasets, the number of significant differences in the means of LADmean for pine and broadleaf forests between the plots with different sizes and the 900 m2 plot ranged from 2 to 4 (Table 3), which indicated that for these two types of forest, the mean LADmean of the plots with different sizes was quite different from that of the 900 m2 plot. For the Chinese fir forest, there was no great difference between the 600 m2 and 900 m2 plots in terms of LADmean, while in plots with other sizes, LADmean showed one to three significant differences. When the plot size was less than or equal to 300 m2, the LADmean of the eucalyptus forest was not significantly different from that of the 900 m2 field plot. LADcv showed no significant differences from the 900 m2 plot for plots less than or equal to 200 m2 in eucalyptus and broadleaf forests and for plots greater than or equal to 400 m2 in pine forest. In the plots with other sizes for these three forest types and in plots of all sizes for Chinese fir forest, LADcv values yielded 1-4 significant differences. These results suggested that the vertical structure of the stand canopy was more homogeneous for eucalyptus and broadleaf forests than for pine and Chinese fir forests.

Effect of plot size on the stand parameters of field plots
Similar to the LiDAR-derived metrics, the stand parameters (DBH, H, Hm, BA and VOL) for all types of forest had mean differences between the plots with different sizes and the 900 m2 plot that were very small and varied irregularly as the plot size decreased. However, the standard deviations of the differences were much larger and increased rapidly with decreasing plot size (Fig. 4).
Fig. 4 [...] for protocols 1, 2, 3 and 4, respectively.
The results of paired t-tests showed that for the four types of forest, the means of Hm for the plots with different sizes were significantly different from that of the 900 m2 plot in most of the datasets. Among other stand parameters, significant differences were found in only a few datasets. These results suggested that except for the means of Hm, which were significantly different between the plots with different sizes and the 900 m2 plot, the stand parameters had either no significant difference or almost no significant difference between the plots with different sizes and the 900 m2 plot.
When the plot size increased from 100 m2 to 900 m2, the standard deviations of the main stand attributes (H, VOL and BA) for all types of forest were found to decrease gradually (Table 4), which suggested that with increasing plot size, the variation in the stand parameters tended to decrease.

Effects on the performance of the prediction model of forest inventory attributes
In general, the differences in the estimated VOL and BA for all four types of forest between the plots with different sizes and the 900 m2 plot decreased with increasing plot size, and the differences in VOL were greater than those in BA. The maximal differences in VOL and BA were 7.38% and -7.6% for Chinese fir forest, -14.38% and -8.66% for pine forest, -12.57% and -9.48% for eucalyptus forest, and -10.07% and -8.20% for broadleaf forest, respectively.
In addition, with decreasing plot size, the standard deviations of the estimated VOL and BA for all forest types increased overall.
The results of paired t-tests showed that although there were some significant differences in the means of estimated VOL and BA for all four types of forest between several plots with different sizes and the 900 m2 plot in certain datasets, these differences were irregular; in general, the means of estimated VOL and BA for the plots with different sizes were not significantly different from those of the 900 m2 plot. However, after averaging the goodness-of-fit and accuracy statistics of the VOL and BA estimation models for all four types of forest in the plots with different sizes over the four datasets, we found that as the plot size increased, the R2 of the VOL and BA prediction models for all four types of forest increased gradually, while both rRMSE and MPE decreased gradually (Table 5). When the plot size was 900 m2, R2 was maximum, and rRMSE and MPE were minimum. As the plot size increased from 100 m2 to 900 m2, the accuracy of the VOL and BA estimation models gradually improved.
When the plot size increased from 100 m2 to 200 m2, the increases in R2 of the VOL and BA estimation models for all types of forest were maximal, and the decreases in rRMSE and MPE were also maximal. For all types of forest, when the plot size was larger than or equal to 200 m2, the rRMSE and MPE of the VOL and BA estimation models showed almost the same decreases.
In general, there was a good power function relationship between the rRMSEs of the VOL and BA estimation models for all four types of forest and the sizes (ha) of the plots (Fig. 5).
[...] mean height, mean diameter, basal area and stand volume, showed no significant differences or almost no significant differences.
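The goodness-of-fit statistics used in the results above to compare plot sizes, and the power function relating rRMSE to plot size, can be sketched as follows. The definitions are the common textbook ones and are an assumption here (the exact formulas used in the study are not given in this section). MPE is left out because its definition is not stated, and all numbers are made up for illustration.

```python
import math


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rrmse(obs, pred):
    """Root mean square error as a percentage of the observed mean."""
    mse = sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)
    return 100.0 * math.sqrt(mse) / (sum(obs) / len(obs))


def fit_power_law(x, y):
    """Fit y = a * x**b by ordinary least squares on log-transformed values."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    b = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / sum((u - mx) ** 2 for u in lx)
    return math.exp(my - b * mx), b


# Made-up observed and predicted VOL values for one plot size.
obs = [100.0, 150.0, 200.0, 250.0]
pred = [110.0, 140.0, 210.0, 240.0]
fit_r2, fit_rrmse = r_squared(obs, pred), rrmse(obs, pred)

# Plot sizes in ha and rRMSE values generated from rRMSE = 10 * size**-0.25.
sizes = [0.01, 0.02, 0.03, 0.04, 0.06, 0.09]
a, b = fit_power_law(sizes, [10.0 * s ** -0.25 for s in sizes])
```

A negative exponent `b` corresponds to the reported pattern of rRMSE falling as plot size grows, with the steepest drop at the smallest sizes.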
We inferred that the complex compositions of forest tree species, their uneven distributions and their differences in growth led to heterogeneity in the vertical structure (e.g., single-layer and multilayer forests) and horizontal structure (gaps and forest trees with different diameters). This heterogeneity resulted in uneven vertical and horizontal distributions of laser point clouds, which caused the differences in the LiDAR-derived metrics of the plots with different sizes mentioned above. Specifically, 1) since the vertical and horizontal structures of forest stands at different sites were different and the number of laser points decreased with decreasing plot size, the heterogeneity of the vertical and horizontal distributions of the laser point clouds in plots with different sizes increased. Although the mean differences were small, their standard deviations increased. 2) Although the trees in a stand were planted in the same year, they did not grow at the same rate; thus, the stand canopy surface was always uneven. When the plot size increased, the probability of finding taller trees increased, which further increased the heterogeneity of the middle and upper canopies. [...] or stand parameters demonstrated no regular significant differences among plots with various sizes, a few significant differences occurred in plots with some sizes for certain forest types.
The abovementioned variations in LiDAR-derived metrics and stand parameters in the plots with different sizes for all types of forests and the analysis of these variations could help explain how the plot sizes affected the performance of the forest parameter estimation model.
In previous studies that addressed how plot size affected the accuracy of estimating forest parameters with LiDAR, most of the field plots were circular.
By setting concentric plots with different diameters (Gobakken and Naesset, 2008) or using a compass or an electronic total station for tree positioning, these analyses were conducted by simulating field plots in the shape of concentric circles (Watt et al., 2013; Ruiz et al., 2014). The benefits were that different sized field plots had the same center, and they completely overlapped near the center point. These features meant that the plot data were highly comparable. The drawback was that identifying the field plot boundaries was difficult. In particular, tropical and subtropical mountainous or hilly terrain is characterized by great changes in slope and lush understory vegetation, so errors in the measurement of boundary trees were likely to increase. Highly accurate positioning of sample trees was also required. In this study, we employed 30 m×30 m square plots. Various plots with areas of 100 m2, 200 m2, 300 m2, 400 m2, 600 m2 and 900 m2 that each had six combinations of quadrats were selected for analysis. The advantage of this method was that it enabled simple and accurate boundary location, which effectively guaranteed plot data precision. The disadvantage of this method was that due to inadequate overlap between field plots, the common portion was not located in the center of the [...]
The authors declare that they have no competing interests.