An Approximate Point-Based Alternative for the Estimation of Variance under Big BAF Sampling

Background: A new variance estimator is derived and tested for big BAF (Basal Area Factor) sampling which is a forest inventory system that utilizes two BAF sizes, a small BAF for tree counts and a larger BAF on which tree measurements are made usually including DBHs and heights needed for volume estimation. Methods: The new estimator is derived using the Delta method from an existing formulation of the big BAF estimator as consisting of three sample means. The new formula is compared to existing big BAF estimators including a popular estimator based on Bruce’s formula. Results: Several computer simulation studies were conducted comparing the new variance estimator to all known variance estimators for big BAF currently in the forest inventory literature. In simulations the new estimator performed well and comparably to existing variance formulas. Conclusions: A possible advantage of the new estimator is that it does not require the assumption of negligible correlation between basal area counts on the small BAF factor and volume-basal area ratios based on the large BAF factor selection trees, an assumption required by all previous big BAF variance estimation formulas. Although this correlation was negligible on the simulation stands used in this study, it is conceivable that the correlation could be signiﬁcant in some forest types, such as those in which the DBH-height relationship can be aﬀected substantially by density perhaps through competition. We also mathematically derived expressions for bias in the big BAF estimator that can be used to show the bias approaches zero in large samples on the order of 1 n where n is the number of sample points.


Background
The big BAF (Basal Area Factor) estimator for forest inventory is based on horizontal point sampling (HPS) using angle gauges having two different BAFs at each sample point in the field-the smaller BAF angle gauge BAF c is used to obtain a count of qualifying trees at each sample point and the larger BAF angle gauge BAF v is used to select sample trees on which careful measurements are made. Usually these careful measurements include measurement of DBH and height from which tree volume, weight or biomass can be estimated (Bell et al, 1983;Bruce, 1961;Oderwald and Jones, 1992). Although big BAF sampling is a form of double sampling, as Marshall et al (2004) indicated, it differs from some common forms of double sampling in forest inventory such as double sampling with regression estimators because both samples are taken at each point, so that the small sample is not simply a subset of the large sample point locations (discussions of double sampling in forest inventory include Gregoire and Valentine (2008, p. 262) and de Vries (1986, p. 164) among others).
As indicated by Iles (2012) the history of big BAF sampling may go back to Grosenbaugh (1952, p. 53). An early proposal for using two prism factors in big BAF sampling was given by Bell et al (1983, p. 702) and later a detailed description of big BAF sampling was given by Marshall et al (2004). A detailed review of the history of big BAF sampling was given a recent treatment by Gove et al (2020) who compared variance estimation methods which have been proposed for the method. Recent texts of forest sampling and mensuration which include descriptions of big BAF sampling include Gregoire and Valentine (2008, p. 268) and Kershaw et al (2016, p. 377).
Big BAF methods have been used operationally to inventory forests in the USA and Canada in both western and eastern forest types (Corrin, 1998;Desmarais, 2002). Brooks (2006) compared combinations of 13 "big" BAFs and 6 "small" BAFs to an inventory using fixed-radius plots. Rice et al (2014) compared a number of forest sampling methods including big BAF, HPS with various BAFs, horizontal line sampling and fixed-radius plot sampling. These studies were conducted in partial harvests in mixed species Acadian forests of northern Maine. A comparison of the results of these forest inventories showed that only the smallest BAF for HPS had a standard error smaller than the big BAF inventory. Methods for determination of optimal sampling plans for big BAF were described by Yang et al (2017). These methods allow for choice of optimal combinations of BAFs and sample sizes for big BAF according to economic criteria. Chen et al (2019) used these results to devise practical cost-efficient plans for estimation of forest carbon content using big BAF sampling for forest populations in the northeastern USA. Yang and Burkhart (2019) compared big BAF sampling to two other methods of subsampling count trees on point samples using simulated loblolly pine (Pinus taeda L.) plantations and found all three methods were satisfactory for estimating stand volume.
Despite the successes and evident utility of big BAF sampling variance estimation remains challenging. Moreover, the basic estimator associated with big BAF sampling is not, itself, design-unbiased. Gove et al (2020) showed that one of the traditional methods for estimation of the variance in big BAF sampling could be derived using the Delta method if the covariance terms are assumed to be negligible. As described by Gove et al (2020) the history of the Delta method, which is based on using a Taylor series approximation for nonlinear functions of random variables, has been traced by Ver Hoef (2012). Wolter (2007, p. 231) states that utilization of the method with a first-order approximation as done in this article has often provided satisfactory variance estimates for large complex surveys. The primary objective of this study is use of the Delta method to derive a new variance estimation formula for big BAF sampling that takes correlations among important sampling variables into account. As indicated below the traditional approaches to variance estimation for big BAF sampling ignore possible covariances between count basal area obtained by using BAF c and the sample mean volume per tree on trees sampled with BAF v . An additional objective of this study is to test this newly derived estimator using Monte Carlo simulations to compare it with the variance estimators that have been previously proposed for big BAF sampling. We also derive expressions for the bias associated with the big BAF estimator.

Basic big BAF estimation formulas
Two BAF factors are needed for big BAF, a small BAF factor F c used to select trees which are counted but not measured and a larger BAF factor F v used to select trees on which measurements are made usually including dbh and height for volume, weight or biomass estimation.
To obtain the big BAF estimator we first express the volume to basal area ratio (VBAR) for each tree i selected by the larger BAF, F v : where v i is the volume of tree i and b i is the basal area of tree i where there are i = 1, 2, ..., m vs trees on each of a sample of s = 1, 2, ..., n points (e.g. Kershaw et al, 2016, p. 377). The average VBAR is then (Gregoire and Valentine, 2008, p. 258): where m v = n s=1 m vs is the total number of volume trees sampled on all points. Note that it is theoretically possible that the same tree may be sampled from more than one point and thus could possibly be counted multiple times. The average basal area per hectare for the entire sample is; where m c = n s=1 m cs and m cs is the number of count sample trees counted at point s using the smaller angle gauge BAF c . Total basal area on a tract of size A is then: The big BAF volume estimator can then be obtained by multiplying the sample mean VBAR by the count-based basal area as: Gove et al (2020) and Iles (2012) have indicated that the variance for the big BAF volume estimator cannot be obtained by using the traditional formula for double sampling from survey sampling theory and practice. This is because the smaller sample in big BAF sampling is not a smaller selection of the total number of sample points but instead the smaller sample actually occurs at each sample point, thus the point-wise sample sizes utilized by the sample survey double sampling formula are equal, which is contrary to the requirements of that formula. However, because the estimator above is a product of two random variables one may employ standard methods for expressing the variance of a product. By applying the formula of Goodman (1960) for the variance of a product as noted by Gove et al (2020) one obtains: This formula assumes that the covariance betweenV andB c is zero or negligible for practical purposes. This may be expected to be true if VBARs are not greatly affected by density in the forest population being sampled. The variance estimators ofV andB c are given by Gregoire and Valentine (2008, p. 256-257) as: and var B c = 1 n(n − 1) The percent standard error can be found using the formula of Bruce (1961): As indicated by Gove et al (2020), Marshall et al (2004), and Gregoire and Valentine (2008, p. 259) Bell and Alexander (1957) were the first to present the version of the product variance for standard error computation presented above. Gove et al (2020) discussed the historical background of this formula and show how it can be derived using the Delta method (Ver Hoef, 2012) which is based on a Taylor's series approximation and has often been used to approximate the variance of a function of one or more random variables (e.g. Kendall and Stuart, 1977, p. 247, equation 10.12). It has been noted by Gove et al (2020), Gregoire and Valentine (2008, p. 259), Marshall et al (2004) and Iles (2012) that there is close agreement between the variance derived from (11) and the variance derived from Goodman's formula in equation (6) because the third term in the latter equation is typically small and dominated by the other two terms. Gregoire and Valentine (2008) (equation 8.33) have noted that the big BAF volume per hectare estimator can be formulated as follows: This formulation of the Big BAF estimator is based on three sample means. Gregoire and Valentine (2008) discuss alternative estimators for big BAF sampling including estimators based on Bruce's traditional formula (Bell and Alexander, 1957) and the Goodman (1962) formula for the variance of products of random variables. Equation (7) (same as equation 6 of Gove et al, 2020) computes the variance of the mean volume basal area ratio as a mean of ratios. However the number of individual tree ratios m v is a random variable. In the classic formulations of the mean of ratios estimator in the context of design-based sample survey sampling (e.g. Schreuder et al, 1993, p. 89) the number of sample ratios is fixed rather than random. As recognized by Gregoire and Valentine (2008, p. 258-259) equation (12) provides the opportunity to formulate the average volume basal area as the ratio of means becauseV v is the mean volume per sample point andB v is the mean basal area per sample point when the large basal area factor F v is used for tree selection A classical estimate for the variance of this estimated ratio according to equation 6.13 in Cochran (1977, p. 155) is: As indicated by Sukhatme et al (equations (42) and (43) 1984, p. 99) this variance estimator is algebraically equivalent to the estimator suggested by Gregoire and Valentine (2008, p. 259). The equation above could then be used as alternative to estimate the variance of the average VBAR in Goodman's variance formula (6) instead of the more traditional formula (7) for the estimated variance of the average VBAR. Thus this formula is equivalent to equation (12) in Gove et al (2020) and was included in the simulations presented there. Those simulations are replicated here in order to compare them to the results from a new equation described below and termed the point-wise Delta method.

Bias in the big BAF estimator
The big BAF estimator has been described above and in Gregoire and Valentine (2008, equation 8.33) as based on three sample means. Two of the sample meanŝ B c adnB v provide design-based estimates of basal area per acre while the thirdV v provides a design-based estimate of volume per hectare, albeit with typically with a high variance. However their combination forms the big BAF estimator with a much lower variance but which is not design-unbiased. Here we provide equations giving a simple approximation for the bias as well as an exact upper bound to the bias associated with big BAF sampling.

Approximate bias
In Appendix equation (A.11) we use a bias approximation formula from Seber (1982, p. 7) to derive the following approximation for the bias in big BAF sampling: Note that all the quantities in the bias expression above are population constants with respect to changing sample size except the sample size n. Thus as n goes to infinity the above expression for bias goes to zero. Bias approaching zero with increasing samples size on the order of 1 n is similar to the behavior of the standard ratio estimator according to Cochran (1977, p. 160). Note that if the difference between the large basal area factor BAF c and the small basal area factor BAF v is small the covariance betweenB cs andB vs approaches the variance forB vs so that the first two terms approach cancellation and similarly for the last two terms, so that bias will be also be lessened as the difference between basal area factors BAF c and BAF v becomes smaller. For a given sample size the bias will also be smaller for forests with high levels of basal area B than for forests with low levels of basal area. This bias expression is very similar to the bias that would be obtained from equation 11 of Palley and Horwitz (1961) for the Bell and Alexander (1957) estimator which can also be expressed as the ratio of two HPS sample means divided by a third sample mean. An important difference is that there are two point-wise sample sizes in the Bell and Alexander (1957) estimator, one being a point-wise subsample. Therefore some of the variances and covariances for the Palley and Horwitz (1961) bias formula and variance estimator of the Bell and Alexander (1957) volume estimator are based on the smaller subsample size while others are based on the total sample size.

Exact bias
In the Appendix an expression for the exact bias in big BAF sampling equarion (A.15) is derived based on methods used by Hartley and Ross (1954) to find the exact bias of the standard ratio estimator (also see Cochran, 1977, p. 162) This formula also seems to indicate that the bias will tend to be smaller in stands having higher basal area. Again equation (A.21) was derived in the Appendix following the methods of Hartley and Ross (1954) resulting in the following upper bound on the absolute relative bias in the big BAF estimator (also see Cochran, 1977, p. 162): This formula indicates that the bias relative to the standard error of the big BAF estimator approaches zero as sample size n becomes large, on the order of 1 √ n . This is also the case for the standard ratio estimator according to Cochran (1977, p. 160).
The Delta method for big BAF variance based on three sample means Previous approaches to variance estimation for big BAF sampling view the estimator as the product of two random variables, the count basal area per hectare and the mean volume basal area ratio. As indicated above, these approaches have assumed that the covariance between count basal area per hectare and the mean volume basal area ratio VBAR is negligible. However, if we do not wish to make that assumption, an alternative is to use the Delta method (e.g. Kendall and Stuart, 1977, p. 247, equation 10.12), to approximate the variance of the big BAF estimator (12) in the form presented by Gregoire and Valentine (2008, equation 8.33) indicated above as a function of three sample means. On the basis of a Taylor series, the Delta method approximates the variance of a function g(θ) of random variables θ = (θ 1 , θ 2 , . . . , θ n ). Now, since the population parameters are generally unknown, the unbiased estimators,θ are used; viz., or, assuming independence. . .
In addition in typical applications it is necessary to estimate the variance and covariance terms. In this section we will assume without loss of generality that A = 1. Let us define the function g in the formula for the Delta method as follows: The Delta method requires the following three partial derivatives: Applying the Delta method and substituting estimates for variances, covariances and means we then have: The estimated variance of the volume per a unit area based on the large BAF angle gauge alone is which is the total volume at sample point s and The estimated variance for the basal area per hectare based on the large angle whereB vs = m s F v is the basal area per hectare at point s with the large basal area factor BAF v . The estimated variance var B c for the basal area per hectare based on the small angle gauge F c is given by equation (8). Now since equation (24) utilizes covariance terms, we present the computational formulas for these. Recall the relationship between the sample covariance and the estimated covariance between sample means based on n independent samples is: Using this relationship the estimated covariance betweenB c andV v is: and the estimated covariance betweenB v andB c is: As is shown in Supplementary Materials equations S.3-S.7 variance estimator (24) can also be derived as a special case of an estimator presented by Hansen et al (1953, equation 9.6, p. 512-514) for the variance of a ratio between the product of k random variables and the product of p − k random variables (also Wolter, 2007, equation 6.4.1, p. 233-234). The variance equation can be simplified by noting that the two basal area esti-matorsB v andB c have the same expected value B and the variance ofB c is likely to be smaller because it is based on the smaller BAF c which selects more trees per point. In the original true variance approximation the coefficients multiplied by variances and covariances are functions of parameters which we must estimate when obtaining the approximate variance estimator. This justifies substitution of B c forB v in the variance formula above because they have the same expectation. Making this substitution and factoring out sample size n the variance formula can be simplified to: We have derived this variance estimation formula under the assumption that A = 1 so the variance estimate for an entire tract of area A can be obtained by multiplying by A 2 or alternatively expressingV v in total tract units rather than as per hectare. Note that because we have factored out a quantity of 1 n the estimator above is a function of the sample variances and covariances among sample point HPS estimates.
The variance estimator above is very similar to equation (12) of Palley and Horwitz (1961) which they obtained for the Bell and Alexander (1957) estimator which was essentially double sampling with a ratio estimator. However an important difference is that the Bell and Alexander (1957) estimator consists of a large sample of points on which basal area counts are made and a subsample of points on which tree volumes are also determined. By contrast for big BAF sampling the volume subsample is made on every point so there is no smaller point-wise sample. Therefore some of the variances and covariances for the Palley and Horwitz (1961) variance estimator of the Bell and Alexander (1957) volume estimator must be determined on the subsample which is smaller than n, but for big BAF sampling all the variances and covariances have the same point-wise sample size of n. A consequence is that for the big BAF variance estimator we cannot further simplify the variance estimator above by utilizing the ratio of large-to-small point-wise sample size as was done by Palley and Horwitz (1961). It should also be noted that technically the formula above is an estimator of the mean squared error, but because we will show in the Results section that the bias is extremely small we will continue to refer to it as a variance estimator.

Simulation trials
We used two simulated forest populations that were previously employed by Gove et al (2020) to compare traditional and previously proposed big BAF sampling variance estimators. The sampling simulation program sampSurf Gove (2012) which was written in R (R Core Team, 2021) was used to conduct the simulations. The concept of "sampling surface" (Williams, 2001a,b) was used to construct the samp-Surf simulator in which a raster tract of area A is tessellated into square grid cells. Trees are located on the tract and inclusion zones are established for each tree based on the sampling procedure (horizontal point sampling for these simulations). A sample point is considered to be located in the center of each grid cell. Totals for each grid cell are based on the attributes of trees whose inclusion zones contain the sample point at the grid cell center. The sampling surface is developed based on the total attributes values over all the grid cells. For our simulations square tracts were used with grid cells 1 m 2 in size.
Nine sets of simulations were conducted using every combination of BAF pairs (F v and F c ) where F c ∈ {3, 4, 5} and F v ∈ {10, 20, 30} for both forest populations. For each sampling simulation sampling surfaces were developed for total basal area and total volume using every combination of F c and F v resulting in 36 simulation surfaces. A Monte Carlo experiment was conducted for each of the 9 BAF factor combinations in which random samples of n = 10, 25, 50 and 100 were drawn with 1,000 replications. Summary statistics were computed for HPS and big BAF sampling for each sample on each sampling surface. The statistics available for each BAF combination made it possible to compare the big BAF results to an HPS sample using F c in which every sample tree was measured for volume (e.g., DBH and height measured).

Mixed northern hardwood population
The mixed northern hardwood population is the same one used by Gove et al (2020). The population is artificially constructed but resembles what could typically be found in a mixed northern hardwoods forest. It is established on a tract having a area of A = 3.17 ha and containing 31,684 grid cells. The tract is bounded by an external buffer 18 m wide so that the portion of the tract containing the tree population internal to the buffer has an area of 2 ha. A population of m = 667 trees with a total basal area of 48.4 m 2 was established within the tract boundaries. This is approximately equivalent to 333 trees/ha and a basal area of 24.2 m 2 /ha with a stand quadratic mean diameter ofD q = 30.3 cm. According to northern hardwoods stocking guides by Leak et al (2014) the stand would be in fully stocked condition. A three-parameter Weibull distribution (Bailey and Dell, 1973) was used to assign tree DBHs, with location, scale and shape parameters respectively being α=10 cm, γ = 2 and ζ = 30 cm. Total heights for each tree in the simulated northern hardwoods stand were assigned using the all-species DBH-height equation by Fast and Ducey (2011) for northern hardwoods in New Hampshire converted to metric units. A normal random error term with mean zero and standard deviation 2.5 m was added to each height prediction. A spatial inhibition process (Venables and Ripley, 2002, p. 434) with an inhibition distance of 3 m was used to assign trees to spatial locations within the simulated northern hardwoods forest tract. The method of Masuyama (1953) for boundary overlap correction was used in which tree inclusion zones were allowed to overlap into the buffer region (Gregoire and Valentine, 2008, p. 224). Because random sample points can fall anywhere in the tract which includes the buffer region, each tree has a complete inclusion zone.
The following taper function is used within the sampSurf simulation (Van Deusen, 1990): where D u is the top diameter at tree stem height h, D b is the tree stem butt diameter and 0 ≤ h ≤ H is tree height. The value of the taper parameter r was randomly selectied for each tree from the range r ∈ [1.5, 3]. With the taper function above a neiloidal form results if 0 < r < 2, a cone if r = 2 and a paraboic form if r > 2. The taper function for each tree was used to compute individual tree volume according to the procedures of Gove (2011a, p. 8). There was a correlation coefficient ρ(V, b) = 0.62 between individual tree VBAR and basal area in the simulated northern hardwoods population. Figure  Eastern white pine population The eastern white pine (Pinus strobus L.) population used by Gove et al (2020) was also used in this study. Gove et al (2000) describes data collection for the eastern white pine based on Barr & Stroud FP-12 dendrometry over a 20-year period. These data were obtained from pure even-aged white pine forest stands in southern New Hampshire. Data processing utilized the R Dendrometry package (Gove, 2011a). The white pine population used for simulations consists of m = 316 white pine trees with multiple measurements on some during the period. Trees were located within a 1 ha tract having an 18 m wide buffer and having a total area of A = 1.85 ha in size with 18,496 grid cells. The population has a basal area of 47.2 m 2 and a quadratic mean DBH ofD q = 43.6 cm. According to the Leak and Lamson (1999) white pine stocking guide the tract is solidly in the full stocking range. The trees were originally measured in several different stands without location information.
To assign trees spatial locations for the simulation stand, a spatial inhibition process having an inhibition distance of 3 m was employed similarly to the northern hardwoods stand discussed above. As with the northern hardwoods stand, Mayasuma's method was used to correct for boundary overlap in point sampling, so that randomly located sample points were permitted to fall into the buffer strip surrounding the 1 ha white pine tract. No taper function was required for the white pine stand because dendrometry measurements were available for upper-stem taper on each tree. As described by Gove (2011b) a cubic spline was fitted to tree dendrometry measurements. Smalian's formula (Kershaw et al (2016, p. 241)) was used to calculate individual tree volumes. Figure

Big BAF estimator bias
We derived two expressions for the bias in big BAF sampling, equation (15) which is an approximation to the bias and (16) which is an exact expression of the bias. An indication of the bias in big BAF sampling is shown in Figure 1 which was prepared using equation (15) with variance and covariance values computed from all the lattice points on the sampling surfaces for the northern hardwoods and white pine populations. Use of all the lattice points in these computations should provide a very close approximation to true population values. Only sampling plans with F c = 3 are presented because results from the other values of BAF c used in this study are very similar and the bias with F c = 3 is larger than other values of BAF c used in this study by a very small amount. Figure 1 shows that the bias is quite small even for n = 10 which is 0.17% for the white pine population. Bias percentages decline steeply with increasing sample sizes and are essentially negligible for all sample sizes equal to or greater than 10 for both the white pine and northern hardwoods population. As expected bias also declines as BAF v approaches the value of F c = 3.

Population sampling surfaces
The results concerning the sampling surfaces for the Northern Hardwoods and the White Pine populations were given by Gove et al (2020) in Table 1 of that paper. As expected the results for basal area and volume from the sampling surfaces were quite close to the actual population values. The white pine population had higher stocking and volume per hectare than the northern hardwoods population as would be typical for fully-stocked stands in the New England, USA region. Gove et al (2020) noted that the northern hardwoods DBH distribution was more positively skewed than the white pine DBH distribution. Tree heights in the white pine populations were generally taller than the northern hardwoods population which was likely the primary reason that the volume per tree in the white pine was considerably greater than that for the northern hardwoods population. Figure S.1 in Supplementary Material shows sampling surfaces for the northern hardwoods population for (a) total BAF c with F c = 3 basal area and (b) total BAF c volume with F v = 30. A realization of a Monte Carlo sample consisting of n = 100 is also indicated with each point denoted by a red "×". Figure S.2 in Supplementary Materials indicates the population correlations ρ for the northern hardwoods population between important variables such as the basal area and volume on the count and volume points and the volume for all 6 combinations of point sampling BAFs used in the simulations. As might be expected the correlation between basal area and volume when using the same BAF factor is close to 1 and fairly constant over variation in the range of BAF v for big BAF sampling. As also might be expected the correlation between two variables when one is sampled on BAF c and the other is sampled on BAF v declines with increasing count BAF c and ranges from 0.83 to 0.52. Covariances between these variables are displayed on Figure Figure 2 displays the standard error results from Monte Carlo simulations comparing Goodman's Method (equation (6)), the "traditional" Delta method (equation (11)), the new point-based Delta method (equation (24)) and simplified point-based Delta method (equation (33)) for the northern hardwoods population. For total sample size of n = 100 the standard errors of the four methods are virtually indistinguishable for all three count BAFs, F c = 3, F c = 4 and F c = 5. As expected standard errors decline for all four variance approximation methods as the BAF v declines and more closely approaches the value of BAF c . For reference purposes the standard errors for HPS using all trees selected on the count BAF c are displayed. In Supplementary Material it is indicated in Figure S.7 that this HPS standard error is extremely close to the standard error among big BAF simulation estimates for all values of BAF c and BAF v utilized in this study. Because these were so extremely close we present the HPS standard errors only for comparison to the results for big BAF sampling in Figure 2 and Figure 3. It is a remarkable fact that HPS with measurement of all trees for volume determination provides no appreciable reduction in variance in volume estimation compared with big BAF in which only a much reduced subsample of trees are measured at each point.

Monte Carlo Simulations Standard Error Comparisons
As the total point-wise sample size n decrease from n = 100 to n = 10 separations between the standard errors given by the four approximations become more evident with Goodman's method being slightly below the Delta method and the point-based Delta method standard error being lower than either of the other three approximations particularly for n = 10 and the largest value of BAF v . The simplified point-based Delta method is consistently lower than the other three methods. This makes it closer to the reference line indicated for HPS except for n = 10 where simplified point-based Delta method underestimates compared to the reference line for the two smaller values of BAF v . However it should be noted that even in the latter case the difference between the point-based Delta method and simplified point-based Delta method compared to the traditional Delta method is only in the range of 5% even for n = 10 and the largest value of BAF v . In this latter case both point-based Delta method and simplified point-based Delta method are within 2% of the reference line for HPS with point-based Delta method overestimating and simplified point-based Delta method underestimating for the two largest values of BAF v . For samples sizes equal to or larger than n = 25 which are more likely to be representative of typical big BAF sampling plans all four variance estimators are consistently within 2% of each other becoming closer as samples size n becomes larger. Figure 3 contains the white pine population Monte Carlo simulation standard error results. As was the case for the northern hardwoods population, the traditional Delta method (equation (11)), Goodman's method (equation (6)), the point-based Delta method (equation (24)) and the simplified point-based Delta method (equation (33)) are displayed in the figure. However, the maximum difference between the standard errors of the three methods was about 8% for n = 10 with differences at n = 100 being only about 0.5%. In the case of n = 10 both point-based Delta method and simplified point-based Delta method were within approximately 2% of the HPS reference line but point-based Delta method was an overestimate while simplified point-based Delta method was an underestimate. Consistently the lowest estimate of standard error was provided by simplified point-based Delta method which made it closer to the HPS reference line than the other methods except in the case of n = 10 where it was somewhat lower than the HPS reference line. As expected, standard errors for each given level of n and BAF c mostly decline with decreasing levels of BAF v though there are some slight exceptions in the case of point-based Delta method for n = 10 These trends are similar to those for the northern hardwoods population. However, the standard errors from the northern hardwoods population ranged from about 32% for n = 100 to 100% for n = 10 ( Figure 2) while the standard errors associated with the white pine population were substantially greater ranging from approximately 50% for n = 100 to 160% for n = 10 ( Figure 3). Figure 4 depicts the confidence interval capture rates on the northern hardwoods population for the 1,000 replications of Monte Carlo simulation for big BAF standard error estimates obtained using the traditional Delta method (11), Goodman's method (6), the point-based Delta method (24) and the simplified point-based Delta method (33). The results in this figure are based on the percentage of simulation trials in which a 95% confidence interval for total volume from the big BAF trial contains the true mean total volume of the simulated population. Thus a capture rate of 95% would be ideal. For n = 100 and n = 50 the capture rates for the traditional Delta method Goodman's method the point-based Delta method and the simplified point-based Delta method are very close for all values of BAF c and BAF v all ranging between 94.1% and 95.4%. In some cases such as n = 100 with F c = 3 and F c = 4 the big BAF capture rates are even closer to 95% than the capture rate for HPS with the given value of BAF c . For n = 25, F c = 3 and F v = 10, the point-based Delta method and the simplified point-based Delta method have a capture rate slightly lower than the other two big BAF standard error estimators but all capture rates are between 93.5% and 95%. All capture rates were between 93.5% and 95.3% for n = 10 with the point-based Delta method and the simplified point-based Delta method being lower than the other two big BAF methods for F v = 30 and the simplified point-based Delta method being somewhat lower for all values of BAF v otherwise the three methods were extremely close. In summary the four big BAF standard error estimates produced confidence interval capture rates that were very similar as might be expected from the fact that they produced vary similar variance estimates as indicated by Figure 2. Figure 5 displays the confidence interval capture rates for the white pine population using confidence intervals constructed with standard errors based on the traditional Delta method (equation (11)) Goodman's method (equation (6)) the point-based Delta method (equation (24)) and the simplified point-based Delta method ((33). Similarly to the northern hardwoods population, simulations with the white pine population produced confidence interval capture rates between 91.6% and 96% for all combinations of F c = 3, 4 and 5, F v = 10, 20 and 30, and sample sizes n = 10, 25, 50, and 100. As well the capture rates for a conventional HPS with BAF c are very similar to the capture rates with the big BAF approaches. For n = 100 the three big BAF capture rates were all very close to each other and between 93.7% and 94.8%. It is perhaps surprising that for n = 100 the largest deviation from the ideal capture rate of 95% were the results for the conventional HPS with BAF c which was lowest among all 5 methods ranging from 93.2% to 94.5% for the case of BAF c = 5. For n = 100 the capture rates for the point-based Delta method and the simplified point-based Delta method were very slightly lower than those for the other two big BAF methods. For the n = 50 samples size the capture rates for the big BAF methods as well as the conventional HPS with BAF c were all between 94.7% and 96%. Again the capture rates for the point-based Delta method and the simplified point-based Delta method were slightly lower than for the other two big BAF methods which in this case made them slightly closer to 95%. Capture rates for the conventional HPS with BAF c were extremely close to 95% and sightly lower than the big BAF methods for F c = 3 and F c = 4 but for F c = 5 the HPS capture rates were slightly higher than those for the point-based Delta method and the simplified point-based Delta method and slightly lower than those associated with the other two big BAF methods. Similar results were obtained for n = 25 with the point-based Delta method and the simplified point-based Delta method having nearly equal or slightly lower capture rates than the other two big BAF methods and all capture rates being between 94.7% and 96%. Capture rates for conventional HPS with BAF c were slightly lower or nearly equal to the three big BAF methods for n = 25. In the case of n = 10 all capture rates ranged between 91.6% and 94.6% with the point-based Delta method and the simplified point-based Delta method once again trending somewhat lower than the other two big BAF methods. Estimated correlations between basal area obtained from BAF c and BAF v ρ(B cs , B vs ) and correlations between BAF c and volume estimatesρ(B cs , V vs ) were very close and within each sample size declined with increasing values of BAF v . These estimated correlations are generally somewhat higher for the white pine populations for a given value of n, BAF c and BAF v . For the northern hardwoods population estimated correlations range between 0.83 and 0.52 and decline with increasing BAF v . As indicated estimated correlations in the white pine simulations tended to be higher than for the northern hardwoods population and ranged between 0.91 to 0.67.

Discussion
Results from inspection of Figure 1 indicated that the bias in the big BAF estimator is quite low for both the northern hardwoods and the white pine populations used in this study. For larger values of n the bias approaches zero and is especially low for the northern hardwoods population. The great majority of big BAF forest sampling plans in practice would be expected to have 10 or more sample points. Perhaps an exception might be a small stratum in a stratified sample design. While it would be possible to estimate bias from sample statistics using equation (15) it should not be necessary given the very low values of bias obtained in this example, especially for the larger values of n that are commonly used in practical applications of big BAF sampling. Furthermore we are not aware of any instances reported in the literature of big BAF sampling in which bias has presented problems for practical applications.
It should be noted that compared to the traditional Delta method (11), Goodman's method (6) contains a negative term which causes it to be smaller than the traditional Delta method although the difference is quite modest according to figures 2 and 3. Both the traditional Delta method and Goodman's method omit covariance terms between variables used in the estimation process while the point-based Delta method (24) and the simplified point-based Delta method (33) do account for covariances. This may be the reason that the point-based Delta method and the simplified point-based Delta method standard error estimates are somewhat smaller for the smaller sample sizes (especially n = 10) in Figures 2 and 3. Gove et al (2020) have previously compared the traditional Delta method equation (11) to Goodman's equation (6) with the same results presented here for these two variance estimation methods. However they did not include the point-based Delta method equation (24) or the simplified point-based Delta method (33) in their simulations so simulations of the traditional Delta method and Goodman's equation were included in the present study so that the performance of the point-based Delta method and the simplified point-based Delta method could be compared to those previously-developed variance estimators.
The impact of the correlations discussed above which is included in the pointbased Delta method and the simplified point-based Delta method but neglected in the two traditional variance estimation methods was apparently small for the simulated populations tested here. However, it is conceivable that these correlations could have a larger effect in some natural populations. Because of the way the artificial tree populations were constructed for this article, the possible effects of local variations in stand density on tree dimensions and the tree DBH-height relationship were minimized. For some species density variations may affect the DBH-height relationship thus inducing some degree of correlation between volume per tree and basal area per hectare. For example suppose an even-aged natural pine stand has extreme density variations so that DBHs tend to be smaller in relation to height in locally dense areas but larger in relation to height in areas where density is substantially less. This could induce a negative correlation between basal area and volume per tree because higher basal area regions would have less volume per tree than lower basal area regions. From this point of view the point-based Delta method and the simplified point-based Delta method may be a more conservative approaches because they do account for covariances of the kind just discussed. As well the correlation terms present in the point-based Delta method (24) and the simplified point-based Delta method (33) may provide additional opportunities to investigate the effects of forest stand structure on big BAF variance.
The point-based Delta method and the simplified point-based Delta method result in a computational equations (24) and (33) which are longer and more complex than Goodman's equation (6) or the traditional Delta method (11). However the formulas for the point-based Delta method or the simplified point-based Delta method can easily be coded in programming languages such as R (R Core Team, 2021) as was done for the simulations reported here or in a spreadsheet. It is perhaps becoming rare to rely on "hand" calculations with data entered in calculators to perform the computations required for a forest inventory. Once the point-based Delta method or the simplified point-based Delta method has been coded in a programming language or a spreadsheet template, computations required for the method should not be a barrier to its use.
Instances in which the variances of the point-based Delta method and the simplified point-based Delta method were associated with lower confidence interval capture rates than Goodman's or the traditional Delta method are associated with simulation parameters for which the estimated standard error for the point-wise Delta method were lower than standard errors from the other two methods. As indicated above this may possibly be due to negative terms associated with covariances in the point-based Delta method equation (24) and the simplified point-based Delta method (33) which are not present in Goodman's formula equation (6) or the traditional Delta method equation (11) as they have traditionally been applied to the big BAF variance estimation problem. Inspection of equation (24) for the pointbased Delta method and (33) for the simplified point-based Delta method indicate that the last two terms in the equations will likely be negative because they contain covariances expected to be positive but which are multiplied by negative coefficients. These negative coefficients result from taking the partial derivative of a ratio with respect to the denominator in the ratio (equation (23)) as required by the Delta method. Intuitively, when the denominator in a ratio is positively correlated to a term in the numerator of the ratio, this tends to stabilize the variability in the ratio because large values in the numerator then tend to be matched by large values in the denominator. This intuition accords with negative terms in the variance formula associated with covariances between terms in the numerator and the denominator of the ratio for the big BAF estimator.
Acceptable results from confidence interval captures tends to confirm that the Delta method based on a first-order Taylor series approximation can provide good variance estimates for big BAF sampling. As indicated previously, Wolter (2007, p. 231) states that the first order approximation has frequently been found to be acceptable in practice. The point-based Delta method, the simplified point-based Delta method and the traditional Delta method tested here are based on first-order Taylor series expansions, but the traditional Delta method assumes that covariance terns are negligible.
Another technical advantage of the point-based Delta method and the simplified point-based Delta method is that these equations do not depend on the variation among trees within points. Thus there is no dependence on variance terms that implicitly assume that trees sampled within the same point are independent as equation (7) does. Actually the only independent random samples in big BAF sampling are the sample points themselves. Similarly Palley and Horwitz (1961, p. 60) noted when considering a traditional variance estimator based on Bell and Alexander (1957) for two-stage sampling in which a subset of the count basal area points are selected for volume measurements "There is some confusion here, since in point sampling the measurement we are concerned with attaches to points...rather than to trees." Technically trees sampled at the same point are correlated, although apparently this fact did not prevent accurate evaluations of variance with (7) in simulations. However use of the traditional big BAF variance estimation methods with the ratio variance estimate equation (14) instead of equation (7) also avoids the problem of estimating variance using possibly non-independent sample trees within the same points.
In comparing the point-based Delta method to the simplified point-based Delta method we recommend the use of simplified point-based Delta method (33) for applications. Inspection of figures 2 and 3 show that the simplified point-based Delta method was slightly closer to the HPS reference line that the other three variance estimation methods for sample sizes of n = 25 or greater. In the case of n = 10 simplified point-based Delta method was generally about equally distant from the HPS reference line as point-based Delta method but tended to be a slight underestimate instead of an overestimate. In practical applications the majority of big BAF sampling plans will likely have samples sizes of n > 10.
From a theoretical point of view the point-based Delta method should be preferred to Bruce's method because Bruce's method does not take into account possible correlations between the basal area estimate obtained from BAF c and the estimate of mean volume to basal area ratio. The point-based Delta method implicitly does this by accounting for the correlations between all basic basal area and volume estimates. In a very similar way Palley and Horwitz (1961, p. 60) used the Delta method to derive a "conceptually sounder" variance estimator that they recommend as an alternative to the Bell and Alexander (1957, p. 17) variance estimator for double sampling with a ratio estimate in the context of point sampling. The simplified point-based Delta method should be preferred to point-based Delta method because the estimate of total basal area using BAF c is more precise than the estimate of basal area using BAF v and therefore a better estimate of the total basal area B for use in the variance approximation formula for the simplified point-based Delta method.
A final thought concerning the efficacy of the point-based Delta method and the simplified point-based Delta method relates to the current practice of estimating covariances and correlations to assess the independence assumption for big BAF sampling. As noted in Gove et al (2020), past attempts at calculating these quantities have all been ad hoc due to the nature of differing 'sample support' between the VBAR estimates, which are tree-based, and the basal area estimates, which are point-based. The point-based Delta method and the simplified point-based Delta method solve this dilemma by determining covariances and correlations completely on a point-wise basis, yielding true estimates in each case rather than aggregating tree-wise attributes for comparison on a point-wise manner as in the traditional Delta method application.

Conclusions
New variance formulas for big BAF sampling have been derived and tested. They have been termed the point-based Delta method and the simplified point-based Delta method because they have been derived using the Delta method based on sources of variation among sample points. This approach takes the covariances among the variables in the big BAF sampling estimator into account. More traditional methods of estimating the big BAF sampling variance are based on the variances of variables comprising the big BAF estimator but do not take the covariances between these variables into account. Monte Carlo simulation experiments conducted on a northern hardwoods forest population and a white pine population indicated that the point-based Delta method and the simplified point-based Delta method performed comparably to two existing big BAF estimators. Estimates from the point-based Delta method and the simplified point-based Delta method were sometimes slightly lower than estimates from the other two methods in smaller sample sizes (numbers of sample points). This might be partially due to negative terms of modest magnitude associated covariances among variables which are not considered with the more traditional estimators. We have also shown mathematically that the bias in big BAF sampling approaches zero as sample size becomes large on the order of 1 n , behavior that is similar to the standard ratio of means estimator commonly used in survey sampling.

Appendix
On the bias in the big BAF estimator Approximate bias According to Seber (1982, p. 7) a second-order approximation to the bias in a function g of the means of random variables x i based on Talyor series is: Begining with the equations for the first partial derivatives (21), (22), and (23) we obtain the following required second partial and cross-partial derivatives. Note that the second partials of g with respect toB c andV c are zero so they are not given below. We assume without loss of generality in this section that tract area A = 1 so the results are on a per-unit area basis.
The true population bias is a function of expected values and population variances. Noting the expected values for the HPS sample means E B c = E B v = B and E V v = V the relevant second partial and cross partial derivatives evaluated at the mean vector θ = (B, B, V ) are: Substituting into (A.1) with algebraic rearrangement we obtain: The expression above is essentially the same as bias derived from Equation 11 in Palley and Horwitz (1961) for the Bell and Alexander (1957) estimate which is essentially the same as double sampling with a ratio estimator. However, in their case two different point-wise sample sizes were involved, the total sample size and a subsample size, which is not the case for the big BAF estimator. That means some of the formulas for the variances and covariances within Equation 11 of Palley and Horwitz (1961) would not be the same as for big BAF sampling because they would depend on the sub-sample size rather than the total sample size. Note thatB c ,B v andV v are means of independent identical HPS samples so that We note that this expression approaches zero as n becomes large on the order of 1 n which is similar to the behavior of the standard ratio estimator according to Cochran (1977, p. 160).

Exact bias
We can mathematically investigate the bias in the big BAF estimator by a method similar to that give by Cochran (1977, p. 162) and originally developed by Hartley and Ross (1954). In the derivations to simplify notation we assume without loss of generality area A = 1 because the final results are ratios that do not involve area. Consider the covariance between the big BAF estimatorV B and the basal area estimate from the small BAF factor angle gaugeB v Now becauseB v is known to be a design-unbiased HPS estimator of the true basal area B and by (12) we haveV BBv =V vBc the following results: Then by the definition of the covariance and the fact thatV v andB c are designunbiased HPS estimators (Palley and Horwitz, 1961) of total volume V and basal area B respectively we have E V vBc = cov V v ,B c + BV resulting in: Now we may quantify the exact bias as: Using the definition of the correlation coefficient ρ the absolute value of the bias is then For forest populations we generally expect that estimates of basal area and volume from common samples on the same populations would be positively correlated so that 0 ≤ ρ B c ,V v ≤ 1 and 0 ≤ ρ B v ,V B ≤ 1 so that the maximum value of the correlations is one. If the correlations are positive the maximum possible difference in the absolute value on right hand side of the equation above occurs when one of the terms is zero and the other is greater than zero we have: Because B and var(B c ) are constant population values, as n approaches infinity the right-hand side of the equation above goes to zero, with the result that the relative bias for the big BAF estimator approaches zero on the order of 1 √ n (using " Once again because B and var(B v ) are population constants the relative bias in the big BAF estimator approaches zero as n approaches infinity.
In the unusual case were one of the correlations between basal area and volume estimates may be negative, we can posit In this case as well the relative bias in the big BAF estimator approaches zero as n approaches infinity. Thus for all three possible cases the relative bias in the big BAF estimator approaches zero as the sample size approaches infinity on the order of 1 √ n . According to Cochran (1977, p. 160) this is similar to the behavior of the standard ratio estimator often used in sample surveys.

Abbreviations
BAF: Basal area factor BAFc: Count sample basal area factor BAFv: Volume sample basal area factor HPS: Horizontal point sampling VBAR: Volume to basal area ratio          Figure 5 The white pine Monte Carlo simulation results for confidence interval capture rates as the average over 1, 000 replications for each BAF pair and sample size with the Delta Method (dashed, ), Goodman's (dot-dashed, +), the point-based Delta method (dotted, ) and the simplified point-based Delta method (long-dash, ×). The reference line (solid, •) is the average Monte Carlo standard error for the BAFc HPS results.