A square-grid sampling support to reconcile systematicity and adaptivity in the periodic spatial survey of natural resources

doi:10.21203/rs.3.rs-1745991/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

Spatially balanced sampling is the most efficient design for surveying continuous or spatial populations across space. The spatial sampling of large-scale surveys is mostly based on grids, whose properties drive, and potentially limit, the possibilities of building flexible samples. Reconciling spatial balance and flexibility remains difficult. In particular, periodicity strongly constrains the sampling, especially when an increase in the frequency of information delivery is sought. Sampling stratification or adaptive sampling intensity also conflicts with the grid-based approach.

We show that square grids have geometric homothetic properties that make it possible to meet these needs by supporting nested hierarchical subgrid sets. These properties can be exploited to cope with both spatial flexibility in the sampling effort and spatio-temporal coordination of samples. Whereas some surveys across the world seemingly exploit these properties in practice, no formal development has been made available in the survey sampling literature across fields of application.

Here we therefore define and demonstrate these properties, and show how they can be used to produce nested hierarchical grids compatible with multiple periodicity values of interest to natural monitoring, and with adapting sampling intensity across space and time. We also provide an original extension of this framework, intended to tune the sampling effort gradually while preserving spatial systematicity. We use the French National Forest Inventory survey to illustrate these properties and their use in a large-scale repeated inventory. We show the flexibility and diversity of sampling schemes that can be initiated with square grids and the limits of their use.

systematic sampling

spatial grid

periodicity

flexibility

natural populations

resource survey

Natural living populations are often of unknown size and location (Stevens and Olsen 2004), and can evolve dynamically across both space and time. These features make surveying them challenging and preclude the use of any direct sampling design, for instance one based on predefined sampling rates, since neither the actual size of the population (its sampling frame) nor its location is known. Thus, most surveys of natural populations use an indirect spatial sampling design whereby sampling points are drawn from within the area of the territory studied, and the variables of interest in the population are measured at these points. For populations showing spatial extent, e.g., crops or forests, these variables are measured in zones of predefined area surrounding the points (Williams 2001, Gregoire and Valentine 2008, Fattorini et al. 2009, Mandallaz 2013), which form the support of the sampling units. The spatial distribution of these sampling units has a major impact on the survey.

The spatial distribution of the sampling units has long received a lot of attention (e.g., Mahalanobis 1944 on field crop surveys). Roughly two main strategies can be implemented: either a spatially uniform random sampling, or a sampling framed by a spatial tessellation of regular polygons within which sampling units can be taken (see e.g. the review of US spatial surveys in Olsen et al. 1999). In this second situation, units can be selected either uniformly from the tessellation (located at reference positions such as polygon center points or tessellation nodes), leading to a spatially systematic sampling, or at random within the polygons, constituting a tessellated stratified sampling. It has long been acknowledged that spreading the observations more evenly over the territory can translate into spreading them more evenly over the target population (Mahalanobis 1944), hence yielding better estimates than random sampling (Madow and Madow 1944, Cochran 1959, Ripley 1991). In particular, regular tessellations or a regular spreading of the units avoid spatial voids (Christianson and Kaufman 2016).

The spatial regularity of the sampling unit distribution translates into an evenness of distances among these units and, equally, of the partition of the territory. It was described by Stevens and Olsen (2004) as the base principle of what they term spatially balanced sampling. Several studies have concluded that a regular spacing is optimal for a variety of populations (Quenouille 1949, Brown et al. 2015, Dunn and Harrison 1993, Christianson and Kaufman 2016). One such balanced design is spatially systematic random sampling (henceforth, systematic sampling), which implies using a grid-shaped frame that covers the entire studied territory when the population surveyed is not localized, and which has been widely adopted for large-scale surveys (Martino and Fritz 2008, Strand 2017).

Grid sampling enables sampling two populations simultaneously: the cells when the grid is seen as a space tessellation, or the nodes when the grid is seen as a collection of point sampling units. The use of one or the other sampling frame depends on the nature of the target population: either one-dimensional (e.g., point sampling of animal populations) or two-dimensional (typically, land cover or forests). More generally, systematic regular grids can support more complex sampling, for instance with random assignments of cell partitioning, quadrant recursive partitioning (Stevens and Olsen 2004), or Latin squares (Borkowski 1999). Hence, systematic grids constitute a versatile support for many possible types of spatial sampling that stretch well beyond simple systematic sampling. Very advanced sampling designs have been tailored to be both flexible and spatially balanced, such as the local pivotal method (Deville and Tillé 1998, Grafström et al. 2012). Yet these designs can hardly accommodate repeated samples, and periodicity remains a major constraint for large-scale surveys. Here too, grid-based sampling is highly relevant, since it responds to the need to coordinate samples in both space and time (Van Deusen 2002), a fact that largely justifies its common use in large-scale surveys (Christianson and Kaufman 2016).

Sampling from grids however comes with known drawbacks, of which rigidity in the sampling is not the least. According to Stehman (2009), the “ability to accommodate a change in sample size at any step in the implementation of the design” is one of the “desirable sampling design criteria”. The rigidity inherent to the grids comes from the finite and fixed position of all grid nodes, which are all defined once a reference grid node and one direction of the grid are drawn. For instance, unequal inclusion probability becomes difficult to achieve using systematic sampling designs (Grafström et al. 2012). While this rigidity is exactly the expected behavior that ensures a regular spacing across the territory, it strongly reduces the possibilities to tune and adapt the sampling intensity across space and time. For instance, once the size of the grid is defined, increasing the number of units can be done by creating a new grid, or finer levels of a given grid. But, for repeated surveys, this adaptation directly conflicts with the necessity to keep samples balanced both in space and time. Indeed, the distribution of the samples should not only address spatial heterogeneity in the population, but also its temporal variations, when changes to the localization of the population occur. This is particularly true for forests, which have been displaying major trends in their area and growing stock for decades to centuries (Audinot et al. 2020, Bontemps et al. 2020).

In view of global environmental issues, monitoring requirements tend to prioritize a high temporal frequency in a variety of surveys of natural resources and the environment (Olsen et al. 1999 in the USA). The renewed interest in forest monitoring to support multifunctionality in Europe (UN EN 2021, Bontemps et al. 2022) not only stresses the interest of periodic monitoring, as implemented by most National Forest Inventory (NFI) surveys (Tomppo et al. 2010), but also suggests increasing its temporal frequency, paving the way for monitoring systems operating at different time periods (van Deusen 2002). Thus, the trivial approach of measuring all sampling units from a grid sampling frame once over each monitoring period (still largely implemented in surveys such as NFIs) may be abandoned in favor of a constant and systematic subsampling effort over each year of a period. This approach facilitates the allocation of field work across time, as encountered in some national surveys of forests (Belgium, Poland, Finland; Vidal et al. 2016). It is also intended to support the ambition of a more frequent inference (van Deusen 2000 in the US forest monitoring context). In this context, the central issue is how to spatially distribute the sampling units at each measurement occasion within a monitoring period, so that they also form systematic spatial samples of smaller size that are usable for inference purposes, while granting some spatial flexibility in the sampling effort.

In this paper, we develop the formal geometric properties of a square grid that allow regularity, periodicity and flexibility to be addressed simultaneously when the grid is used as a frame for constructing spatial samples. Some surveys may have partly implemented such an approach (e.g., the Polish NFI survey; Michalak and Zajaczkowski 2010), but to the best of our knowledge a formalized description is still lacking in the literature. Beyond the heuristic developments encountered in existing spatial surveys, this theoretical approach to adaptive spatial and temporal sampling also intends to frame the set of possible options for a wider use in surveys envisioning both high-frequency and periodic reporting. Last, we develop an original application of grid-based sampling, whereby nested grids are shown to allow gradual tuning of the sampling effort while ensuring spatial quasi-systematicity, properties that recently emerged as keys to addressing the future needs of forest ecosystem monitoring (Bontemps et al. 2022). The practical implementation of these foundations is illustrated with a large-scale repeated survey also based on these concepts, namely the French National Forest Inventory.

Since we intend to focus on the sampling frame, we do not address the notions of multiple-degree sampling and of random local selection of sampling units within the neighborhood of a grid node (e.g., Ireland, Tomppo et al. 2010). For the sake of simplicity, we shall thus assume that grid nodes form the sampling units associated with this sampling frame.

2.1 Desired properties and constraints

Grid-based approaches are particularly useful for spatial populations that are not localized and that are irregularly spread over the territory studied (Stevens and Olsen 1993). The grid projected on the territory has to be larger than the territory itself.

The properties desired for these grids are: (i) the ability to cover the territory regularly, regularity being understood as balance in the spatial distribution (Stevens and Olsen 2004) obtained by the regularity of sample locations; (ii) for repeated surveys, the ability to be divided into equal temporal fractions serving as annual sampling lists (fixed populations of units from which samples can be selected) that also cover the territory regularly (de Vries 1986). The number of fractions is driven by the periodicity of the survey (for instance 10 years, or 5 years, as is standard routine for international forestry reporting, UN 2020, Forest Europe 2020); such fractions are sometimes referred to as interpenetrating panels, or panels for short (Mahalanobis 1944, van Deusen 2000, 2001, 2002, Roesch 2007); (iii) the ability to tune the sampling intensity within each fraction, at controlled locations that ensure spatial balance.

Initially proposed by P.C. Mahalanobis (Greenwood 1946) for survey control and assessment of the overall estimation variance, interpenetrating sub-grid panels can also be used as a support for drawing successive samples, taken from the same initial grid, and can hence be termed fractions. Because they are spatially nested and regular, interpenetrating grid fractions were referred to as hierarchically nested samples by Stevens (1997). These grid fractions are complementary in that, when the panels are merged, no node of the initial grid is left out or duplicated.

In his approach to crop surveys in Bengal, Mahalanobis (op. cit.) fractionated a grid into 2 sub-grids. While dividing a grid into two or four such systematic sub-grids is rather obvious, dividing it into ten or five interpenetrating and complementary portions is far less obvious, despite the immediate interest that a five- or ten-year periodicity has for international forest reporting. Examples of large-scale forest surveys implementing panel fractionation that do not meet these sampling periodicity constraints are the National Forest Inventories of Romania (4-year periodicity, Bouriaud et al. 2020) or Switzerland (9-year periodicity, Fischer and Traub 2019).

2.2 Systematicity and periodicity: the geometric foundations

A square grid consists of regularly spaced dots, the nodes of the grid, which can be defined by their coordinates along two orthogonal axes in the primary directions of the grid. In order to support systematicity (regularity in space) and periodicity (regularity in time) in sub-sampling, the complete initial grid, referred to as the base grid, needs to be split into a predefined number of complementary subsets of nodes also forming square grids, one for each temporal occurrence of sampling.

Suppose the base square grid has a unit cell size (in whatever unit, for instance 1 km), and let n be the sampling period and, consequently, the number of square subgrids that form an even partition of all the nodes of the base grid. Each subgrid then holds a number of nodes proportional to n^-1, since the total number of nodes is divided by n. In the resulting subgrids, the distance between closest nodes is increased √n-fold (and equals √n for a unit cell size).

In order to fractionate the base grid into n fractions such that all subgrids are also square (a necessary condition to maintain spatial balance), one needs to find triplets (n, a, b), where n is the number of fractions, and a and b are non-negative integers (a = 0 or b = 0 being allowed) representing the projections on the grid's axes of the distance √n, expressed in units of the reference grid (Fig. 1a), such that:

n = a² + b² (Eq. 1)

Any triplet of integers (n, a, b) that verifies this equation also verifies the sum of two squares theorem (an extension of Fermat’s theorem, see Supplementary Information 1). The triplet defines the number of subgrids (n) and the distance among the nodes of a subgrid along both axes (a and b, Figure 1).
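
As an illustration of how a pair (a, b) can define the panels in practice, the sketch below labels each base-grid node with a panel number. The function names `panel_index` and `min_sq_spacing` and the labelling rule (a·x + b·y) mod n are assumptions made for this example, not the numbering used by any survey; the rule is only valid when a and b have no common factor.

```python
from collections import Counter
from math import gcd

def panel_index(x, y, a, b):
    """Hypothetical labelling: panel number in 0..n-1 of base-grid node (x, y),
    with n = a*a + b*b. Assumes gcd(a, b) == 1."""
    if gcd(a, b) != 1:
        raise ValueError("this simple labelling assumes gcd(a, b) == 1")
    n = a * a + b * b
    return (a * x + b * y) % n

def min_sq_spacing(a, b, reach=4):
    """Smallest squared distance between two distinct nodes of panel 0,
    searched among offsets of at most `reach` cells along each axis."""
    return min(
        dx * dx + dy * dy
        for dx in range(-reach, reach + 1)
        for dy in range(-reach, reach + 1)
        if (dx, dy) != (0, 0) and panel_index(dx, dy, a, b) == 0
    )

a, b = 1, 2                      # n = 5 panels
# Count the nodes of each panel over a 10 x 10 window of the base grid
counts = Counter(panel_index(x, y, a, b) for x in range(10) for y in range(10))
print(sorted(counts.items()))    # every panel holds the same number of nodes
print(min_sq_spacing(a, b))      # squared spacing within a panel equals n
```

Each label collects the nodes of one translated copy of the square subgrid generated by the vectors (a, b) and (-b, a), so the panels are equal in size and together cover every node of the base grid exactly once.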

The first 10 positive integers n that satisfy Eq. 1 are 1, 2, 4, 5, 8, 9, 10, 13, 16, 17. It therefore follows that, despite not being obvious, a square grid can be fractionated into n = 5 (5 = 1² + 2²) and n = 10 (10 = 1² + 3²) interpenetrating square subgrids of equal size, a fact of immediate interest as it matches the common World (UN) / European (MCPFE) forest reporting periodicity. Figure 1b shows examples of base grids constructed for a periodicity of 2, 5 or 10 years, and the resulting fractionation into homothetic square subgrids.
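
The list of admissible periodicities can be checked with a few lines of code. The following is a minimal sketch (the function name `sums_of_two_squares` is introduced here for illustration) that enumerates the positive integers n for which Eq. 1 has a solution in non-negative integers.

```python
from math import isqrt

def sums_of_two_squares(count):
    """Return the first `count` positive integers n = a*a + b*b with a, b >= 0."""
    found = []
    n = 0
    while len(found) < count:
        n += 1
        # Try every a with a*a <= n and test whether n - a*a is a perfect square
        for a in range(isqrt(n) + 1):
            remainder = n - a * a
            b = isqrt(remainder)
            if b * b == remainder:
                found.append(n)
                break
    return found

print(sums_of_two_squares(10))  # [1, 2, 4, 5, 8, 9, 10, 13, 16, 17]
```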

2.3 A first step toward flexibility: varying the sampling intensity on a discrete power-of-2 scale

The previous principles can be iterated. Varying the sampling intensity from either the base grid or any annual fraction (panel), while preserving the regularity of the sample distribution in space and time, amounts to subsetting the grid nodes in such a way that the resulting cells remain square. Let m, c and d be positive integer values that verify Eq. 1, such that m = c² + d². Then n m also forms a sum of squares:

n m = (a² + b²) (c² + d²) = a²c²+ a²d²+ b²c²+ b²d² + 2 abcd – 2abcd = (ac+bd)² + (ad-bc)² (Eq. 2)

Thus, it is possible to form square interpenetrating hierarchical grids from a base (complete) grid as long as Eq. 2 is verified. For instance, according to Eq. 1, prescribing m = 2 and c = d = 1 yields the smallest sampling effort adjustment that preserves the properties of the square grid, resulting in a subgrid with an elementary cell size of √2 · √n (smallest diagonal square cell of the base grid in Fig. 1), which corresponds to a 1/2 fraction of the initial cell population.
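
Eq. 2 can be verified numerically. The sketch below (the helper name `compose` is an assumption for this example) combines a pair (a, b) with a pair (c, d) and checks that the result is a solution of Eq. 1 for the product n m.

```python
def compose(ab, cd):
    """Combine (a, b), with n = a*a + b*b, and (c, d), with m = c*c + d*d,
    into (e, f) such that n*m = e*e + f*f, following Eq. 2."""
    a, b = ab
    c, d = cd
    return (a * c + b * d, a * d - b * c)

# n = 5 (a = 1, b = 2) combined with the halving step m = 2 (c = d = 1)
e, f = compose((1, 2), (1, 1))
print(e, f, e * e + f * f)  # 3 -1 10
```

With these values, the 5-panel solution combined with one division by 2 gives the pair (3, -1), one of the solutions for n = 10.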

These interpenetrating divisions (also known as levels) define a sequential increase of cell size, which translates into a sequential reduction of the number of grid nodes (or equivalently, of cells) by a factor m. They represent levels of spatial sampling intensity. If n = 1, the condition m = 2 simply results in dividing the grid density by two (Fig. 2b to 2d). Notably, 2 is the smallest value of m that results in a square grid. In addition, the division can be recursive: each subgrid can be divided again by 2, creating sequential levels of subsetting associated with a power-of-2 scale for the sampling effort. This sequence of recursive divisions by 2 is reflected in the list of integers n that verify Eq. 1, which contains the successive powers of 2 (1, 2, 4, 8, 16).
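
The recursive division by 2 can be sketched as a membership test. The function `in_level` below is a hypothetical construction, assuming a unit base grid (n = 1) and levels all anchored at the origin: each halving keeps the nodes whose coordinate sum is even, then re-expresses them in the axes of the new, rotated subgrid.

```python
def in_level(x, y, k):
    """Return True if base-grid node (x, y) belongs to the subgrid obtained
    after k successive divisions by 2 (level 0 is the base grid itself)."""
    for _ in range(k):
        if (x + y) % 2 != 0:
            return False
        # Coordinates of the node in the basis (1, 1), (-1, 1) of the halved grid
        x, y = (x + y) // 2, (y - x) // 2
    return True

# Number of nodes retained at each level within a 16 x 16 window
for k in range(5):
    kept = sum(in_level(x, y, k) for x in range(16) for y in range(16))
    print(k, kept)  # 256, 128, 64, 32, 16
```

Under these assumptions the levels are nested by construction: a node that belongs to level k + 1 also belongs to level k.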

The square grid fractionation being recursive, the first fractionation can be used to produce panels, in a number that addresses needs for periodicity, while the successive nested levels can be used to produce nested systematic grid node subsamples, for instance in order to reduce the sampling intensity locally or within a given stratum. Since these reductions in the grid node subsamples proceed by successive factors of m, more continuous modulations of the sampling intensity may be desired. Original alternatives based on the interpenetrating grid approach can be considered, and are described in Section 3.3.

3.1 The sampling grid design of the French NFI

To illustrate the use of interpenetrating grids with concrete examples, the French NFI survey is presented here. It created its base (1 x 1 km) grid in 2004, as it changed its sampling design from a 10-year periodic to an annual sampling that was first presented in Vidal et al. (2005). It borrows from the design of the FIA forest inventory survey (Roesch and Reams 1999).

As the initial choice for the NFI was to meet a periodicity of 10 years, all the nodes of the grid being sampled within 10 years, 10 interpenetrating grids were devised. There are 8 solutions (a, b) to Eq. 1 that meet this objective of fractionation into 10 panels: a = ±3 and b = ±1, and conversely, a = ±1 and b = ±3 (Figure 3). The divider used to reduce the grid density (i.e., the levelling factor) was set to m = 2: the number of cells is divided by two from one level to the next, while the cell area doubles.
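
The count of 8 solutions can be reproduced by brute force. This is a minimal sketch; the function name `signed_solutions` is introduced here for illustration only.

```python
from math import isqrt

def signed_solutions(n):
    """List all integer pairs (a, b), signs included, with a*a + b*b == n."""
    r = isqrt(n)
    return [
        (a, b)
        for a in range(-r, r + 1)
        for b in range(-r, r + 1)
        if a * a + b * b == n
    ]

solutions = signed_solutions(10)
print(len(solutions))  # 8
print(solutions)
```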

Of note, each sequence of 5 successive subgrids (1-5 and 6-10) ensures a regular systematic spatial cover (Figure 3), stressing that this grid also supports the 5-yr periodicity. It thus offers options for forest inference at annual, 5-year or 10-year periods, and brings flexibility in the reporting periodicity. Another interesting property of the square grid emerges when the fractionation is made for 5 panels: as a result of the balanced nesting, any node belonging to a given panel is surrounded by those of the four other panels (Figure 2b). In that situation, the time-periodicity is geographically translated into systematic patterns of five nodes as the central node of that repeated pattern also happens to be the mid-period node.

At the time when the sampling design was conceived, the sampling units obtained from the grid were meant to be measured only once. But in 2010, the choice was made to remeasure all sampling units after 5 years for an accurate estimation of wood fluxes (e.g., mortality, harvests, recruitment). The initial grid was consequently split into two translated subgrids, each having a periodicity of 5 years. The subsetting levels of the adjacent units from panels t and t+5 were consequently chosen to be identical (Hervé et al. 2016), ensuring that, at any given year and subsetting level, the units of panels t and t+5 would be adjacent, hence reducing travel time for field crews.

3.2 Using the grid approach for implementing spatial changes in the sampling intensity

3.2.1 Optimizing phase sampling

Double sampling for stratification is a popular and efficient sampling and estimation method, which implies using nested samples of different sizes (Rao 1973, Cochran 1977). Here, interpenetrating grids bring a decisive advantage whereby grid levels offer a direct support to the construction of nested balanced samples.

The French NFI provides an example of such a two-phase sampling design based on grid levels, whereby the first phase is a large sample of photo-interpretation plots, while the second is a nested subsample of plots used to produce field measurements (de Vries 1986, Mandallaz 2007). The nested levels of the square grid offer a straightforward means of subsampling the units of the first phase: typically, the first phase is based on the finest grid level (with cells of 10 km² in the case of the French NFI) while the second phase is based on higher levels with cells of 20 km² or 40 km², depending on the subsampling intensity desired.

The same principles can be used to accommodate further sampling phases (e.g., three phases in the Italian NFI, Tabacchi et al. 2005).

3.2.2 Stratified sampling with unequal probability

Grid levelling can also support a stratification-based variation of the sampling intensity within a given geographic domain, for samples drawn with unequal probability. This feature is used in the French NFI where the inclusion probability of field plots (second phase plots) varies according to the strategic significance of the vegetation type observed during the first phase photo-interpretation. Forest plots (forests being a category defined according to UN/FAO’s definitions) are hence subsampled with a probability of ½ (i.e., m=2) while other wooded lands (of lower importance) are sampled with a probability of ¼ (m=4), resulting in unequal probability sampling that preserves the spatial balance and all geometric properties of the samples.
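
The stratum-dependent subsampling can be sketched as follows. This is an illustrative example only: the function name, the stratum labels and the choice of which nodes are retained are assumptions, with coordinates expressed in units of the first-phase grid, and the rule is not the actual selection procedure of the French NFI.

```python
def kept_in_second_phase(x, y, stratum):
    """Return True if first-phase node (x, y) is retained for field measurement.

    Assumed rule: forest keeps every second node (m = 2), other wooded land
    keeps every fourth node (m = 4), both taken from nested square subgrids.
    """
    if stratum == "forest":
        return (x + y) % 2 == 0            # diagonal subgrid, rate 1/2
    if stratum == "other_wooded_land":
        return x % 2 == 0 and y % 2 == 0   # coarser square subgrid, rate 1/4
    raise ValueError(f"unknown stratum: {stratum}")

window = [(x, y) for x in range(8) for y in range(8)]
forest_rate = sum(kept_in_second_phase(x, y, "forest") for x, y in window) / len(window)
owl_rate = sum(kept_in_second_phase(x, y, "other_wooded_land") for x, y in window) / len(window)
print(forest_rate, owl_rate)  # 0.5 0.25
```

Because the coarser subgrid is nested in the finer one, any node retained for other wooded land would also have been retained had it been classified as forest, which keeps the two strata on a common systematic frame.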

Alongside this stratification, a geographic stratification is performed whereby lower levels are used across two geographic subdomains of the territory (Landes and Mediterranean region) which are known to have a smaller spatial variability and a lesser importance. The local reduction of the sampling intensity is a common practice in NFI surveys (Tomppo et al. 2010, Vidal et al. 2016, Bouriaud et al. 2020). The grid-based approach however ensures, beyond the spatial balance, the possibility to obtain a spatially systematic coverage of the entire territory at the lowest common density of several zones.

3.2.3 Disturbance-driven variations in the sampling intensity

Unanticipated deviations from a desired sampling may be dictated by the need to address a local survey of a forest windthrow or sanitary event (IFN 2009). Locally increasing the spatial sampling intensity while preserving the link with the initial sampling design, and some spatial balance, is possible by using the base grid at its original density. In this situation, levels can be used to construct the nested samples at varying sampling intensity. Of note, this forms an alternative to explicit disturbance-based designs (van Deusen 2000) which still has not received the attention it deserves.

In all surveys, the maximum possible sampling effort is however limited in practice by resources rather than by statistical constraints. Increasing the intensity of the sampling in a given zone therefore needs to be compensated for by a decrease in other parts of the territory. The use of levels brings practical solutions that ensure the spatial and temporal coherence of the samples. The known location and number of potential units per grid panel and level enable efficient planning and allocation of resources.

3.3 Temporal variations in the sampling intensity: an original development for trading systematicity with adaptability

In annual surveys, unavoidable events (e.g., the COVID-19 crisis of 2020 causing interruptions of the work) or unexpected budget-driven variations of the sampling effort across time may require changes in the sampling effort, challenging the initial sampling design. In the French NFI survey for instance, incremental reductions of the sampling effort have been implemented since 2012, with the second phase sampling effort being gradually decreased by 20% as compared to the initial one. These incremental reductions cannot be obtained from grid-levelling, since switching from one level to the next halves the number of nodes.

Nevertheless, and this is an original use of the present design, grid nesting also enables the design of a spatially balanced sample with a controlled and more continuous reduction of the sampling effort, by using high-level grids (low spatial density) as a means to exclude sampling units from lower levels (higher spatial density). The principle is to exclude units belonging to both a low level (e.g., 40 km²) and a high level of the grid (e.g., 320 km²). For instance, in 2020 the French NFI reduced its sampling effort for forests by 14%, which was obtained by subtracting the nodes also belonging to grid levels 5 and 8 from the level 2 grid, yielding a reduction of the sampling effort by: 1/2^(5-2) + 1/2^(8-2) = 9/64 ≈ 0.14. This is illustrated in Figure 4.
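
The arithmetic of such a reduction can be written as a small helper. The function name `removed_share` is an assumption for this sketch, which also assumes, as the sum in the text implies, that the subsets removed at the different levels do not overlap.

```python
from fractions import Fraction

def removed_share(base_level, removed_levels):
    """Share of the nodes of `base_level` that is removed when the nodes
    belonging to each level in `removed_levels` are excluded
    (assumes the excluded subsets are disjoint)."""
    return sum(Fraction(1, 2 ** (k - base_level)) for k in removed_levels)

share = removed_share(2, [5, 8])
print(share, float(share))  # 9/64 0.140625
```

Other combinations of levels give other intermediate rates, for example a single exclusion of level 5 from level 2 removes 1/8 of the nodes.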

While reducing the sampling intensity, the intersection of different levels of a grid leads to a spatially balanced pattern of unit locations, since the location of the units not sampled is systematic. However, when multiple panels with different reduction rates are merged, the systematicity is altered, and the regularity is reduced to a varying degree (Figure 4).

To test the effect of a sampling effort reduction on the spatial balance, a reduction of the sampling intensity was simulated over a virtual portion of grid of 150 x 150 nodes (see details in Supplementary Information 2). To this end, the average distance to the four nearest neighboring points was computed for the inner 100 x 100 cells only, to avoid border effects.
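
The distance statistic can be illustrated on a much smaller grid. The sketch below is not the authors' simulation code (which is described in Supplementary Information 2); the function names, the 12 x 12 window and the brute-force neighbour search are assumptions chosen to keep the example short.

```python
from math import dist

def mean_knn_distance(points, targets, k=4):
    """Average, over `targets`, of the mean distance to the k nearest other points."""
    total = 0.0
    for p in targets:
        distances = sorted(dist(p, q) for q in points if q != p)
        total += sum(distances[:k]) / k
    return total / len(targets)

def inner(points, size, margin):
    """Keep only the points at least `margin` cells away from the window border."""
    return [(x, y) for (x, y) in points
            if margin <= x < size - margin and margin <= y < size - margin]

size, margin = 12, 3
full = [(x, y) for x in range(size) for y in range(size)]
half = [(x, y) for (x, y) in full if (x + y) % 2 == 0]   # one node out of two removed

print(mean_knn_distance(full, inner(full, size, margin)))  # 1.0 on the full grid
print(mean_knn_distance(half, inner(half, size, margin)))  # about 1.414, i.e. sqrt(2)
```

The two values bracket the range discussed in the text: the original spacing when no node is removed, and a spacing √2 times larger once one node out of two has been removed.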

Using higher levels to suppress nodes (Figure 5a) generated only a limited number of possible distances to the nearest remaining nodes (as also suggested in Fig. 4 lower right, and Supp. Fig. 1), with a distribution of distances across the simulation area that varies slowly according to the level of reduction. As the number of nodes is reduced, some neighboring nodes remain at the original distance of the panel grid, but increasingly fewer as nodes are removed, down to the point where no node remains at that original distance (Supp. Fig. 1). Once this limit is reached, which corresponds to removing one node out of two, all nodes are at a distance √2 times larger than on the original panel grid, hence the upper limit at √2 visible in Figure 5a.

The standard deviation of the mean distance between neighboring points revealed a maximum spatial heterogeneity at a sampling reduction intensity of 31%, reaching around +10% (Figure 5b). This level of reduction is critical, as one cannot easily expect to produce it with alternative options (sub-gridding in 4 panels would lead to a reduction by 20%, and sub-sampling on the next homothetic level would yield a 50% reduction). Also, this standard deviation was found to locally saturate along this gradient (e.g., at 7-12% or 20-25% reduction intensity), suggesting that small reductions (in the range of those currently implemented for the yearly adjustments of the sampling effort) would not cause substantial distortions to the spatial balance of the samples. For greater reduction rates of the sampling intensity (25-45%), greater values of the standard deviation of the distance among points suggested that the variability in distances among points was critical and did not support the hypothesis of a spatial balance. Between 40% and 50%, the evolution is less regular (less smooth). This is due to the limited number of sub-grids with high levels that can be removed. A simulation based on a bigger initial grid would lead to a more detailed and precise evolution without modifying the illustrated trends. In such configurations, resizing the initial base grid may be a logical option.

Grids represent a practical and efficient support for the spatial sampling of spatial populations, and are amply used in the design of natural resource surveys or ecological studies (Birch et al. 2007, Strand 2017). Since they make it possible to cover the entire territory studied with a controlled and regular dispersion of the sampling units (Brown et al. 2015), they are particularly useful in the situation of non-localized populations (Cochran 1977, Christianson and Kaufman 2016), of continuous populations (Stevens and Olsen 2004), or to monitor the dynamics of the population when the sampling is repeated in time. But since the grid nodes or cells constitute the frame from which the samples can be drawn, the grid also restrains the temporal and spatial adaptability of the sampling. As shown here, square sampling grids with homothetic properties make it possible to construct a sampling support compatible with adaptability and a range of periodicities. Many spatial surveys of natural resources, while ensuring a constant sampling effort over time, are also periodic (Olsen et al. 1999, Vidal et al. 2016). The need to report at a higher frequency has also been put forward in recent debates (e.g., Bontemps et al. 2022) and thereby underlines the need to ensure systematicity in the sampling at different temporal resolutions. Besides, a certain degree of adaptability of the sampling effort across space and time is highly desirable for repeated or continuous surveys, for the optimization of their sampling or for practical reasons related to the survey environment (de Gruijter et al. 2006). The literature in the field shows that, while different surveys have implemented systematic grid designs able to cope with these aspects, a theoretical foundation was not available.

1) On the role of nested grids for multiple-phase and/or stratified sampling

Adaptability is widely seen as a very desirable and recommended feature of a sampling design (de Gruijter et al. 2006, pp 35–36). For instance, many ecological processes are scale-specific, and local variations in the spatial sampling intensity may be desirable. The use of a systematic sampling grid, since it defines at once the location and the number of all the units, could therefore be seen at first as a strong limitation. However, as shown here, hierarchical nested grids are able to change the resolution of a survey to adapt the sampling intensity to special needs (White et al. 1992, Birch et al. 2007). They are also a support for organizing successive hierarchical samplings in multi-phase sampling designs, as shown in the French NFI case. We surmise that these grids will receive more attention in the future, with the increased diversification of the target populations and observational variables, echoing increased preoccupations at EU level (Ferretti 2021, Bontemps et al. 2022). In particular, the need for biodiversity monitoring in forests simultaneously raises two challenges: i) repeated but coordinated samples need to be carried out to observe population dynamics, and ii) not all observations can be made with the same spatial intensity, some being too expensive to be carried out on a dense network of field plots.

The levels and their numerous possible combinations offer a means to fine-tune the subsampling intensity while preserving a high level of spatial regularity. Square grids thus constitute an attractive solution to recent monitoring needs, whereby different variables have to be surveyed at inherently different spatial or temporal intensities.
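To make the range of available levels concrete, the following minimal sketch lists the reduction factors of the form n = a² + b² (Eq. 1) up to a chosen limit, and combines two levels following the identity of Eq. 2. The function names `reduction_factors` and `compose` are hypothetical and introduced here for illustration only; they are not taken from any survey software.

```python
from math import isqrt


def reduction_factors(limit):
    """Sorted values n = a^2 + b^2 (Eq. 1) with 1 <= n <= limit."""
    r = isqrt(limit)
    return sorted({a * a + b * b
                   for a in range(r + 1)
                   for b in range(a + 1)
                   if 1 <= a * a + b * b <= limit})


def compose(a, b, c, d):
    """Combine two levels as in Eq. 2: returns the pair (ac + bd, ad - bc)."""
    return a * c + b * d, a * d - b * c


print(reduction_factors(10))  # [1, 2, 4, 5, 8, 9, 10]

# Combining a factor-5 level (a, b) = (2, 1) with a factor-2 level (c, d) = (1, 1)
p, q = compose(2, 1, 1, 1)
print(p, q, p * p + q * q)  # 3 1 10
```

Under these assumptions, a factor-10 level can be read as a factor-2 split applied within a factor-5 level, which is one way of interpreting how levels combine.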

2) On the role of nested grids for coping with survey periodicity

Interpenetrating grid subsamples were suggested as a sampling support for large-scale surveys as far back as the 1940s by P.C. Mahalanobis (Greenwood 1946, Roy and Singh 1973), with the initial intention of establishing control samples (2-subgrid split). They were subsequently used in the Forest Inventory and Analysis (FIA) Program of the U.S. Forest Service as a means to produce repeated annual estimates over the same territory (Van Deusen 2001), in the original context of supporting continuous monitoring (Van Deusen 2002). The need to combine the forest inventory with the Forest Health Monitoring (FHM) program prompted the use of FHM’s triangular grid to produce multiple samples of decreasing intensity from nested panels, and to support annual estimations based on interpenetrating samples following White et al. (1992). Periodicity and spatial balance were at the heart of the development of the FHM grid and of the subsequent sampling strategy. However, these grids are not seen as adaptive, and produce fixed-size samples for each panel. Consequently, the sampling methods based on these grids restrained the possible periodicities. We have shown that this difficulty can be circumvented, particularly in the case of the square grid, because it offers a larger number of possible period values and naturally supports key values such as 5 and 10 years.
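As an illustration of interpenetrating panels on a square grid, the sketch below assigns each node (x, y) to one of n = a² + b² panels with the rule (a·x + b·y) mod n. This labelling rule is an assumption made for the example (it requires a and b to be coprime) and is not claimed to be the procedure used by any particular inventory; `panel_labels` is a hypothetical name.

```python
from collections import Counter
from math import gcd


def panel_labels(a, b, width, height):
    """Label each node (x, y) with a panel in 0..n-1, where n = a*a + b*b.

    Assumption: gcd(a, b) == 1, so that (a*x + b*y) mod n separates the
    grid into n interpenetrating square subgrids of equal density.
    """
    if gcd(a, b) != 1:
        raise ValueError("this labelling rule needs coprime a and b")
    n = a * a + b * b
    return {(x, y): (a * x + b * y) % n
            for x in range(width) for y in range(height)}


# Five annual panels (n = 2^2 + 1^2) on a 10 x 10 block of nodes.
labels = panel_labels(2, 1, 10, 10)
sizes = Counter(labels.values())
print(sorted(sizes.items()))  # every panel holds the same number of nodes

# Nodes of panel 0: they repeat with the offsets (2, 1) and (-1, 2),
# i.e. they form a tilted square subgrid with spacing sqrt(5).
panel0 = sorted(node for node, panel in labels.items() if panel == 0)
print(panel0[:5])
```

Visiting panel 0 in year 1, panel 1 in year 2, and so on would, under these assumptions, cover the whole grid over a 5-year cycle while each annual sample remains a regular square subgrid.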

The ability to accommodate the apparently conflicting needs of systematicity and adaptivity, even over a 5-year period, which is key in monitoring, is not unique to the square grid. Triangular grids offer similar properties, but with fewer possible periods. In forest inventory, a very common, though not exclusive, prior choice in the design is a regular tessellation (grid) of the space (White et al. 1992). Some variations are encountered at that step, with designs that are not based on regular polygons and use rectangles (e.g., some parts of the Chinese territory), non-equilateral triangles (e.g., Chile) or rhombuses (e.g., Romanian NFI, Bouriaud et al. 2020). With regular tessellations, three options are possible (square, regular triangle and hexagon), an example being the hexagonal tessellation implemented in the FIA forest inventory of the USA (Reams et al. 2005). Hexagonal grids are also currently in use (Birch et al. 2007) and are able to support hierarchical nested levels. Their inherent periodicity is, however, much more restrained than that of square grids.

3) Adapting the sampling effort through systematic sampling alleviation: an original contribution

Levels, defined as successive nested interpenetrating subsamples, are key to adapting the sampling effort. But the reduction in sample size from one level to the next can be too large, since it is a fixed property of the grid set at its creation. However, using higher levels to systematically subtract sampling units from the nodes or cells of lower levels, as shown here, is a dependable and original solution that can accommodate a very large range of intensity variations without a strong loss of spatial regularity (Fig. 5). Our results showed that it preserves a high level of spatial balance. This strategy brings an operational solution to the question of adaptability, which may not be as relevant to short-term surveys as it is to monitoring (de Gruijter et al. 2006). Since anticipating changes to the design is not always possible, a sampling frame that offers prior flexibility is all the more valuable.
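The idea of systematic alleviation can be sketched as follows, under the assumption that the nodes of a higher level are identified by a panel label (a·x + b·y) mod n with n = a² + b² and coprime a and b. Removing k of the n panels then leaves a fraction (n − k)/n of the nodes. The function name `thinned_nodes` and the labelling rule are hypothetical choices for this example, not the implementation of the survey.

```python
from fractions import Fraction


def thinned_nodes(width, height, a, b, dropped_panels):
    """Keep the nodes whose panel label (a*x + b*y) mod n is not dropped.

    Assumption: gcd(a, b) == 1, so all n = a*a + b*b panels are square
    subgrids of equal density.
    """
    n = a * a + b * b
    return [(x, y) for x in range(width) for y in range(height)
            if (a * x + b * y) % n not in dropped_panels]


full = 20 * 20  # number of nodes in the unthinned block
for a, b, dropped in [(3, 1, {0}), (2, 1, {0}), (2, 1, {0, 1})]:
    kept = thinned_nodes(20, 20, a, b, dropped)
    n = a * a + b * b
    print(f"factor {n}, {len(dropped)} panel(s) removed:",
          Fraction(len(kept), full))
```

In this sketch, subtracting one panel of a factor-10 level keeps 9/10 of the nodes, and subtracting one or two panels of a factor-5 level keeps 4/5 or 3/5, so the intensity can be lowered in smaller steps than by moving to the next level.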

A major challenge in designing a repeated survey is to cope with possible requirements for changing the sampling effort, whether for cost-driven or environmental reasons (Overton and Stehman 1996). Here, two alternatives should be examined: either maintaining a pluri-annual nested sampling frame such as the one defined by sampling grids, or opting for a full annual renewal of the base grid at a target sampling intensity defined at each sampling occasion. This echoes the recommendation of de Gruijter et al. (2006) to “avoid undue complexity” when the sampling context is uncertain or highly dynamic, and calls for careful scrutiny of the sampling context and issues. In practice, however, previously sampled units are sometimes used to measure and infer changes in some key parameters; keeping these units imposes a temporal structure that, again, favors the use of a grid.

4) On the role of grids in sampling

Large-scale surveys, particularly national forest inventories, rely exclusively on design-based estimation, and hence solely on the randomness generated when creating the samples (Mandallaz 2007). Sampling grids are strongly rooted in forest inventory methods, and support the typical need of these surveys to produce areal estimates. Observational grids can also be used to produce point-level (plot-level) observations with no intention of producing estimates for the territory that the grid covers. In this situation, according to de Gruijter et al. (2006), the universe is that of the observational plots, whereas in areal estimation the universe is that of the territory for which inference is desired. Notably, sampling grids address both purposes, and the generality of the principles ensures their suitability for a wide range of populations, i.e., not exclusively forest trees. The use of the grid further offers a straightforward means to define the statistical weight of sampling units as inversely proportional to the cell size. The regular shape and size of the cells result in fixed and constant weights, hence supporting sampling with equal probability.

Stratified sampling is another sampling design that has proved its efficiency in vegetation surveys. Here too, hierarchical grids are a natural support, since the different levels can be selected locally to meet the required spatial sampling intensity while remaining coherent with the rest of the territory (other strata). This is illustrated by the variable sampling intensities implemented for different vegetation types in the French NFI and in most such surveys (Tomppo et al. 2010). Decreasing the sampling intensity is immediate. In the inverse situation, if the initial grid was not dense enough, a new denser nested grid can be created locally thanks to the recursive properties of the square grid fractionation, based on the divider that defines the reduction factor from one level to another. Quantifying the impact of a local event that requires an increased sampling intensity, such as a fire or a windthrow disturbance, falls into this category.
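Local densification can be sketched in the same spirit. Assuming a divider n = a² + b², a denser square grid with basis vectors (a, b)/n and (−b, a)/n (in units of the original spacing) contains every node of the original grid and has n times as many nodes. The sketch below works in integer units of spacing/n to avoid rounding issues; the function name `refine` and the coordinate convention are assumptions made for this example.

```python
def refine(cells, a, b):
    """Nodes of the denser grid over a block of `cells` x `cells` coarse cells.

    Coordinates are integers in units of spacing / n, with n = a*a + b*b,
    so the original (coarse) nodes sit at multiples of n. A point (x, y)
    belongs to the denser grid when it is an integer combination of the
    basis vectors (a, b) and (-b, a).
    """
    n = a * a + b * b
    size = n * cells
    return [(x, y) for x in range(size) for y in range(size)
            if (a * x + b * y) % n == 0 and (a * y - b * x) % n == 0]


# Densify a block of 4 x 4 coarse cells by a factor 5 = 2^2 + 1^2.
fine = refine(4, 2, 1)
coarse = [(x, y) for (x, y) in fine if x % 5 == 0 and y % 5 == 0]
print(len(fine), len(coarse))  # 80 16
```

Every coarse node is also a fine node, so plots already measured on the original grid would remain part of the locally denser sample under these assumptions.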

5) On the limits of grids for variance estimation

A major drawback of spatially systematic sampling is the absence of an unbiased estimator of the variance, and hence of the sampling error (Gregoire and Valentine 2008, Strand 2017). Furthermore, spatial dependence can make variance estimation more difficult. Spatial autocorrelation in the surveyed parameters can, however, be estimated, and some corrections to variance estimators have been proposed (McGarvey et al. 2016, Magnussen and Fehrmann 2019). The French NFI uses an alternative approach whereby cells are first selected from the panels depending on the prescribed grid level, ensuring that the dispersion and periodicity criteria are met, and a plot is then drawn at random within each selected cell. This sampling scheme is referred to as unaligned systematic sampling (Berry and Baker 1968, Overton and Stehman 1993). It is meant to attenuate interactions with spatial patterns, which can coincide by chance with the grid’s periodicity (Cochran 1977), and has found frequent support (Stehman 1992). If multiple plots are drawn in each cell, cluster sampling or two-stage sampling can be implemented. Further, large-scale surveys tend to sample at spatial densities too low for this type of spatial autocorrelation to affect the estimates significantly in practice, though variance estimation will be biased anyhow (Fattorini et al. 2018).
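The two-step selection described above (cells first, then one random plot per cell) can be sketched as follows. The cell-selection rule, the 1 km cell size and the function name `unaligned_plots` are hypothetical and only serve the illustration; the actual French NFI procedure may differ.

```python
import random


def unaligned_plots(selected_cells, cell_size, seed=None):
    """Draw one plot uniformly at random inside each selected square cell.

    `selected_cells` holds integer (column, row) indices of cells; the
    returned coordinates are in the same unit as `cell_size`.
    """
    rng = random.Random(seed)
    plots = []
    for col, row in selected_cells:
        x = (col + rng.random()) * cell_size
        y = (row + rng.random()) * cell_size
        plots.append((x, y))
    return plots


# Cells of one annual panel out of five, picked here with the hypothetical
# rule (2*col + row) % 5 == 0 on a 10 x 10 block of 1 km cells.
cells = [(c, r) for c in range(10) for r in range(10) if (2 * c + r) % 5 == 0]
plots = unaligned_plots(cells, cell_size=1.0, seed=42)
print(len(plots))  # one plot per selected cell
```

The cells keep the systematic, periodic layout of the grid, while the random position within each cell breaks the strict alignment of the plots.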

The homothetic properties of square grids allow the creation of nested hierarchical sets of subgrids that support an adaptive sampling effort across space and time while preserving spatial balance. Their versatility is such that a great variety of sampling designs, such as stratified sampling and multiphase sampling, can be accommodated while preserving these properties.

The increasing need for regularly updated reporting calls for sampling designs able to cope with multiple periodicities (e.g., both annual and multi-annual). As shown, square grids can easily grant systematicity for periods of 2, 4, 5, 8, 9 or 10 years while also supporting annual estimations, depending on the reporting context.

Square grids, through the construction of hierarchically nested subgrids, offer practical solutions for ensuring the periodicity and spatial balance of complex large-scale surveys.

Acknowledgements

The geometrical object that is the square sampling grid and its homothetic properties for decomposition across a ten-year and five-year period were discovered and implemented by Jean Wolsack and by Jean-Christophe Hervé. They designed and implemented the sampling operationally within the national forest inventory (2004 to 2017).

Funding Information

LIF is supported by a grant overseen by the French National Research Agency (ANR) as part of the “Investissements d’Avenir” program (ANR-11-LABX-0002-01, Lab of Excellence ARBRE).

Competing interest

The authors declare no competing interests.

Statement and Declarations

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript. The authors have no financial or proprietary interests in any material discussed in this article.

The authors confirm that all references are cited and that all citations have a corresponding reference in the manuscript.

Data availability

This manuscript is based on simulation data and theoretical considerations. The data will be made available to any interested person upon reasonable request, without undue reservation.

Author Contributions Statement

O.B., F.M. and J.D.B. wrote the main manuscript text. O.B. and F.M. prepared all figures. All authors reviewed the manuscript.

Audinot, T., Wernsdörfer, H., & Bontemps, J. D. (2020). Ancient forest statistics provide centennial perspective over the status and dynamics of forest area in France. Annals of Forest Science, 77(3), 1-24.
Berry, B. J., & Baker, A. M. (1968). Geographic sampling. Spatial analysis, 91, 100.
Birch, C. P., Oom, S. P., & Beecham, J. A. (2007). Rectangular and hexagonal grids used for observation, experiment and simulation in ecology. Ecological modelling, 206(3-4), 347-359.
Bontemps, J. D., Denardou, A., Hervé, J. C., Bir, J., & Dupouey, J. L. (2020). Unprecedented pluri-decennial increase in the growing stock of French forests is persistent and dominated by private broadleaved forests. Annals of Forest Science, 77(4), 1-20.
Bontemps, J. D., Bouriaud, O., Vega, C., & Bouriaud, L. (2022). Offering the appetite for the monitoring of European forests a diversified diet. Annals of Forest Science, 79(1), 1-9.
Bouriaud, O., Marin, G., Hervé, J.-C., Riedel, T., Lanz, A. (2020) Estimation Methods in the Romanian National Forest Inventory. Nova Science Publishers.
Brown, J.A., Robertson, B.L., McDonald, T. (2015). Spatially balanced sampling: applications to environmental surveys. Procedia Environ. Sci. 27, 6–9.
Borkowski, J. J. (1999). Network inclusion probabilities and Horvitz-Thompson estimation for adaptive simple Latin square sampling. Environmental and Ecological Statistics, 6(3), 291-311.
Christianson, D. S., & Kaufman, C. G. (2016). Effects of sample design and landscape features on a measure of environmental heterogeneity. Methods in Ecology and Evolution, 7(7), 770-782.
Cochran, W.G. (1977). Sampling Techniques, 3rd ed. New York: Wiley.
de Gruijter, J., Brus, D., Bierkens, M., Knotters, M. (2006). Sampling for Natural Resource Monitoring. Springer, 326 p.
Deville J-C and Tillé Y. (1998). Unequal probability sampling without replacement through a splitting method. Biometrika 85, 89–101.
Dunn, R., & Harrison, A. R. (1993). Two‐dimensional systematic sampling of land use. Journal of the Royal Statistical Society: Series C (Applied Statistics), 42(4), 585-601.
Fattorini, L., Franceschi, S., & Pisani, C. (2009). A two-phase sampling strategy for large-scale forest carbon budgets. Journal of statistical planning and inference, 139(3), 1045-1055.
Fattorini, L., Gregoire, T. G., & Trentini, S. (2018). The use of calibration weighting for variance estimation under systematic sampling: Applications to forest cover assessment. Journal of Agricultural, Biological and Environmental Statistics, 23(3), 358-373.
Ferretti, M. (2021). New appetite for the monitoring of European forests. Annals of Forest Science, 78(4), 1-4.
Fischer, C., & Traub, B. (Eds.). (2019). Swiss National Forest Inventory-methods and models of the fourth assessment. Berlin/Heidelberg, Germany: Springer.
Grafström, A., Lundström, N. L., & Schelin, L. (2012). Spatially balanced sampling through the pivotal method. Biometrics, 68(2), 514-520.
Greenwood M. (1946). Recent experiments in statistical sampling in the Indian Statistical Institute, Journal of the Royal Statistical Society 109, 325–378.
Gregoire TG, Valentine HT. (2008). Sampling strategies for Natural Resources and the environment. Chapman & Hall/CRC.
Hervé J-C. (2016). France. In: Vidal C, Alberdi I, Hernández L, Redmond J, editors. National Forest Inventories - Assessment of wood availability and use. Springer, p. 385–404.
IFN (2009) L’IF n°21, 1er trimestre 2009. IFN. Tempête Klaus du 24 janvier 2009 : 234 000 hectares de forêt affectés à plus de 40 %, 42,5 millions de mètres cubes de dégâts. 12 pages (In French) https://inventaire-forestier.ign.fr/IMG/pdf/IF21_internet.pdf
Madow, W.G. and Madow, L. H. (1944). On the theory of systematic sampling I. Ann.Math.Statist., 15, 1-24.
Magnussen, S., & Fehrmann, L. (2019). In search of a variance estimator for systematic sampling. Scandinavian Journal of Forest Research, 34(4), 300-312.
Mahalanobis, P. C. (1944). On large-scale sample surveys. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 329-451.
Mandallaz, D. (2007). Sampling techniques for forest inventories. Chapman and Hall/CRC.
Mandallaz, D. (2013). Design-based properties of some small-area estimators in forest inventory with two-phase sampling. Canadian Journal of Forest Research, 43(5), 441-449.
Martino, L., Fritz, M., 2008. New Insight into Land Cover and Land Use in Europe. Land Use/cover Area Frame Statistical Survey: Methodology and Tools. Statistics in focus 33. Eurostat, Luxemburg.
McGarvey, R., Burch, P., & Matthews, J. M. (2016). Precision of systematic and random sampling in clustered populations: habitat patches and aggregating organisms. Ecological applications, 26(1), 233-248.
Michal R and Zajac S (2010). Country report of Poland, National Forest Inventories—Assessment of Wood Availability and Use, 1st ed.; Vidal, C., Alberdi, I., Hernandez, L., Redmond, J., Eds, pp 425-436.
Overton, W. S., & Stehman, S. V. (1993). Properties of designs for sampling continuous spatial resources from a triangular grid. Communications in Statistics--Theory and Methods, 22(9), 251-264.
Overton, W. S., & Stehman, S. V. (1996). Desirable design characteristics for long-term monitoring of ecological variables. Environmental and Ecological Statistics, 3(4), 349-361.
Quenouille, M. H. (1949). Problems in plane sampling. Annals of Mathematical Statistics, 20(3), 355-375.
Reams GA, Smith WD, Hansen MH, Bechtold WA, Roesch FA, Moisen GG. 2. In: Bechtold WA, Patterson PL, editors. The Forest Inventory and Analysis Sampling Frame. vol. 80 of General Technical Report SRS. Southern Research Station; 2005. p. 11–26.
Ripley, B. D. (1991). Statistical inference for spatial processes. Cambridge university press.
Roesch, F.A. (2007). Compatible estimator of the components of change for a rotating panel forest inventory design. Forest Science 53(1): 50-61.
Roy, A.S., & Singh, M.P. (1973) Interpenetrating sub-samples – with or without replacement. Metrika 20, 230-239.
Stehman, S. (1992). Comparison of systematic and random sampling for estimating the accuracy of maps generated from remotely sensed data. PE & RS-Photogrammetric Engineering and Remote Sensing, 58(9), 1343-1350.
Stehman, S. V. (2009). Sampling designs for accuracy assessment of land cover. International Journal of Remote Sensing, 30(20), 5243-5272.
Stevens, D. L. (1997). Variable density grid‐based sampling designs for continuous spatial populations. Environmetrics: The official journal of the International Environmetrics Society, 8(3), 167-195.
Stevens, D. L., & Olsen, A. R. (2004). Spatially balanced sampling of natural resources. Journal of the American statistical Association, 99(465), 262-278.
Strand, G. H. (2017). A study of variance estimation methods for systematic spatial sampling. Spatial Statistics, 21, 226-240.
Tabacchi, G., De Natale, F., Floris, A., Gagliano, C., Gasparini, P., Scrinzi, G., & Tosi, V. (2007). Italian national forest inventory: methods, state of the project, and future developments. In: McRoberts, Ronald E.; Reams, Gregory A.; Van Deusen, Paul C.; McWilliams, William H., eds. Proceedings of the seventh annual forest inventory and analysis symposium; October 3-6, 2005; Portland, ME. Gen. Tech. Rep. WO-77. Washington, DC: US Department of Agriculture, Forest Service. 55-66. (Vol. 77).
Tomppo E, Gschwantner T, Lawrence M, McRoberts RE, editors. 2010. National Forest Inventories - Pathways for common reporting. Springer.
UN ECE (2021) Air. https://unece.org/environment-policy/air
Van Deusen, P. C. (2000). Pros and cons of the interpenetrating panel design. P. 14–19 in Proc. of the First annual forest inventory and analysis symposium, McRoberts, R.E., G.A. Reams, and P.C. Van Deusen (eds.). Gen. Tech. Rep. NC-213. US Forest Service, North Central Research Station, St. Paul, MN. Available online at nrs.fs.fed.us/pubs/4369.
Van Deusen, P. C. (2001). Issues related to panel creep. In Third Annual Forest Inventory and Analysis Symposium (p. 31).
Van Deusen, P. C. (2002). Comparison of some annual forest inventory estimators. Canadian Journal of Forest Research, 32(11), 1992-1995.
Vidal, C., Bélouard, T., Hervé, J. C., Robert, N., & Wolsack, J. (2007). A new flexible forest inventory in France. In: McRoberts, Ronald E.; Reams, Gregory A.; Van Deusen, Paul C.; McWilliams, William H., eds. Proceedings of the seventh annual forest inventory and analysis symposium; October 3-6, 2005; Portland, ME. Gen. Tech. Rep. WO-77. Washington, DC: US Department of Agriculture, Forest Service: 67-73. (Vol. 77).
White, D., Kimerling, J. A., & Overton, S. W. (1992). Cartographic and geometric components of a global sampling design for environmental monitoring. Cartography and geographic information systems, 19(1), 5-22.
