Timber Identification in the Taxonomically Challenging Sapotaceae Family

The illegal timber trade is still rampant, so robust identification and tracking techniques are necessary to combat this wildlife crime. To follow and enforce timber import laws and adjoining timber species identification, the identity of the botanical species must be well defined. Since the Sapotaceae family is known as a taxonomically challenging family, we focus in this study on the four most valuable Sapotaceae timber species from tropical Africa: Autranella congolensis (De Wild.) A.Chev., Baillonella toxisperma Pierre, Tieghemella africana Pierre and Tieghemella heckelii (A.Chev.) Pierre ex Dubard. The wood anatomical characteristic fiber lumen fraction and Direct Analysis in Real Time – Time of Flight Mass Spectrometry (DART-TOFMS) are used to differentiate between the four species and to make inferences on the species delineation and taxonomic identity. Based on visual assessment of the boxplots for the fiber lumen fraction measurements, two groups can be discerned: (1) A. congolensis and B. toxisperma, and (2) T. africana and T. heckelii. In addition, all Mann-Whitney U comparisons and the differences in underlying distributions (Kolmogorov-Smirnov) for the fiber lumen fraction measurements were significant between all species. However, when permuting the data within those two groups, significant differences were still found. This could indicate that the differences based on the fiber lumen fraction are more nuanced. The DART-TOFMS analysis shows that A. congolensis and B. toxisperma have distinct chemotypes, while T. heckelii and T. africana have remarkably similar chemotypes. Our results provide support for the possibility that T. africana and T. heckelii are more closely related than previously considered. A taxonomic study would be beneficial to assess the species limits of T. heckelii and T. africana, as our results suggest they could be conspecific. This would have important implications for the timber trade and adjoining timber species identification, for the Tieghemella species, and for their conservation.


Introduction
Illegal logging and timber forensics
It is estimated that 30 to 90% of timber from the tropics is illegally sourced [1][2][3][4][5][6]. Next to the straightforward ecological damage, there are also substantial economic and social problems associated with timber poaching [4]. These issues have sparked an increased demand for timber identification and timber traceability techniques, with current frontrunners being wood anatomy, both traditional and with machine vision [7][8][9], Direct Analysis in Real Time Time-of-Flight Mass Spectrometry (DART-TOFMS) [10][11][12], genetic analysis [13] and stable isotope ratio analysis [6,14].
Wood anatomy, DART-TOFMS and genetic analysis are currently the most widely used methods to determine the species identity of timber. However, timber import laws and adjoining timber species identification can only be followed if the identity of the botanical species is well defined. Until three decades ago, taxonomists mainly used morphological traits to describe and delineate species. However, species can show high levels of intraspecific morphological variation, which complicates accurate species delineation and occasionally results in the erroneous splitting of species. Conversely, differentiation and speciation are not always accompanied by morphological change, as demonstrated by the abundance of cryptic species [15][16][17], where two or more distinct species are classified under the same taxonomic unit because they are seemingly indistinguishable from a morphological point of view [15]. For this reason, it is important to include molecular data when new species are described and named. Yet even when DNA-based methods are incorporated, genetic divergence can remain undetected because of homoplasy (a shared character that did not arise from a common ancestor) and evolutionary processes such as hybridization (production of offspring by parents from different varieties or species), chloroplast capture (introgression of a chloroplast genome from one species into another), reticulate evolution (or network evolution, where a group of organisms originates through the partial merging of ancestor lineages) or incomplete lineage sorting (common ancestry of gene copies at a single locus extends deeper than previous speciation events) [16][18][19][20], resulting in wrongly delineated species.

Sapotaceae
The Sapotaceae family is known for its highly homoplasious morphological characters and the lack of unambiguous synapomorphies for subfamilies and tribes [21], which are the reasons for the high dynamics of the Sapotaceae taxonomy and the many taxon synonyms. Here, we will focus on the four most important Sapotaceae timber species from tropical Africa: Autranella congolensis (De Wild.) A.Chev., Baillonella toxisperma Pierre, Tieghemella africana Pierre and Tieghemella heckelii (A.Chev.) Pierre ex Dubard. All four species represent the largest trees in their respective forested regions, reaching heights of 50 m or more and diameters of sometimes more than 2 m.
Tieghemella africana is well known to the international timber trade as Douka [22]. This trade name can occasionally cover timber from B. toxisperma and is often considered the same trade category as wood from T. heckelii (Makoré). However, T. heckelii is traded under the generic trade name (or pilot name) Makoré, which can include timber from T. africana and B. toxisperma. Tieghemella africana is typically found in the evergreen rainforests from Cameroon to Cabinda (Angola) in the west, and eastward to the Republic of the Congo and Democratic Republic of the Congo (DRC) [23]. The highest species densities are reported in Equatorial Guinea, western Gabon and in the Republic of the Congo, north of Kouilou. In other regions it can be mixed with T. heckelii and B. toxisperma. Heartwood of T. africana is very similar to T. heckelii, but tends to be more intensely stained with a more distinct vein pattern. In provenances from the Republic of the Congo, the wood has been noted to darken to a red violet stain. In addition, T. africana tends to be slightly harder and heavier than T. heckelii. The main distribution area for T. heckelii covers eastern Liberia, Côte d'Ivoire and Ghana, but the species also occurs in lower densities in Nigeria [24]. As such, the range of T. heckelii overlaps with other morphologically similar Sapotaceae species, creating a challenge for field identification. Baillonella toxisperma and A. congolensis occur in low densities in the rainforest of southern Nigeria, Cameroon, Equatorial Guinea, Gabon, Cabinda (Angola), Republic of the Congo and the Democratic Republic of the Congo (DRC) [25,26]. Baillonella toxisperma (Moabi) can look very similar to T. heckelii but the distinction is clearer than for T. africana from Ghana or the Ivory Coast. Baillonella toxisperma is also found mixed with shipments of T. africana and T. heckelii. Autranella congolensis is reported to be falsely sold as B. toxisperma, but A. congolensis wood is harder and darker with a violet stain. Standing trees of B. toxisperma and A. congolensis are quite similar to each other, with one primary difference being that B. toxisperma exhibits a distinctively flatter crown [22].

Taxonomic history
Although the four Sapotaceae species in this study are currently assigned to three distinct genera, all four species were previously included in the genus Mimusops. Autranella congolensis seems to be related to the latter, but it differs in having stipules, a longer corolla tube and larger fruits [25], and was thus reinstated as a distinct (monotypic) genus. The genus Baillonella was first described by Pierre based on a seed collected in Gabon [27]. Engler then included the genus as a section in Mimusops, but the group was later reinstated as a separate genus because of the thin seed coat and particular nerves that distinguish it from Mimusops [28]. While multiple Baillonella species were described in the 1900s, B. toxisperma is currently the only recognised species. The genus Tieghemella was first described by Pierre [27], but was subsequently added to the genera Dumoria, Mimusops and Baillonella, after which Tieghemella was reinstated as a distinct genus. Currently, T. africana and T. heckelii are the only two species recognised in the genus. However, a taxonomic study is needed to assess the status of the genus and the species limits, since they may be conspecific [23][24].

Study objectives
As indicated, these Sapotaceae species all have similar heavy and reddish-brown wood. Because of this similarity, the wood is used for similar purposes and often traded together under the same commercial name. As such, it is important (1) to be able to identify these species within the timber trade and (2) to be certain that these are four different species. In this study, we will assess:
1. The robustness of the one wood anatomical characteristic that is claimed to allow for the differentiation of these four Sapotaceae species: the fiber lumen fraction (referred to as the coefficient de souplesse by [29]).
2. The possibility to differentiate these four Sapotaceae species using chemical fingerprints via DART-TOFMS.
3. The effect of (1) and (2) on the species delineation and the taxonomic identity of these four Sapotaceae species.

Sampling
A total of 62 wood specimens were collected from the Tervuren Wood Collection (Royal Museum for Central Africa, Tervuren, Belgium) and three from the World Forest ID project [30] (see Table S1 in Supplementary Materials). Some of these wood specimens were used for wood anatomical analysis and all except one were used to obtain chemical fingerprints via DART-TOFMS. Two samples from the Tervuren Wood Collection also have a corresponding herbarium voucher at Meise Botanic Garden (BR) in Belgium (see Table S1).

Wood anatomical analysis
The anatomical differences between species determined via the IAWA list of microscopic features [31] on InsideWood [32] were compared to the anatomical slices obtained in this study. Anatomical cross-sections (transversal) of 16 wood specimens (Table 1 and Table S1 in Supplementary Materials) were digitized at 20x magnification using Stream Image Analysis Software (StreamMotion, Olympus, Tokyo, Japan) with a scanning stage (Märzhäuser Wetzlar, Wetzlar, Germany) and a UC30 camera (Olympus, Tokyo, Japan) mounted on a light microscope (BX60, Olympus, Tokyo, Japan). For each image, fibers were used to determine the fiber lumen fraction:
Fiber lumen fraction (%) = (lumen diameter / fiber diameter) × 100
Images were aligned in transversal direction and the fiber lumen fraction was determined in two perpendicular directions on the fiber (4 measurements per fiber = 2 fiber lumen fractions, Fig. 1). The average of those two fractions was taken as the fiber lumen fraction of that fiber.
Notched boxplots were created using the ggplot2 package [33] in RStudio (RStudio Team, 2016). Notched boxplots offer a quick visual check of whether a statistical difference in median can be expected. Normality of the data was checked using the Shapiro-Wilk test [34] and significant differences between species were determined using the non-parametric independent 2-group Mann-Whitney U test [35]. To determine whether the underlying distributions of the data were different, the Kolmogorov-Smirnov test was used.
Finally, 50 permutations were run in combination with the non-parametric independent 2-group Mann-Whitney U test to determine whether the comparisons indicate real differences in fiber lumen fraction within the two species groups (A. congolensis/B. toxisperma and T. africana/T. heckelii) (see Results section). For the group A. congolensis/B. toxisperma, four Tw samples belonging to those two species were randomly picked and placed under A. congolensis, and the same was done for B. toxisperma. This was repeated for each of the 50 permutations. Per permutation run, one sample was not used, as there are nine samples between those species (see Table 1). This was to keep the dataset balanced. The same was done for the species group T. africana/T. heckelii, with four samples randomly picked per species in each permutation run.
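The measurement and permutation steps described above can be sketched in code. The following is a minimal Python illustration, not the authors' R workflow: the function names, the dictionary layout of the data, the normal approximation used for the Mann-Whitney U p-value, and the 0.05 significance threshold are all assumptions made for the example.

```python
import math
import random


def fiber_lumen_fraction(lumen_a, fiber_a, lumen_b, fiber_b):
    """Average of the two perpendicular lumen/fiber diameter ratios, in percent."""
    return 0.5 * (lumen_a / fiber_a + lumen_b / fiber_b) * 100.0


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value.

    Uses the tie-corrected normal approximation without continuity
    correction (an assumption; statistical packages may differ slightly).
    """
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j for tied values
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var <= 0:  # all values identical: no evidence of a difference
        return 1.0
    z = (u - mu) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2.0))


def count_significant(samples, per_group=4, runs=50, alpha=0.05, seed=0):
    """Count relabelled runs in which the two pseudo-species still differ.

    samples: dict mapping a sample ID to its list of fiber lumen fractions (%).
    alpha: hypothetical significance threshold, not taken from the source.
    """
    rng = random.Random(seed)
    ids = sorted(samples)
    hits = 0
    for _ in range(runs):
        # Draw 2 * per_group samples; any leftover sample sits this run out.
        picked = rng.sample(ids, 2 * per_group)
        group_a = [v for s in picked[:per_group] for v in samples[s]]
        group_b = [v for s in picked[per_group:] for v in samples[s]]
        if mann_whitney_p(group_a, group_b) < alpha:
            hits += 1
    return hits
```

Called with the nine samples of one species group, `count_significant` would return how many of the 50 randomly relabelled runs still test as significant, which is the quantity reported for each group in the Results.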

DART-TOFMS
The heatmap of the mass spectra shows that the ion pattern from 90-215 m/z was present in all four species (Fig. 3). Higher relative abundance at 409.163 m/z was noted for A. congolensis. Higher relative abundance in ions at 434.316, 440.326 and 452.310 m/z appeared to be indicative of the Tieghemella species. Baillonella toxisperma showed an ion at approximately 84.081 m/z, which was not found or significantly reduced in the other species, and also showed a higher relative abundance of the ion at 130.087 m/z.
The PCA scatterplot (Fig. 4) showed distinctive grouping for A. congolensis and B. toxisperma, while T. africana and T. heckelii grouped together. There appeared to be three outlier spectra, one from A. congolensis and two from B. toxisperma. The outlier of A. congolensis (Tw4300) and one from B. toxisperma (Tw2101) did not group with any other species class, while the other outlier from the B. toxisperma class (Tw1675) grouped with A. congolensis. These outliers may have been due to misidentifications at the field collection stage or human error. Regardless, they were removed from the PCA model and from subsequent analysis, bringing the total number of ions to n = 792.
The DAPC model without the outliers and with T. africana and T. heckelii spectra in a single class, Tieghemella spp., showed distinctive grouping between the three classes (Fig. 5). The calculated LOOCV value for the DAPC model was 96.61%, indicating that two spectra (B. toxisperma Tw1666 and T. heckelii Tw22612) were misclassified. All test samples (n = 8) were correctly assigned (Table 2).
Analysis of the Tieghemella species indicated that the species' chemotypes are remarkably similar (Fig. 6). Some variation in ion intensity can be seen between the species. However, this variation also changes from sample to sample (Fig. 3) while the overall ion pattern (Fig. 6) remains constant.
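The reported leave-one-out cross-validation accuracy can be checked against the stated number of misclassified spectra. The short Python sketch below is only an illustrative consistency check: the function name and the search range are assumptions, and the spectrum count it prints is inferred from the two reported figures rather than stated in the text.

```python
def spectra_counts_matching(accuracy_pct, n_misclassified, max_n=500):
    """Return every model size n whose LOOCV accuracy rounds to accuracy_pct.

    Accuracy is taken as (n - n_misclassified) / n, compared at two decimals.
    max_n is an arbitrary, hypothetical upper bound for the search.
    """
    matches = []
    for n in range(n_misclassified + 1, max_n + 1):
        pct = 100.0 * (n - n_misclassified) / n
        if abs(pct - accuracy_pct) < 0.005:
            matches.append(n)
    return matches


# Two misclassified spectra and a LOOCV value of 96.61%, both as reported.
print(spectra_counts_matching(96.61, 2))  # [59]
```

Under these assumptions the two reported numbers agree with each other only for a model of 59 spectra, since 57/59 is about 96.61%.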

Wood anatomical analysis
Autranella congolensis had the lowest fiber lumen fraction compared to the other species (Table 1), but the values showed some overlap with B. toxisperma due to the high standard deviation. The two Tieghemella species had a noticeably higher fiber lumen fraction, with T. heckelii having the highest value. The notched boxplots of the fiber lumen fraction measurements show that there appeared to be two groups (A. congolensis/B. toxisperma and T. africana/T. heckelii) and there was no overlap in notches for all four species (Fig. 2). Furthermore, all Mann-Whitney U (MW) comparisons, as well as the Kolmogorov-Smirnov comparisons of the underlying distributions, were significant between all species.
Autranella congolensis also has prismatic crystals present, which can be in the axial parenchyma cells. Finally, A. congolensis and B. toxisperma have a higher wood density compared to the Tieghemella spp. When comparing this description with the anatomical slices used in this study, we noticed some important differences. The specimens of A. congolensis and B. toxisperma also have vessel-ray pits with distinct borders. Moreover, it is not clear whether these species have vessel-ray pits of two distinct sizes. As such, this characteristic could easily be misinterpreted. All four species appear to have deposits of different proportions in their heartwood cells (mainly in ray and parenchyma cells). Our samples confirm the thick-walled fibers for A. congolensis; however, this also appears to be the case for B. toxisperma. In our samples, the Tieghemella species have thin-to-thick walled fibers. Only Tw633 (Autranella congolensis) appeared to have prismatic crystals clearly present.
For the fiber lumen fraction measurements, all Mann-Whitney U comparisons and the differences in underlying distributions (Kolmogorov-Smirnov) were highly significant (p < 0.001) between all species. Based on visual assessment of the boxplots for the fiber lumen fraction measurements, two groups can be discerned: (1) A. congolensis and B. toxisperma and (2) T. africana and T. heckelii. This implies that misidentification between the two groups should not be possible if we use the fiber lumen fraction measurements as the diagnostic characteristic. However, when permuting the data per sample for A. congolensis and B. toxisperma, 37 out of 50 permutations were significant based on the independent 2-group Mann-Whitney U test. For T. africana and T. heckelii this was 34 out of 50 permutations. This implies that even when the samples are randomly distributed across species (within one of the two groups), significant differences in fiber lumen fraction are still possible. As such, this characteristic is not consistent enough for unambiguous timber identification of the discussed species, especially when insufficient material is present to study a representative fragment of the specimen.

DART-TOFMS analysis
For the DART-TOFMS analysis, the PCA plot containing all ions from the four species (Fig. 4)

Conclusion
In this study we assessed the wood anatomical characteristic fiber lumen fraction and DART-TOFMS analysis for species differentiation of Autranella congolensis, Baillonella toxisperma, Tieghemella africana and Tieghemella heckelii. Based on visual assessment of the boxplots for the fiber lumen fraction measurements, two groups could be discerned: (1) A. congolensis and B. toxisperma and (2) T. africana and T. heckelii. In addition, all Mann-Whitney U comparisons and the differences in underlying distributions (Kolmogorov-Smirnov) for the fiber lumen fraction measurements were significant. However, when permuting our data within those two groups, significant differences based on the Mann-Whitney U test were still possible. This indicates that the differences based on the fiber lumen fraction are more nuanced, implying that fiber lumen fraction is not a consistent diagnostic characteristic for the identification of these four species. The chemotypes detected via DART-TOFMS of A. congolensis and B. toxisperma were distinct from each other and from those of Tieghemella spp., demonstrating that they can be identified by their chemotypes. Conversely, Tieghemella heckelii and T. africana have remarkably similar chemotypes that hinder species identification, though further taxonomic research is needed to assess whether they could be conspecific. Our study shows that chemical profiling can be used to reliably distinguish A. congolensis, B. toxisperma and Tieghemella spp. This has important implications as the ability to separate the members of taxonomically challenging groups, such as the Sapotaceae family, is of utmost importance to decrease the presence of illegally sourced wood within the timber trade.

Figure 3
Heatmap showing the chemical fingerprint of the samples; each row indicates a single spectrum. The x-axis shows the m/z-value while the y-axis shows sample number; relative abundance of the ion is portrayed through intensity of color, where darker shades indicate a higher relative abundance within the sample. Vouchered specimens are indicated by arrows.

Figure 4
The scatterplot visualizing the Principal Component Analysis of mass spectra from the four Sapotaceae species. All species exhibited separate clustering trends with the exception of the Tieghemella spp.

Figure 6
Comparison spectrum of T. heckelii and T. africana shows the similarities between the two species' spectra.

Supplementary Files
This is a list of supplementary files associated with this preprint. S1SuppMat.docx