DNA Metabarcoding Reveals Cryptic Geographically-influenced Microbial Diversity After Anthropogenic Impact in Original Forest Soils on the Isolated Trindade Island, South Atlantic

Located 1,140 km from the South American coastline in the South Atlantic Ocean, and with an age of 4 million years, Trindade Island is the most recent volcanic component of Brazilian territory. Its vegetation was severely damaged by human influence, in particular through the introduction of exotic grazing animals such as goats. However, since the complete eradication of goats and other feral animals in the late 1990s, the island's vegetation has been recovering, and some endemic species that had been considered extinct have even been rediscovered. In this study we set out to characterize the contemporary microbial diversity of Trindade Island forest soils using metabarcoding by High Throughput Sequencing (HTS). Sequences representative of two domains (Bacteria and Archaea) and five kingdoms (Fungi, Metazoa, Protozoa, Chromista and Viridiplantae) were identified. Bacteria were represented by 20 phyla and 116 taxa, while Archaea were represented by only one taxon. Fungi were represented by seven phyla and 250 taxa, Viridiplantae by five phyla and six taxa, Protozoa by five phyla and six taxa, Metazoa by three phyla and four taxa and Chromista by two phyla and two taxa. Even after the considerable anthropogenic impacts and devastation of the island's natural forest, our sequence data revealed a rich, diverse and complex assemblage of microorganisms, invertebrates and plants.


Introduction
Located 1,140 km from the South American coastline in the South Atlantic Ocean, Trindade Island is the most recent volcanic component of Brazilian territory, with an age of approximately 4 million years [1,2,3]. Discovered in 1501 by the navigator João da Nova [4], Trindade has been sporadically settled over the last five hundred years. The first official record of a landing on the island dates to the year 1700, with the arrival of the English astronomer Edmund Halley. Disputed by the United Kingdom and Portugal throughout the 18th century, Trindade was occupied in 1783 by Portuguese military and civilian personnel (including six Azorean couples), but was later abandoned [4]. Since then, the island has only been occupied for short periods, mostly by slave traders and pirates, and during the two world wars [5]. In 1957, the Brazilian Navy constructed a garrison on the island, which has been continuously occupied since then. Currently, the island hosts a military meteorological station, the Trindade Island Oceanographic Post (POIT), which also includes a science research station, and it has been officially declared a federally protected area.
Little is known of the original native vegetation of Trindade Island, as it was severely damaged by the introduction of exotic animals during the earlier phases of human occupation. The expedition led by Edmund Halley, for example, introduced goats to the island, which were abandoned, became feral and subsisted on native vegetation [6]. Today, most land below 400 m a.s.l. is characterized by open grasslands (about 60% of the area is dominated by Poaceae and Cyperaceae), with higher altitudes hosting the native "giant fern forest" dominated by the single species Cyathea delgadii Sternb. [7], analogous to the high-altitude Cycad forests found in Tristan da Cunha [8]. Alves (2006) [9] reported 130 species of vascular plant, including five endemics, while Dantas et al. (2017) [10] described one additional endemic species of Peperomia from the island. Faria et al. (2012) [11] investigated the bryophytes, listing 32 species of liverwort, 11 species of moss and one hornwort. Twelve species of freshwater algae have also been reported [6]. The island is surrounded by a rich marine environment with many endemic gastropods and fish, and it is an important breeding area for the marine turtle Chelonia mydas [6,12,13,14,15,16]. Virtually no data are available on most groups of soil microorganisms (Bacteria, Fungi, Protista) from Trindade Island.
Since the complete eradication of goats and other feral animals in the late 1990s, the island's vegetation has been recovering [17,18,19]. This recovery has likely relied either on seeds, spores and propagules already present in the soil which, if viable, could develop under favorable conditions, or on new long-distance colonization events, with propagules carried by agents such as wind, humans, and other organisms. Biological connections between Trindade and mainland South America have been demonstrated [20,21,7].
Most studies of Trindade Island biodiversity have relied on traditional morphological identification, with the exception of a small number of phylogenetic studies [10,7]. Morphological studies have important limitations due to the inherent difficulties of identifying resting stages, spores, and small propagules. Recent developments in molecular biology, such as DNA metabarcoding by High Throughput Sequencing (HTS), have provided a promising and powerful tool to access previously unrecognized biodiversity (e.g., [22,23]), in some cases with application to very old samples [24]. The recent use of such approaches in studies of Antarctic soils has yielded evidence of much greater diversity of algae and fungi than studies based only on morphological approaches (e.g., [25,26,27]). DNA metabarcoding has not previously been applied to the investigation of soil biological diversity on Atlantic islands such as Trindade Island. In this study we set out to describe the contemporary microbial diversity present in the Giant Fern Forest soils on Trindade Island using metabarcoding.

Study area
Trindade Island is located at 20°29'-20°32'S and 29°17'-29°21'W (Figure 1) in the South Atlantic Ocean. Rising 5,500 m from the surrounding ocean floor, Trindade has an area of 13.5 km² and is predominantly rugged, being considered the most heterogeneous and topographically varied Brazilian volcanic island [1,28]. It includes the only volcanic crater (now extinct) that currently exists in Brazil.
Trindade Island soils are shallow, including Leptosols, Regosols, Cambisols, Andosols and endemic Histosols [35,36,37]. The latter are organic soils, formed by association between volcanic (phonolite) and organic (primarily vegetation) parent materials. Their main characteristic is the presence of a superficial organic horizon, formed by vegetable fibers or decomposed organic materials, called the H horizon (predominance of fibers) or O horizon (predominance of humified organic matter) [29]. On Trindade Island, these have been described as acidic, nutrient-poor soils exclusively associated with giant fern forests [35,38,36,39,40].
Influenced by the South Atlantic Subtropical Anticyclone (SASA), which contributes to intense oceanic evaporation in this area [1,41], the climate on Trindade Island varies with altitude, with some areas showing semiarid characteristics and other areas characterized by higher humidity, such as Desejado Hill (620 m altitude), where the Histosols are located [25].

Soil sampling
Based on previous studies [35,36,39], three soil samples were collected from the organic horizons of Trindade Histosols in October 2018 using spatulas previously disinfected with 70% alcohol, and were stored in sterilized WhirlPak bags (Sigma-Aldrich, USA). Two samples were obtained from Desejado Hill (PD5 and PD6; Figure 1E, F), and one from Fazendinha (PF7; Figure 1G) (Table 1). The organic horizons in PD5, PD6, and PF7 were sampled to depths of 10, 5 and 25 cm, respectively. Soil physical and chemical analyses were performed following Teixeira et al. [42]. Granulometry was performed using the pipette method (50 rpm, 16 h).
pH was determined using a 1:5 soil:deionized water ratio. Potential acidity (H+Al) was extracted with 0.5 mol·L⁻¹ Ca(OAc)₂ buffered to pH 7.0 and quantified by titration with 0.0606 mol·L⁻¹ NaOH. Exchangeable Ca²⁺, Mg²⁺, and Al³⁺ were extracted with 1 mol·L⁻¹ KCl, and Na⁺, K⁺ and P were extracted with Mehlich-1 [42]. The element levels in the extracts were determined by ICP (Al³⁺), flame emission (Na⁺ and K⁺) and photocolorimetry (P) by the ascorbic acid method. Organic matter content (OM) was quantified by wet oxidation with 0.167 mol·L⁻¹ K₂Cr₂O₇ and sulfuric acid with external heating [43]. All analyses were performed in triplicate. Total cation exchange capacity (CEC) was calculated as the sum of the bases (Ca²⁺, Mg²⁺, Na⁺, K⁺) and potential acidity (H⁺+Al³⁺).
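The CEC calculation described above is a simple sum, which can be sketched as follows. This is a minimal illustration only: the function name, the assumption that all inputs share one unit (for example cmolc·kg⁻¹), and the example values are hypothetical and are not measurements from this study.

```python
def total_cec(ca, mg, na, k, h_al):
    """Total CEC as the sum of exchangeable bases plus potential acidity (H+Al).

    All inputs are assumed to be expressed in the same unit (e.g. cmolc/kg).
    """
    sum_of_bases = ca + mg + na + k
    return sum_of_bases + h_al


# Hypothetical values for illustration only (not data from this study).
example_cec = total_cec(ca=3.0, mg=1.5, na=0.4, k=0.3, h_al=20.0)
print(f"Total CEC: {example_cec:.1f}")
```

In a sample like this hypothetical one, potential acidity contributes most of the total, which is the pattern expected for strongly acidic organic horizons.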

DNA extraction, Illumina library construction and sequencing
Total DNA was extracted using the QIAGEN Power Soil Kit (QIAGEN, Carlsbad, USA), following the manufacturer's instructions. DNA quality was analyzed by agarose gel electrophoresis (1% agarose in 1× Tris-borate-EDTA) and the DNA was then quantified using the Quant-iT™ PicoGreen dsDNA Assay (Invitrogen). The internal transcribed spacer 2 (ITS2) of the nuclear ribosomal DNA was used as a DNA barcode for molecular species identification of Chromista, Protozoa, Viridiplantae and Fungi [44,45].

Forest soil properties
The analyzed soils were acidic, with a pH in H₂O < 4 (Table 2). Despite the high values of CEC, pH was controlled by the potential acidity (H+Al), which indicates a large reserve of H⁺. This is confirmed by the low base saturation values (PBS < 50%), which characterize all organic horizons as dystrophic. The most abundant exchangeable cation was Ca, which may originate from the geological substrate, but also from the decomposition of organic matter, mainly fern stalks. The organic matter content in the fine earth was high, although mainly in the form of vegetable fibers. Among micronutrients, Fe stood out; its main source is the weathering of ferromagnesian minerals in the geological substrate. Physically, clay and silt fractions predominated in the fine earth (< 2 mm), characterizing the texture of all horizons as clay-loam.
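The dystrophic classification rests on base saturation, that is, the share of total CEC occupied by exchangeable bases. A minimal sketch of that criterion follows, using the 50% threshold stated in the text; the function names and the example values are hypothetical and are not taken from Table 2.

```python
def base_saturation_percent(sum_of_bases, h_al):
    """Percentage of total CEC (bases + potential acidity) occupied by bases."""
    total = sum_of_bases + h_al
    if total <= 0:
        raise ValueError("total CEC must be positive")
    return 100.0 * sum_of_bases / total


def is_dystrophic(sum_of_bases, h_al, threshold=50.0):
    """True when base saturation falls below the threshold (50% in the text)."""
    return base_saturation_percent(sum_of_bases, h_al) < threshold


# Hypothetical horizon: 5 units of bases against 20 units of H+Al.
print(base_saturation_percent(5.0, 20.0), is_dystrophic(5.0, 20.0))
```

A horizon can therefore have a high CEC and still be dystrophic, provided that most exchange sites are occupied by H+Al rather than by bases.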

Diversity
The calculated rarefaction curves for all taxa investigated approached a plateau, indicating that the reads gave an accurate representation of the local sequence diversity (Fig. 2). A total of 594,009 reads were generated, of which 111,760 (19%) remained after quality filtering. Considering each marker separately, for 16S (Bacteria) a total of 433,579 DNA reads was generated and 82,306 reads (19%) remained after quality filtering. For ITS (Fungi and plants) a total of 160,430 reads was generated and 29,454 reads (18%) remained after quality filtering. Sequences from two prokaryotic domains (Bacteria and Archaea) were detected, representing 25 taxa in four phyla. Sequences representative of five Eukaryota kingdoms were detected, of which Fungi included seven phyla and 255 taxa, Chromista one phylum and 28 taxa, Protozoa three phyla and four taxa, Metazoa three phyla and four taxa and Viridiplantae four phyla and nine taxa (Suppl. Table 1). In total, sequences of 294 taxa were detected, of which 104 were shared between both sampling locations (Fig. 3).
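The read counts reported above can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable and function names are illustrative choices.

```python
# (raw reads, reads remaining after quality filtering) per marker, from the text.
reads = {
    "16S": (433_579, 82_306),
    "ITS": (160_430, 29_454),
}


def retained_percent(raw, kept):
    """Percentage of raw reads that survived quality filtering."""
    return 100.0 * kept / raw


total_raw = sum(raw for raw, _ in reads.values())
total_kept = sum(kept for _, kept in reads.values())

for marker, (raw, kept) in reads.items():
    print(f"{marker}: {retained_percent(raw, kept):.1f}% retained")
print(f"All markers: {total_kept} of {total_raw} reads "
      f"({retained_percent(total_raw, total_kept):.1f}%) retained")
```

The per-marker counts sum to the reported totals, and the rounded percentages agree with the 19%, 18% and 19% quoted in the text.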
Relative abundances are shown in Fig. 4.

Bacteria and Archaea
The analysis showed the presence of 116 taxa belonging to the domain Bacteria and one taxon belonging to the domain Archaea. The latter belonged to the class Thermoplasmata (Figures 5, 6).
(Basidiomycota) and Mortierella humilis (Mortierellomycota) were the most abundant fungal taxa detected. However, the majority of the sampled fungal communities comprised intermediate and rare taxa. Fifty-three (21.2%) of the ASVs could be identified only to higher taxonomic levels (kingdom, phylum, class, order, family) and may therefore represent taxa not present in the databases consulted or currently undescribed species (Suppl. Table 1).
The fungal assemblages of the three soils from giant fern forest differed (Figures 8-10). Figure 12 shows the relative abundance of the sequences obtained.

Viridiplantae
Sequences of four phyla were detected: Anthophyta, Bryophyta, Chlorophyta and Monilophyta. Among the Anthophyta two species were found, Begonia cathayana, present at PD5 and PF7 (21.2% of total Viridiplantae reads), and Vachellia gummifera, present at PD6 (3.6% of total Viridiplantae reads). The sole Bryophyta detected, Campylopus oerstedianus, was present only at PD6 (11% of total Viridiplantae reads). The most diverse group was Chlorophyta with six taxa: Family Chlamydomonadaceae, present only at PD6 (8.7% of total Viridiplantae reads), Asterochloris sp., present only at PD6 (6.3% of total Viridiplantae reads), Chlamydomonas sp., present only at PD5 (3.4% of total Viridiplantae reads), Chloromonas sp., present only at PD6 (0.7% of total Viridiplantae reads), Eremochloris sphaerica, present only at PD6 (10.4% of total Viridiplantae reads) and Trebouxia sp., present at PD5 (2.2% of total Viridiplantae reads). The sole fern detected was Alsophila gigantea, which was found at PD5 and PF7 (24% of total Viridiplantae reads). Figure 13 shows the relative abundances of the sequences detected. At Desejado Hill a total of 260 taxa were detected, with a lower number of 138 at Fazendinha, although a high number of taxa occurred at both sampling locations (Figure 14).
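The per-location totals given here can be reconciled with the overall and shared counts reported in the Diversity section (294 taxa in total, 104 shared) by inclusion-exclusion. The short check below uses only numbers stated in the text; the variable names are illustrative.

```python
# Taxon counts stated in the text.
desejado = 260      # taxa detected at Desejado Hill
fazendinha = 138    # taxa detected at Fazendinha
shared = 104        # taxa detected at both locations

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
total_taxa = desejado + fazendinha - shared
only_desejado = desejado - shared
only_fazendinha = fazendinha - shared

print(total_taxa, only_desejado, only_fazendinha)
```

The result matches the 294 taxa reported overall and implies that most of the taxa unique to a single location were found at Desejado Hill.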

Discussion
Our DNA sequence data point to the presence of diverse and abundant assemblages of organisms previously not recorded on Trindade Island. The presence of some organisms may indicate the consequences of human influence while others are consistent with the known local diversity. However, we also recognize that the detection of fragments of an organism's DNA does not confirm the presence of living organisms or viable propagules, as the finding can be related to encysted forms, spores, pollen or even single cells, while sometimes dead tissue can also provide detectable DNA. Sequence assignment also relies on the quality and completeness of data available in existing sequence databases.

Bacteria and Archaea
The present study detected few sequences representing the domain Archaea in samples PD5 and PF7, belonging to the class Thermoplasmata, phylum Euryarchaeota. Recent studies have shown that the relative abundance of these Archaea among all archaeal sequences in soil can be higher than previously reported [63]. Thermoplasmatales is the order of Archaea with most organisms in culture and the only one validated under the rules of the International Code of Nomenclature of Bacteria. They comprise extreme acidophiles that lack a cell wall [64] and are usually found in volcanic continental areas; they derive their energy from aerobic respiration but can grow anaerobically using iron or sulfur in their metabolism, which contributes to their prevalence in acid mine drainage communities [65]. The sequences identified here could not be classified at the order level, and the physiology of these microbes remains unknown.
The bacterial community was dominated by Actinobacteria, Proteobacteria, Acidobacteria, Chloroflexi and Verrucomicrobia. This dominance is generally typical of soil bacterial communities [66].
Among the representatives of the phylum Actinobacteria, the class Acidimicrobiia stands out, as it is typically not as dominant as the classes Actinobacteria and Thermoleophilia in soils. Most cultured members of the class Acidimicrobiia are extreme acidophiles, originally obtained from geothermal or iron-rich mining sites and belonging to the family Acidimicrobiaceae [67]. However, these taxa were not detected in the current study and most sequences belonging to this group could not be further classified, indicating that they belong to unknown groups. The bacterium IMCC26256 was the most abundant identified sequence of Acidimicrobiia found in the Trindade Island samples; this taxon was first described in a freshwater study and forms a separate branch within the class [67]. Among the genera identified here were Luedemannella and Mycobacterium, taxa found in soil, and Acidothermus, found in acidic geothermal springs [67].
Only the classes Alphaproteobacteria and Gammaproteobacteria were identified in Trindade Island soil samples. However, it should be noted that the classification system used in the present study combines the class formerly known as Betaproteobacteria with the Gammaproteobacteria. Members of the Alphaproteobacteria are metabolically very diverse and include nitrogen-fixing bacteria [68]. Although the majority of the sequences assigned here belong to the order Rhizobiales, none of the identified taxa are nitrogen-fixing; rather, they are genera found in soil and the rhizosphere. Sequences assigned to the orders Elsterales and Micropepsales were also abundant, but there are no cultured representatives of these orders and their physiology is unknown. Members of the Gammaproteobacteria mostly represented the orders Burkholderiales and Pseudomonadales, groups that are common in soils. All three soil samples examined here contained high numbers of the genus Acidibacter, an acidophile known from iron-rich mine sites [69].
The Acidobacteria sequences assigned in this study belong to taxa usually found in acidic environments, such as members of the class Acidobacteriia (former subgroups 1,2,3,5,11,12,13,14,15,24) [70]. Trindade Island soil samples included members of the class Acidobacteriia along with the Class Vicinamibacteria, which are heterotrophic aerobic bacteria, and class Holophagae, an anaerobic group. Their presence is consistent with bacterial communities from acidic, iron-rich soils, and possibly associated with geothermal sources. Many of the soil bacterial taxa assigned here are unknown and further research is necessary to understand their roles in Trindade Island soils.

Fungi
The most dominant sequences assigned in the giant fern forest soils were Sclerotiniaceae sp., Antarctomyces psychrotrophicus, Pseudogymnoascus sp., Apiotrichum sp. and Mortierella humilis, which represent fungi from different phyla. The family Sclerotiniaceae includes 47 genera and 284 species, many of which are pathogenic or saprophytic taxa that are able to infect various plant species and tissues. These fungi are characterized by the formation of sclerotia and stalked apothecia located within the colonized host plant tissue [61]. The genus Antarctomyces includes only two known species, A. psychrotrophicus and A. pellizariae, which were originally described from Antarctica. Antarctomyces psychrotrophicus was originally described in soil samples from King George Island [71] and later reported from other Antarctic habitats [72] but not elsewhere; this is the first record outside Antarctica. Pseudogymnoascus (anamorphic form-genus Geomyces) has a wide distribution globally [73,74] and has been reported in soils from Arctic, alpine, temperate, and Antarctic regions [75,72]. Pseudogymnoascus taxa are capable of colonizing and utilizing different carbon sources and can be particularly abundant at lower temperatures [76]. Pseudogymnoascus has received attention due to the pathogenic species P. destructans, the causative agent of white-nose syndrome (WNS) in bats in temperate regions [77].

Metazoa
Desoria is a genus of about 100 species, with a wide distribution in the Northern Hemisphere, including in glacial-influenced habitats, while some species have an affinity for anthropogenically disturbed areas. Having limited dispersal abilities, the genus may have been introduced to Trindade by humans, as is the case for D. trispinata, which is found in the Azores [90] and colonized the island in the 18th century. Lepidocyrtus is one of the largest collembolan genera worldwide [90].
Lepidocyrtus koreanus is an Asiatic species, but this specific assignment is likely an instance of database incompleteness. The presence of Collembola on Trindade Island was cited by Alves [6], but this may represent an error as the species cited is a true insect rather than a collembolan. However, it is also extremely unlikely that the island does not host a native, and likely highly endemic, collembolan community, as do other remote volcanic islands in the Atlantic [91,92,93,94]. Both of the taxa assigned here were obtained from the Desejado Hill sample (PD6) and have not been recorded from Trindade Island previously.
The nematode genus Heterocephalobellus includes only three species worldwide, all terrestrial, and was originally described from Brazil [95]. To date, only marine nematodes have been investigated on Trindade Island [96]. In the current study, assigned sequences were only obtained from Desejado Hill (PD5). The platyhelminth Rhynchoscolex simplex is a common species and has been recorded from São Paulo, mainland Brazil [97]. The assigned sequence was obtained only from Fazendinha (PF7). Both the nematode and the platyhelminth are new records for Trindade Island, but appear to be likely candidates for anthropogenic introduction. Again, the nematodes in particular are likely to have a diverse native community on Trindade Island, whose characterization will require both appropriate survey techniques and molecular probes.

Chromista and Protozoa
Most of the reads representing these groups were assigned only to kingdom rank, with the exception of Sellaphora pupula, a cosmopolitan freshwater diatom [98] known from regions including Brazil, Africa and the Azores. In Brazil this species is found in coastal regions and the Atlantic Rainforest [99]. The assignment here is the first record from Trindade Island, where it was found only at Desejado Hill (PD5).
Cercomonadida was found only at Desejado Hill (PD6) and is the second most commonly recorded zooflagellate in soils globally [100,101]. Within this group, Eocercomonas is a less known genus segregated from the widespread Cercomonas [101]. Eocercomonas echina is a freshwater and soil species previously reported only from the United Kingdom and South Korea [102] and was found here only at Fazendinha (PF7). Vahlkampfia is a poorly known genus including about 12 species that is abundant worldwide in a wide variety of aquatic and terrestrial habitats [103] and is potentially pathogenic for humans [104]. It was found at both Desejado Hill (PD6) and Fazendinha (PF7). Thaumatomonas is another poorly known Northern Hemisphere genus commonly found in lake sediments [105], and was assigned here only at Fazendinha (PF7). None of the protist sequences assigned in this study have previously been reported from Trindade Island.

Viridiplantae
The four green algae (Chlorophyta) reported have not previously been recorded from Trindade Island [98]. Eremochloris sphaerica is a North American species common in brackish waters. Trebouxia is a cosmopolitan and widespread genus including more than 40 species [98] and has been found in almost every environmental condition including terrestrial and aerial. It is commonly found on tree bark in humid forests and is also a common photobiont of lichens [106]. Chlamydomonas is a genus including more than 200 species and is widely distributed in both fresh and sea water as well as in soil and snow [98]. Asterochloris is a genus with more than 19 cosmopolitan species and is one of the most common lichen photobionts [98]. All the green algae recorded here were present only at Desejado Hill (PD5 and PD6).
The moss Campylopus oerstedianus (Bryophyta) is a Central and North American species not previously reported from Trindade Island. Faria et al. [11] recorded two other species in this genus from Trindade Island, with further discussion of the identity of these representatives given by Gama et al. [20]. We recorded it only at Desejado Hill.
The fern Alsophila gigantea (=Cyathea gigantea) is an Asiatic species. Alves et al. [21] reported spores of a different species of Cyathea in air samples obtained on the island. However, the assignment generated in the current study is likely to reflect a lack of completeness in the available database, and most likely refers to C. delgadii, the single and dominant tree fern on the island. It is important to note that GenBank contains only 8 ITS sequences of Cyathea, not including C. delgadii, and that some of these sequences are very short (less than 300 bp). This further reinforces the importance of well curated and complete databases being available to support metabarcoding studies. Fern sequences were obtained from both sampling locations, Desejado Hill (PD5) and Fazendinha (PF7).
The two flowering plants reported here have not previously been reported from Trindade. Begonia cathayana (Begoniaceae) is an ornamental Chinese species; however, other members of the genus are common in the Brazilian Atlantic Rainforest [107]. Begonia propagules could reach the island by means of aerial transfer or with human assistance. In the current study the sequences were detected at both Desejado Hill (PD5) and Fazendinha (PF7). Vachellia gummifera (Fabaceae) is a leguminous plant native to Morocco and the Mediterranean Sahara. Again, some related species, V. caven, V. farnesiana, V. ibirocayensis and V. seyal, are native to Brazil, where they occur in coastal regions [108,109]. The sequence was identified only at Fazendinha (PF7). It is important to note that, over time, many species have been introduced to Trindade Island with human assistance [6]. The number of these still present is not clear, but there remains the possibility that their DNA could persist and be detected using metabarcoding approaches.

Conclusions
Despite the history of considerable anthropogenic impact and devastation of pre-existing native ecosystems, the application of DNA metabarcoding revealed sequences indicating a rich, diverse, and complex assemblage of microorganisms, invertebrates and plants in soils from the native fern forests of Trindade Island. A large majority of the identified diversity represented bacterial and fungal groups, and the use of further target genes in future studies is likely to identify further diversity of soil invertebrates and plants. Assignment of putative identities based on sampling of environmental DNA does not, however, confirm the presence of active or viable organisms or their propagules, and the precision of identifications relies heavily on the quality and completeness of available databases. Further studies will be required to better understand the importance of these complex biological webs for forest soil health, conservation management and habitat restoration on Trindade Island. The study also highlights the potential for further application of metabarcoding approaches to elucidate the currently less-known elements of microbial, invertebrate, and lower plant diversity of other remote and often hard to access mid- and South Atlantic islands, many of which share the volcanic history of Trindade Island, such as the Cape Verde Islands, St. Helena, the Tristan da Cunha archipelago, and South Georgia.

Declarations
Figure 1
Location of Trindade Island relative to Brazil in the South Atlantic Ocean (A), sampled points (B; upper red rectangle) in Fazendinha (C) and on Desejado Hill (D), and the soil profiles PD5 (E), PD6 (F), and PF7 (G). Modified from Google®.

Relative abundance of taxa found in both sampling locations. Taxa referred to as 'Other' include Protozoa, Chromista, Viridiplantae and Metazoa.

Relative abundances of the most represented fungal sequences found in Trindade Island soil samples obtained in 2018. ITS sequences were clustered into OTUs by setting a 0.03 distance limit. Taxonomic assignments were determined using the following databases: UNITE Fungal ITS database 8.2, UNITE Eukaryotes ITS database 8.2, and NCBI.

Figure 11
Relative abundances of the metazoan sequences detected in Trindade Island soil samples obtained in 2018. ITS sequences were clustered into OTUs by setting a 0.03 distance limit. Taxonomic assignments were determined using the following databases: UNITE Fungal ITS database 8.2, UNITE Eukaryotes ITS database 8.2, and NCBI.

Figure 12
Relative abundances of Protista sequences detected in Trindade Island soil samples obtained in 2018. ITS sequences were clustered into OTUs by setting a 0.03 distance limit. Taxonomic assignments were determined using the following databases: UNITE Fungal ITS database 8.2, UNITE Eukaryotes ITS database 8.2, and NCBI.