Amplicon-Based Analysis of the Fungal Diversity Across Kenyan Soda Lakes

Background: Microorganisms have been able to colonize and thrive in environments characterized by low or high pH, temperature, salt concentration or pressure. Soda lakes and soda deserts are examples of such extreme environments. The objective of this study was to explore the fungal diversity across the soda lakes Magadi, Elmenteita, Sonachi and Bogoria in Kenya. A new set of primers was designed to amplify a fragment long enough for 454 pyrosequencing technology. Results: Analysis of the amplicons generated showed that the new primers amplified eukaryotic groups. A total of 153,634 quality-filtered, non-chimeric sequences were used for community diversity analysis. The sequence reads were clustered into 502 operational taxonomic units (OTUs) at 97% similarity using BLASTn analysis, of which 432 were affiliated to known fungal phylotypes and the rest to other eukaryotes. Fungal OTUs were distributed across 107 genera affiliated to Ascomycota, Basidiomycota, Glomeromycotina and Incertae Sedis. The phylum Ascomycota was the most abundant phylotype. Overall, fifteen (15) genera (Chaetomium, Monodictys, Arthrinium, Cladosporium, Fusarium, Myrothecium, Phyllosticta, Coniochaeta, Diatrype, Sarocladium, Sclerotinia, Aspergillus, Preussia and Eutypa) accounted for 65.3% of all the reads. The genus Cladosporium was detected across all the samples at varying percentages, with the highest in water from Lake Bogoria (51.4%). Good's coverage estimator values ranged between 97 and 100%, an indication that the dominant phylotypes were represented in the data. These results provide useful insights that can guide cultivation-dependent studies aimed at understanding the physiology and biochemistry of the as yet uncultured taxa.


Introduction
Microorganisms not only colonize but also thrive under extreme environmental conditions characterized by low or high pH, temperature, salt concentration or pressure. Soda lakes are examples of such environments: they are highly alkaline (pH 9-12), and Na+ concentrations can reach saturation. Their surface area fluctuates because of extensive evaporation driven by the intense sunlight and low precipitation where they are located. Despite the extreme physicochemical conditions in soda lake ecosystems, a high level of species diversity has been reported (Lanzen et … Gunde-Cimerman et al., 2009; …). Different genera, including Cladosporium, Aspergillus, Penicillium, Alternaria and Acremonium sp., have been reported to exist as either moderately or weakly alkali-tolerant species in saline environments (Grum-Grzhimaylo et al., 2013b). Isolates affiliated to Chaetomium aureum, C. flavigenum, Emericella nidulans and Eurotium amstelodami have previously been recovered from the Dead Sea (Buchalo et al., 2000). Orwa et al. (2020) describe isolates from Lake Magadi in Kenya spread over 18 fungal genera, namely Aspergillus, Penicillium, Acremonium, Phoma, Cladosporium, Septoriella, Talaromyces, Zasmidium, Chaetomium, Aniptodera, Pyrenochaeta, Septoria, Juncaceicola, Paradendryphiella, Sarocladium, Phaeosphaeria, Juncaceicola and Biatriospora. Other reports include Chaetomium globosum from the Dead Sea as well as saline habitats of Wadi El-Natrun (Perl et al., 2018), and Sarocladium kiliense from Lake Sonachi in Kenya (Ndwigah, 2017).
High-throughput sequencing allows rapid estimation and identification of microorganisms without cultivation (Tedersoo et al., 2015). Using this approach, high prokaryotic and eukaryotic diversity has been reported from several alkaline lakes such as Magadi in Kenya (Salano et … sediments from the Tibetan Plateau (Xiong et al., 2012). A sequence-based approach has therefore made it easier to understand the diversity and structure of microbial communities in diverse environments (Han et al., 2017; Valenzuela-Encinas et al., 2008). Most next-generation sequencing technologies used in diversity studies include an amplification step. The earliest polymerase chain reaction (PCR) primers to gain wide acceptance in fungal studies were those for the internal transcribed spacer (ITS) described by White et al. (1990) …). However, coverage, as well as phylogenetic resolution at lower taxonomic levels, remains a challenge, especially in less-explored habitats. In this study, we designed a new set of primers for next-generation sequencing and tested them on samples collected from different soda lakes in Kenya. The main objective was to explore whether fungal diversity varies between lakes Magadi, Elmenteita, Sonachi and Bogoria, and within each lake, owing to differences in physicochemical parameters. The study provides new insights into spatial diversity across soda lakes with varying physicochemical parameters.

Materials and Methods
Description of study sites and sampling design
The study sites were lakes Magadi, Elmenteita, Bogoria and Sonachi. The hypersaline Lake Magadi (2°00'S, 36°13'E) lies at an elevation of 600 m above sea level in a naturally formed closed lake basin and receives an annual rainfall of approximately 500 mm (Behr and Röhricht, 2000). The lake covers an area of 90 km², and evaporation is intense during the dry season. Lake Elmenteita (0°27'S, 36°15'E) is a moderately saline lake located 1776 m above sea level and has no direct outlet. The lake covers approximately 20 km², but its total surface area changes with the seasons, and it often floods during heavy rains. Lake Bogoria (0°20'N, 36°15'E) lies at an altitude of 975 m, receives low rainfall of 708 mm and has several geysers around the lake. The alkaline, saline crater lake Sonachi lies in a closed basin in the Eastern Rift Valley (0°49'S, 36°16'E).

Sample collection and nucleic acid extraction
Wet sediment, water samples, microbial mats, dry sediments and grassland soil were collected from lakes Bogoria, Elmenteita, Sonachi and Magadi as described previously (Orwa et al., 2020). One gram of each soil or sediment sample was weighed into a sterile Eppendorf tube. For the water samples, 500 ml was filtered through a 0.22 µm filter, which was then cut into small pieces with a sterile scalpel and transferred to a sterile 2 ml tube. Total community deoxyribonucleic acid (DNA) was extracted using a phenol:chloroform protocol modified from Yeates et al. (1998), in which proteinase K was substituted with 6 M guanidine isothiocyanate (GITC) for protein denaturation. In our experience, extraction of high molecular weight DNA from soda lake samples using kits is problematic because of the high salt content of the samples.
… (Larkin et al., 2007). A consensus sequence was generated using Jalview (Waterhouse et al., 2009) and used to design the reverse primer with the GeneFisher tool (Giegerich et al., 1996). Both primers were edited in Jalview (Waterhouse et al., 2009) and tested using the Probe Match tool from the Ribosomal Database Project (RDP) (Cole et al., 2007). The new primer pair amplifies a fragment of 712 bp covering the V3, V4, V5 and V6 regions (Neefs et al., 1990) as well as parts of the V2 and V7 regions of the 18S rRNA gene. The designed primers are Fung_576f (5'-GCTCGTAGTTGAACCTTTGG-3') and Fung_975r (5'-TCTGGACCTGGTGAGTTTC-3'). The primers were then modified for pyrosequencing by attaching an adaptor sequence, a key and a unique 12-nucleotide MID for multiplexing. Each PCR reaction (50 µL) contained forward and reverse primers (10 µM each), dNTPs (10 mM each), Phusion GC buffer (Finnzymes), Phusion high-fidelity polymerase (0.5 U/µL) and 25 ng of template DNA. Cycling conditions were: initial denaturation at 98°C for 3 min, followed by 25 cycles of denaturation at 94°C for 30 s, annealing at 58°C for 30 s and extension at 72°C for 90 s, and a final extension at 72°C for 5 min. Amplification was confirmed by separating 2 µl of the PCR product on a 1% agarose gel run for 1 h at 100 V. Three independent PCR products per sample were then pooled in equal amounts, separated on a gel and extracted using the PeqGOLD gel extraction kit (PEQLAB Biotechnologie GmbH, Erlangen, Germany). PCR products were quantified using a Nanodrop (PEQLAB Biotechnologie GmbH, Erlangen, Germany) and a Qubit fluorometer (Invitrogen GmbH, Karlsruhe, Germany) as recommended by the manufacturers. The PCR-derived amplicons were sequenced on a Roche GS-FLX 454 pyrosequencer using Titanium chemistry (Roche, Mannheim, Germany). The raw sequence reads have been deposited in the SRA under accession SRP019052.
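As a quick sanity check on the primer pair reported above, the following sketch computes the length, GC content, approximate melting temperature and reverse complement of Fung_576f and Fung_975r. It is illustrative only: the helper names are hypothetical, and the Wallace rule (2°C per A/T, 4°C per G/C) is a rough approximation assumed here for short oligonucleotides, not the procedure used in the study to choose the annealing temperature.

```python
# Illustrative check of the two primers reported in the text.
# Helper names are hypothetical; the Wallace rule is an assumed approximation.

PRIMERS = {
    "Fung_576f": "GCTCGTAGTTGAACCTTTGG",
    "Fung_975r": "TCTGGACCTGGTGAGTTTC",
}

# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature (deg C) using the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 3),
              wallace_tm(seq), reverse_complement(seq))
```

Under this approximation the forward primer evaluates to 60°C and the reverse primer to 58°C. More accurate estimates would require a nearest-neighbour model and the actual salt and primer concentrations.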

Sequence analysis
Sequence reads were denoised and evaluated for potential chimeric sequences using UCHIME within the USEARCH package v.5.1 (Edgar et al., 2011). OTU picking was done from the quality-filtered, denoised, non-chimeric sequences using a sequence identity cut-off of 97%. Representative OTUs were picked using USEARCH v. 5

Evaluation of the new primer set
The newly designed primers amplified eukaryotic groups only; no bacterial sequences were detected in this study. In each sample, the success rate for amplifying fungal groups was above 90%, a good performance for environmental DNA. The amplicons could also be assigned to taxonomy with high confidence owing to the sequence length generated by the 454 technology. In addition, because the samples ranged from dry sediments to microbial mats, good-quality DNA was key to amplification.
Sequence data
The clean data from 32 samples comprised 153,634 quality-filtered, denoised and non-chimeric sequences with no singletons. Based on BLASTn analysis, the sequences were clustered into 502 OTUs at 97% similarity, of which 432 were affiliated to known fungal phylotypes. Fungal OTUs per sample ranged from 13 in the Bogoria wet sediments (sample BWS10) to 68 in the dry sediments from Lake Sonachi (sample BDS10), as shown in Table 1.
Diversity at the phylum level
The 432 fungal OTUs were distributed across 107 fungal genera affiliated to Ascomycota, Basidiomycota, Glomeromycotina and Incertae Sedis. Three percent of the sequences were clustered as unclassified fungal groups. The phylum Ascomycota was the most dominant phylotype, with the orders Capnodiales, Pleosporales, Hypocreales, Myrmecridiales, Sordariales and Xylariales being the most abundant. In this phylum, the order Pleosporales was the most diverse with 13 genera, followed by the orders Hypocreales (11), Xylariales (11) and Capnodiales (9). Whereas the Ascomycota were the most abundant across the samples, we noted that in Elmenteita wet sediments the phylum Bacillariophytina accounted for 45% of the OTUs (Figure 1).
Sequences affiliated to Basidiomycota were detected in a few of the samples and were distributed in the orders Agaricales, Boletales, Polyporales, Cystofilobasidiales, Filobasidiales, Sporidiobolales and Malasseziales. The phylum Glomeromycotina was represented by a single genus, Allomyces (species Allomyces macrogynus), at 0.2% abundance in the Sonachi wet sediments. Overall, we identified fifteen (15) genera that constituted 65.3% of all the reads: Chaetomium, Monodictys, Arthrinium, Cladosporium, Fusarium, Myrothecium, Phyllosticta, Coniochaeta, Diatrype, Sarocladium, Sclerotinia, Aspergillus, Preussia and Eutypa. A further 20.2% of the OTUs, which were affiliated to the order Pleosporales, could not be identified below the order level. The genus Cladosporium was detected across all the samples at varying percentages, with the highest in water from Lake Bogoria (51.4%).

Diversity across the different samples
We compared diversity across the 32 samples using several alpha diversity metrics (richness, Simpson, Shannon, evenness, Fisher and Good's coverage). Good's coverage estimator values were between 97 and 100% (Table 1), an indication that the dominant phylotypes were represented in the data. This was even more evident when the samples were clustered by sample type (P = 0.05). The lowest diversity was in the brine samples and the highest in the sediment samples (Fig. 3).
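To make the metrics named above concrete, the sketch below computes richness, Shannon, Simpson, evenness and Good's coverage from the per-OTU read counts of a single sample. It is a minimal illustration under stated assumptions, not the pipeline used in this study: the function name is hypothetical, Shannon uses the natural logarithm, Simpson is taken as 1 - Σp², evenness is Pielou's J, and Fisher's alpha is omitted because it requires an iterative solution.

```python
import math
from collections.abc import Sequence


def alpha_diversity(counts: Sequence[int]) -> dict[str, float]:
    """Compute basic alpha diversity metrics from per-OTU read counts.

    Assumed definitions (not confirmed by the source):
      shannon  = -sum(p * ln p)
      simpson  = 1 - sum(p^2)
      evenness = shannon / ln(richness)   (Pielou's J)
      goods    = 1 - singletons / total reads
    """
    observed = [c for c in counts if c > 0]  # ignore OTUs absent from the sample
    total = sum(observed)
    if total == 0:
        raise ValueError("sample contains no reads")

    richness = len(observed)
    props = [c / total for c in observed]

    shannon = -sum(p * math.log(p) for p in props)
    simpson = 1.0 - sum(p * p for p in props)
    # Evenness is undefined for a single OTU; report 0.0 in that case.
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    singletons = sum(1 for c in observed if c == 1)
    goods = 1.0 - singletons / total

    return {
        "richness": float(richness),
        "shannon": shannon,
        "simpson": simpson,
        "evenness": evenness,
        "goods_coverage": goods,
    }


if __name__ == "__main__":
    # Hypothetical toy sample: four OTUs, two of them singletons.
    print(alpha_diversity([5, 1, 1, 3]))
```

A Good's coverage close to 1 means that few reads belong to OTUs seen only once, which is the sense in which high coverage suggests the dominant phylotypes were sampled.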
Differences in fungal diversity across the lakes
Bray-Curtis dissimilarity analysis (Fig. 4a) separated the samples into three clusters.
The samples from Lake Bogoria formed a distinct cluster, which could be due to differences in OTU composition. For example, no OTUs related to the genus Cladosporium were detected in samples from Lake Bogoria, whereas the genera Myrothecium, Sclerotinia, Lasiodiplodia and Peziza were identified only in Lake Bogoria samples (Supplementary Table S1).
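For readers unfamiliar with the measure, the Bray-Curtis dissimilarity between two samples is the sum of absolute differences in OTU abundance divided by the total abundance in both samples, giving 0 for identical profiles and 1 for samples that share no OTUs. The sketch below shows this calculation for pairs of samples. The function names and toy data are hypothetical, and whether counts were normalized before this step in the study is not stated, so raw counts are assumed here.

```python
from collections.abc import Sequence


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two OTU abundance vectors.

    Both vectors must list the same OTUs in the same order.
    """
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def pairwise_bray_curtis(
    samples: dict[str, Sequence[float]],
) -> dict[tuple[str, str], float]:
    """Return the dissimilarity for every unordered pair of samples."""
    names = sorted(samples)
    return {
        (p, q): bray_curtis(samples[p], samples[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


if __name__ == "__main__":
    # Hypothetical toy OTU table: three samples, four OTUs each.
    toy = {
        "sample_A": [10, 0, 5, 5],
        "sample_B": [8, 2, 5, 5],
        "sample_C": [0, 20, 0, 0],
    }
    for pair, value in pairwise_bray_curtis(toy).items():
        print(pair, round(value, 3))
```

The resulting pairwise values form the distance matrix that a clustering or ordination method would then take as input.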

Discussion
The overall diversity and significance of fungal communities in the soda lakes are not well understood owing to the limited data available compared with bacteria. The Kenyan soda lakes are in geographically remote areas that experience intense solar radiation; evaporation rates exceed precipitation rates, so salts become concentrated and salinity is elevated. This may be one of the reasons why they are not well explored. The diversity reported so far has been based on culture-dependent studies (Orwa et … ). This necessitates more isolation efforts in order to understand the physiology and metabolism of these novel groups.
In this study, the phylum Ascomycota accounted for more than 80% of the reads across the samples, with the most abundant class being Dothideomycetes, followed by Sordariomycetes, Leotiomycetes, Eurotiomycetes and Pezizomycetes. Sharma et al. (2016) reported that 98% of the isolates recovered from Lonar Lake belonged to Ascomycota, subphylum Pezizomycotina. The Ascomycetes have also been reported to be dominant in marine sediments of Kongsfjorden, Svalbard (Zhang et al., 2015), constituting 54.8% of the OTUs. In marine sediments of the Arabian Sea, Ascomycota were reported to be the most abundant phylum at 83%, and the rest (17%) were Zygomycota (Soumya et al., 2013). However, Basidiomycota have been reported to be dominant in other hypersaline environments (Singh et al., 2011; Bass et al., 2007). The genus Cladosporium was detected across all the samples, with the highest relative abundance in water from Lake Bogoria (51.4%). Some of the phylotypes (Cladosporium sphaerospermum, Fusarium sp. and Penicillium sp.) have also been observed in marine sediments (Zhang et al., 2013; Soumya et al., 2013; Samuel et al., 2011). Chaetomium globosum has been isolated from the Dead Sea as well as saline habitats of Wadi El-Natrun (Perl et al., 2018), while isolates with close similarity to Sarocladium kiliense were recovered from Lake Sonachi sediments (Ndwigah, 2017). Salinity as well as pH affects fungal growth and spore formation, which in turn may affect overall diversity (Grum-Grzhimaylo et al., 2016). Production of extremolytes and extremozymes, or accumulation of K+ ions and compatible organic solutes in the cells, are ways of coping with osmotic stress (Raddadi et al., 2018; Plemenitaš et al., 2014; Roberts, 2005). Unique features such as the thick mycelium observed in Phoma herbarum are important in stress tolerance, while pigments such as those produced by Zasmidium cellare, Aspergillus keveii and Cladosporium velox enable them to thrive in harsh environments (Liu et al., 2018; Orwa et al., 2020).
A key ecological question is whether the observed fungal groups originate from the terrestrial environment via run-off or are actual residents exclusive to the soda lake habitats. It is possible that run-off from the surrounding soil introduces spores into the lakes, which over time adapt to the haloalkaline environment. A previous culture-dependent study on Lake Magadi (Orwa et al., 2020) recovered isolates spread over fungal genera, namely Aspergillus, Penicillium, Acremonium, Phoma, Cladosporium, Septoriella, Talaromyces, Zasmidium, Chaetomium, Aniptodera, Pyrenochaeta, Septoria, Juncaceicola, Paradendryphiella, Sarocladium, Phaeosphaeria, Juncaceicola and Biatriospora. These isolates grew better when lake water was used in media preparation than on synthetic mineral medium. Pawar and Thirumalachar (1966) observed differential growth between marine and terrestrial organisms of the same species: those of marine origin grew better on seawater agar, whereas the terrestrial isolates grew better on non-seawater agar. In summary, the diversity and function of most fungal taxa in the soda lake ecosystem remain poorly understood; a combination of traditional culture-based methods and metatranscriptomics may therefore help to answer important ecological questions.

Conclusions
The findings from this study indicate that the soda lake habitats host diverse fungal communities that have adapted to the extreme physicochemical conditions in the lakes. This is supported by the occurrence of the same taxa in lakes that are geographically separated. The findings will be a valuable resource to guide culture-dependent studies, while the data will be valuable for comparative studies.

Figure caption: Bray-Curtis dissimilarity analysis showing sample clustering