Unveiling of Unique Microbiome Resource Having High Antimicrobial Peptide Activity Endowed With Agriculture and Industrial Applications From Pukzing Cave

Mir Asif Iquebal Indian Agricultural Statistics Research Institute Ajit Kumar Passari Mizoram University Jaisri Jagannadham Indian Agricultural Statistics Research Institute Farzana Ahmad Mizoram University Department of Biotechnology Vincent Vineeth Leo Mizoram University Department of Biotechnology Garima Singh Mizoram University Pachhunga University College Sarika Jaiswal Indian Agricultural Statistics Research Institute Anil Rai Indian Agricultural Statistics Research Institute Dinesh Kumar (  dinesh.kumar@icar.gov.in ) Indian Agricultural Statistics Research Institute https://orcid.org/0000-0001-5466-0686 Bhim Pratap Singh Mizoram University Department of Biotechnology


Introduction
Cave ecosystem is one of the poorly explored ecosystems on earth and is considered as an extreme environment, not appropriate for the development of life due to extreme abiotic conditions (Karolina and Urszula, 2016;Yasir, 2018). Though, majority of caves possess oligotrophic ecosystem with less than 2 mg of total organic carbon (TOC)/liter, availability of very less light, low temperature and high humidity, the cave environment is considered as speci c niche for specialized group of microorganisms including endemic as well as extremophiles (Engel et al., 2009;Schabereiter-Gurtner et al., 2002). Being oligotrophic conditions, the microbial population associated with caves is 10 6 cells/g of rock .
Studies predicted the potential of cave microbial ora and their role in geological processes with signi cant potential to produce bioactive molecules of biotechnological importance (Yasir, 2018). At the same time, the cave associated microbial communities also in uence the formation and preservation of cave deposits and serve as primary producers for the complex organisms (Barton and Northup, 2007).
Microbial interactions with their environment also play an important role in determining the shape and help in the deposit of wall deposits like stalactite (hanging rock from the roof of the cave), stalagmite (a shape arises from the oor of the cave), etc. (Banerjee and Joshi, 2014).
Exploration of caves is been gaining interest from the last two decades and several unique microbial strains having secondary metabolites production potential have been reported (Yasir, 2018;Cuezva et al., 2012;Portillo et al., 2008). Caves are located worldwide, and the available literature suggests the existence of noble genera in the caves located in Mexico (Quintana et al., 2013), Northern Thailand  and Spain (Jurado et al., 2005 and2006).
Although it is predicted that the microorganisms associated with caves could play an important contribution in understanding cave ecology, mineral formation and ecosystem bioenergetics Chimienti et al., 2016) Cave microbiology from Mizoram, India and its importance is still scantly reported and has given very less attention . Among the several caves located in Mizoram, Northeast India, Pukzing cave is recorded as the largest cave of the state with stretches of 25 meters which is located in Pukzing village near Marpara in the western hills of Aizawl district (http://www.incredible-northeastindia.com/mizoram/caves.html). So far, no systematic study is been carried out to understand the microbial population associated with one of the largest caves of Mizoram, Northeast India. This study was carried out for the rst time to determine the microbial population of this cave using culture-dependent and culture-independent methods. The present study was planned to have a clear-cut estimation of the bacterial population using the metagenomics approach as the culturedependent techniques leads to a high underestimation of microbial biodiversity due to the inability to grow most of the organisms on nutritional media (Torsvik and Overeas, 2002). As we know that the advancement in sequencing technologies and the use of high-throughput sequencing technologies has made the studies of microbial diversity easier and more informative. However, using metagenomics approaches to study the microbial diversity of any location has chances to lose some low abundant taxa as stated earlier (Stefani et al., 2015). Hence, the present study targeted both culture-based and cultureindependent methods to gain the most appropriate representation of the microbial population associated with the studies cave. The bacterial isolates obtained through culture-dependent methods in this study were screened for their antibacterial activities and the biosynthetic modular genes (PKS I, PKSII and NRPS) were detected in the potential isolates. This study indicated that the use of V4 region in cultureindependent method is more appropriate as compared to V3 as the V4 region has given more diverse group of bacterial population and also subsystem annotation of V4 region predicted two functions associated with amino acid and its derivatives and cofactors, vitamins, prosthetic groups, pigments functions which was not shown in case of V3 region.

Samples collection and physiochemical analysis
The fresh cave sediment sample (approximately 150g) were collected randomly from four different places within the Pukzing cave, situated at Pukzing village near Marpara in the western hills of Mamit District of Mizoram (23°21'44.2"N 92°25'53.6"E). The Pukzing cave which is the biggest cave of the state of Mizoram is about 25 m wide and is located at around 2100 m above sea level (Ray, 1993; Sati and Vangchhia, 2016). The collected samples were then mixed together to make a composite sample, placed in sterile Himedia Polythene Bags, brought into the laboratory and stored at 4°C in the refrigerator until use for the culturable study. While for the non culturable metagenomic DNA extraction work the sediment, samples were stored at -80°C. The pH and temperature of the cave sediment sample was measured using pH meter thermometers respectively. Further, total carbon, total hydrogen and total nitrogen were determined using CHN analyzer at SAIC, Tezpur University.

Culturable bacterial diversity estimation
2.2.1 Isolation of bacteria from a cave sediment sample One gram of soil sample was taken and mixed with 10 ml of sterile distilled water. The mixture solution was then separated into normal or pretreated using either cold treatment (15ºC), and hot treatment (55ºC). After treatment, the sample was serially diluted up to 10 − 1 to 10 − 5 dilution. 100 µl of each dilution was taken from each treatment and spread on six nutritional media plates. The plates were incubated at 28°C and 37°C for 2-3 weeks to observe the colonies of bacteria. The obtained cultures were re-streaked in their respective media and puri ed cultures were maintained at 4°C.

Media Composition
The nutritional media was used as follows: 1. Luria Bertani Agar media (10 g of Peptone; 10 g of sodium chloride; 5 g of yeast extract and 20 g of agar); 2. Tryptic Soya Agar Media (17 g of tryptone; 3 g of soya peptone; 5 g of sodium chloride; 2.5 g of dipotassium hydrogen phosphate; 2.5 g of dextrose and 20 g of agar); 3. Starch Casein Agar (10 g of starch; 1 g of casein powder; 37 g of seawater and 20 g of agar); 4. Tyrosine Agar Media; 5. Tap Water Yeast Extract Agar Media (5 g of yeast extract; 2 g of dipotassium hydrogen phosphate and 20 g of agar) and 6. Glycerol Asparagine Agar media.

2.2.3
Genomic DNA extraction and PCR ampli cation using 16S rRNA gene Total genomic DNA of the culturable bacteria isolates was extracted by using the Genomic DNA puri cation kit (Invitrogen Life technologies). The purity of the obtain DNA (µg/ml) was veri ed using µ-Drop™ Plate (Thermo Scienti c™ Multiskan™ GO Spectrophotometer). PCR ampli cation of 16S rRNA gene was performed by using universal primers PA: 5'-AGA GTT TGA TCC TGG CTC AG-3') and PH: 5'-AAG GAG GTG ATC CAG CCG CA-3' (Qin et al., 2009). The PCR reaction mixture preparations and its process were carried out as denoted in Passari et al. (2017). The ampli ed PCR product was run on 1.5% of agarose gel and visualized under gel documentation system XR + (Bio-Rad). The ampli ed product was puri ed using Pure-link PCR Puri cation Kit (Invitrogen) and was sequenced commercially at Sci-Genome Labs Pvt. Ltd, India.

Phylogenetic analysis
The obtained sequences were trimmed using Finch TV 1.4.1 version and then compared with the NCBI database using the BlastN search program. After that, the sequences were aligned using the Clustal W software packaged in MEGA 5.05 (Thompson et al., 1997). The aligned sequences were used to select a phylogenetic model based on using BIC scores (Bayesian Information Criterion) and AICc value (Akaike Information Criterion, corrected) (Nei and Kumar, 2000). A maximum-likelihood tree was constructed using MEGA 6.0 with Jukes-Cunter model for actinobacteria; Kimura 3-parameter model for gram-positive bacteria and Tamura 3-parameter model for gram-negative bacteria (Kimura et al., 1980). The robustness of the phylogenetic tree was tested by bootstrap analysis (1,000 replicates) using p-distance model (Felsenstein 1985 for screening as they are well known for different disease related to agriculture, domestic animals and human health. The bacterial sample was inoculated in Tryptone yeast extract broth medium (ISP medium1: 5 g of Casein Enzymic hydrolysate; 3 g of Yeast Extract) and incubated at 28 0 C, 150 rpm for 7-10 days. Cells were harvested by centrifugation at 8,000 rpm and the supernatant was collected into a fresh tube for testing the antimicrobial activity by agar well diffusion method (Saadoun and Muhana, 2008). The test pathogenic microbes were inoculated on a modi ed nutrient agar plate (5 g of glucose; 5 g of peptone; 3 g of beef extract; 5 g of sodium chloride and 20 g of agar in one liter of sterile distilled water) and wells of 6 mm diameter were prepared by using sterile cork borer. In each of the plates, wells were lled with 50 µl of clear supernatant of various bacterial sample isolates (test) and the plates were incubated at 37 ± 2 0 C for 24 h. The antimicrobial data were analyzed in replicates (mean ± standard deviation of mean replicates) using Microsoft Excel XP 2007, while one-way ANOVA was used to determine the difference between antimicrobial activities among the bacterial isolates by using SPSS software version 20.0.
The potential isolates based on antimicrobial activity were selected to detect antimicrobial biosynthetic negative control reaction mixture without DNA template of actinomycetes was also included with each set of PCR reactions. The PCR products were visualized as stated above.

Non-Culturable bacterial diversity
2.3.1 Bacterial community pro ling using Illumina pairedend sequencing with V3 and V4 variable regions The total DNA of the composite sample mix collected from Pukzing Cave was obtained by Fast DNA spin kit (QIAGEN, USA) as per the protocol. Paired-end Illumina sequencing (250p x 2) of the variable regions V3 and V4 was carried out at SciGenome Pvt. Ltd., Cochin, Kerela, India. Initially pre-processing of pairend reads in each sample was done with the Fastq-join method (Aronesty, 2011) to lter out any unpaired reads with uncertain bases. Demultiplexing with quality Phred score of ≥ Q20 is performed to remove bases with poor quality sequences. The taxonomic analysis was carried using Quantitative Insights into Microbial Ecology (QIIME) analysis pipeline (Caporaso et al., 2010). UCLUST algorithm (Edgar, 2010) was implemented for mapping, processed reads into operational taxonomic units (OTU's) with a similarity threshold of 97%. Further aligning of representative sequences and taxonomic classi cation was achieved with open reference picking method against the Greengenes reference database (DeSantis et al., 2016) and RDP classi er (Cole et al., 2009) respectively. Based on identi ed OTU's phylogenetic tree was obtained using FastTree method (Price et al., 2009). Before farther in-depth study on OTU's, lowabundance OTU's that are OTU's with minimum sample count, here those OTU having sample count 0 were identi ed and removed.

Taxonomic, statistical and functional analysis
The relative abundance of taxa at each taxon level is calculated using the in-build Perl script. Heat-map at the Phylum level was generated by calculating Spearman Correlation (Babicki et al., 2016). Weighted and Unweighted UniFrac distances (Lozupone and Knight, 2005) were calculated for ltered OTUs in all samples. MedCalc software was used to perform one-way ANOVA with a post-hoc test (MedCalc, 2018).
To study the compositional similarity between samples, Bray-Curtis similarity score was calculated using diversity indices in QIIME (Caporaso et al., 2010) based on comparison of pairwise taxonomic abundances from each sample against other samples followed by Non-metric Multi-Dimensional Scaling using PASTv3.11 software (Hammer et al., 2001). The phylogenetic tree predicted from FastTree was processed and visualized as phylogenetic cladogram with MEGAN software (Ondov et al., 2011).

Physiochemical analysis of the cave sediments
The pH values of the Pukzing cave sediment sample were found 7.5 whereas the temperature of the cave was recorded as 27°C. Moreover, total organic matter and phosphorus content of cave sediment samples were determined 1.5 ± 0.05% and 67.8 ± 0.2 ppm respectively. Further, total carbon (0.93%) and total hydrogen (0.50%) of cave samples were calculated using CHN analyzer.

Antimicrobial activity screening of culturable bacterial isolates
All the 235 isolates were tested for their antimicrobial activities against six bacterial pathogens P. aeruginosa, S. aureus, E. coli, M. luteus, B. subtilis and yeast C. albicans. Out of 235 isolates screened, 136 (57.87%) isolates (58 bacterial and 78 actinobacterial isolates) showed signi cant antimicrobial potential against at least ve of the tested pathogens. Interestingly, almost all the bacterial isolates exhibited antimicrobial activity against Bacillus subtilis (235 isolates) and Staphylococcus aureus (227 isolates) as indicated in Table 1.
The phylogenetic tree was constructed based on 16S-rRNA gene sequences values using Mega 6.0. In case of actinobacteria phylogenetic tree was assembled using the maximum likelihood method with Jukes-Cantor model according to lowest BIC (2190.170) and highest AIC (844.878). The gaps were treated by pairwise deletion and the estimated Transition/Transversion bias (R) was 0.50. The tree showed that the entire genus Streptomyces group was clustered together in one clade with the bootstrap supported value of 60%. Whereas, the other rare genera like Micromonospora was closely clustered with Pseudonocardia and Saccharopolyspora group with bootstrap supported value 55%. Interestingly, genera Pseudonocardia and Saccharopolyspora were represented under the same family. Further, very rare genera like Microbacterium, Kocuria, Micrococcus, Nocardiopsis, and Brevibacterium were clustered together in another clade with the bootstrap supported value of 63% ( Fig. 2A).
In case of gram-positive bacteria, the phylogenetic tree was also made with maximum likelihood method, but following the Kimura 2-parameter model according to the lowest BIC (5088.975) and highest AIC (4558.721) values. Here also, gaps were treated by pairwise deletion while the estimated Transition/Transversion bias (R) was found to be 1.27. The phylogenic tree revealed that all Bacillus genera were clustered together in one clade with the bootstrap supported value of 54% whereas all the Staphylococcus genera were clustered separately in another clade with the bootstrap supported value of 88% (Fig. 2B).
For gram-negative bacteria, the phylogenetic tree was also built by maximum likelihood method but by using the model of Tamura 3-parameter according to lowest BIC (3223.764) and highest AIC (2058.828) values. Gaps again were treated by pairwise deletion while here the estimated Transition/Transversion bias (R) was 1.08. The phylogenetic tree indicated that all isolates of genus Stenotrophomonas were closely clustered with genera Achromobacter in one clade with the bootstrap supported value of 58% whereas all isolates belonging to genera Pseudomonas was clustered with their type strains Pseudomonas oryzihabitans type strain NBRC10219 having bootstrap supported value of 51% respectively (Fig. 2C). Verrucomicrobia; Bacteroidetes; Actinobacteria & unidenti ed isolates) was recorded for V3 and V4 region of cave sediment sample (Fig. 3A). The phylum Archae group under Euryarchaeota and Crenarchaeota was only found in the V4 region of cave sediment sample (Fig. 3B).
The similarity in bacterial community structure was inferred from taxonomic data at species level using Bray-Curtis (BC) similarity score and therefore reduced using Non-metric Multidimensional Scaling at 2D space. During processing, taxonomic clades present in at least one sample with a relative OTU count of 100 and above are considered for similarity matrix calculation. Shorter linear distance denotes greater similarity between samples, but in Fig. 5 it is clearly visible that V3 and V4 samples linear distance is larger and distinct.
The phylogenetic tree generated using QIIME is compared for V3 and V4 samples and visualized with the MEGAN tool as cladogram (Fig. 6). MEGAN analysis of Illumina reads showed that Proteobacteria group (alpha, beta, delta and gamma bacteria) were matched together in one group. Similarly, to results from the MEGAN analysis, combination reads of V3 and V4 regions indicated that the PVC group (Planctomycetes and Verrucomicrobia), FCB group (Chlorobi and Bacteriodetes) and Acidobacteria (Acidobacteriaceae and Solibacterales) was also clustered separately. We have found that the larger group of the Terrabacteria (Actinobacteria, Deinococcus, Cyanobacteria, Chloro exi, Firmicutes, and Armatimonadetes) was clustered together according to the MEGAN analysis of the Illumina reads (Fig. 6).
There is also a group called unassigned which was found as unclassi ed bacteria in MEGAN analysis.
Rarefaction curve analysis and diversity index are inferred and plotted in Fig. 7. The alpha diversity analysis demonstrated that the V3 region indicating high diversity (35.0) as compared to V4 region (30.0) (Fig. 7A). The rarefaction curves showed that a linear relationship between taxa and species richness in the V3 region whereas no linear curve was found between taxa and species richness in the V4 region (Fig. 7B). Taxonomic and phylogenetic similarity using beta diversity is calculated with Bray-Curtis (B-C) similarity measure and FastUnifrac. Based on the average, we have observed that beta diversity is much higher in both phylogenetic and taxonomic similarity in V3 sample as compared to V4 sample (Fig. 8A-8D). The abundance of taxa in a sample computed describes the OTU richness in a sample with more richness in V4 sample as compared to V3 sample (Fig. 8B). At species level microbial communities that are shared and unique to each sample is described in Fig. 8C with 333 species shared in both V3 and V4 sample, 396 being unique to V3 and 129 species unique to V4 sample.
At the phylum level, heat map analysis is used to understand the taxonomic differences between the two variable regions (V3 & V4). Relative abundance at the phylum level using Spearman correlation (at P < 0.05 signi cance) is generated with a heat map where green signi es strong positive correlation, and pink represents a strong negative correlation (Fig. 9). A positive correlation means when one phylum was increased in one variable region (V3) than the other variable region increases at the same time with viceversa in negative correlation. We have observed that a positive correlation was much higher in the V4 region as compared to the V3 region.
Network interaction analysis on the microbial taxonomic community at the species level is deciphered ( Fig. 10) using Cytoscape where red hexagon shape nodes represent the sample, circular nodes in green represent species in samples V3 and V4, Cyan include species in V3 sample and yellow depicts species in V4 sample. Network analysis indicates all the similar species are very closely clustered together in V3 and V4 region that are visualized in green color. The similar species are as follows Actinomycetales, Pseudonocardia, Rubrobacter, Solirubrobacterales, Streptomyces, Saccharopolyspora, Actinomycetospora, Kibdelosporangium, Pseudonocardiaceae, Amycolatopsis, Streptococcus, Solirubrobacteraceae, JG30-KF-CM45, AKYG1722, iii1-15, Nocardiaceae, Ellin6529, Enterobacteriaceae, Clostridiaceae, Ruminococcaceae, Lachnospiraceae, Ruminococcus and Ellin6075 respectively.

Overall functional Insight Prediction
Predicted insights based on functional category include annotated proteins, unknown proteins, and ribosomal RNA genes which range in 0.28%, 0.13% and 99.59% (in V3 sample) respectively. Similarly, in V4, rRNA genes fall in a high percentage level of 99.06% with annotated protein being 0.22% and unknown protein of 0.72%. Functional analysis with V3 sample did not map with any highest functional category in COG (Clusters of Orthologous Groups), NOG (Non-supervised Orthologous Groups) and KO (KEGG Orthology) category whereas with subsystems-based annotation using SEED database showed only Carbohydrate metabolism in their classi cation. But in the case of V4 sample, subsystem annotation predicted two functions associated with Amino acid and its derivatives and Cofactors, Vitamins, Prosthetic Groups, Pigments functions. Based on KO, the function is attributed to metabolism and with COG it is involved in metabolism, cellular processes, and signaling.

Discussion
Microorganisms' presence inside the cave systems are commonly oligotrophic or chemolithotrophic in nature and impose precise nutrients for their growth. Hence, it is very di cult to isolate some rare or speci c bacteria from these sources (Portillo et al., 2008). In our study, the maximum number of bacteria was obtained from SCA media (n = 92; 39.1%) followed by ISP7 (n = 57; 24.2%), ISP5 (n = 26; 11.0%), LB In this study, we have found signi cant antimicrobial activity in cave bacterial isolates against a grampositive and gram-negative bacterial pathogen. In our study, out of 235 isolates, 136 (57.87%) strains showed signi cant antimicrobial potential at least ve of the six tested pathogens whereas 48 (20.4%) isolates exhibited antimicrobial ability against all tested pathogens. This nding is similarly reported by Tomova et al. (2013) ., 2011). The phylogenetic analysis of community study indicated that the tree has divided into two major groups i.e. Bacteria and Archaea group. Among them, the bacteria group is the largest group and is divided into three another sub-groups such as FCB group; PVC group and Terrabacteria group. All the bacterial strains were clustered within these groups in our study. All the strains epitomize a novel potential isolate in caves biodiversity that indicates us isolated bacterial population properties to having signi cant discover a new micro-organism from the cave ecosystem . Moreover, most of the isolated bacteria showed signi cant antimicrobial potential against bacterial pathogens. Interestingly, few researchers reported that isolated microbiota from cave ecosystem could be a potential source to discover new microorganisms and antimicrobial agents Yasir, 2018) having wider applicability in health management of crop, animal and human.

Conclusion
This study reveals that, cave ecosystem is a unique source of endemic and extremophilic microorganisms. Diversity can be explored by both culture-based and culture-independent methods.
However, culture dependent methods revealed limited 235 bacterial isolates by temperature treatments viz heat, cold and normal. Both extremes, cold and heat treatment led to recovery of the highest bacterial population indicating the dominance of extremophiles Actinobacteria. Study demonstrates that such microbial communities are having antimicrobial potential with biosynthetic genes like PKS type I, PKS type II and NRPS genes. Culture independent, community-based analysis using the sequencing of both hyper-variable regions V3 and V4 also revealed the dominance of phylum Actinobacteria. In comparative analysis, subsystem annotation using the V4 region was found to be more informative for cave metagenomes functional prediction associated with amino acid and its derivatives like cofactors, vitamins, prosthetic groups and pigments functions. AMP evaluation against six microbial pathogen species revealed that cave microbial communities can be an immensely valuable potential source of microbial genomic resources with higher AMP activity. The microbial panel indicates that these ndings are promising for new AMP molecules having highly diverse applications in sectors of agriculture, domestic animals and human health. All authors have gone through the manuscript and approve its submission.

Availability of data and material
All metagenomics data is available at the NCBI gene bank database under the BioProject: PRJNA489154, BioSample: SAMN09949180 and SRA: SRR7867915.

Competing interests
The authors declare that they have no con ict of interest.