1. Sample collection
Soil samples were collected from a tropical dry evergreen forest (Vallam Reserve Forest) in Chengalpattu, Tamil Nadu, India.
2. Sequencing preparation
Total genomic DNA was extracted from the samples using the CTAB/SDS method. DNA concentration and purity were monitored on 1% agarose gels. According to the concentration, DNA was diluted to 1 ng/µL using sterile water. Distinct regions of the 16S rRNA gene (16S V3-V4) were amplified using specific barcoded primers. All PCR reactions were carried out with Phusion® High-Fidelity PCR Master Mix (New England Biolabs). PCR products were mixed with an equal volume of 1X loading buffer (containing SYBR green) and run on a 2% agarose gel for detection. Samples with a bright main band between 400 and 450 bp were chosen for further experiments. PCR products were mixed in equidensity ratios, and the pooled PCR products were then purified with the Qiagen Gel Extraction Kit (Qiagen, Germany). Libraries were generated with the NEBNext® Ultra™ DNA Library Prep Kit for Illumina, quantified via Qubit and Q-PCR, and then sequenced on an Illumina platform.
3. Sequencing data processing and analysis
Paired-end reads were assigned to samples based on their unique barcodes and truncated by cutting off the barcode and primer sequences. Paired-end reads were merged using FLASH (V1.2.7, http://ccb.jhu.edu/software/FLASH/) [5], a fast and accurate tool designed to merge paired-end reads when at least some of the reads overlap the read generated from the opposite end of the same DNA fragment; the merged sequences were called raw tags. Quality filtering of the raw tags was performed under specific filtering conditions to obtain high-quality clean tags [6], according to the QIIME (V1.7.0, http://qiime.org/scripts/split_libraries_fastq.html) [7] quality control process. The tags were compared with the reference database (Gold database, http://drive5.com/uchime/uchime_download.html) using the UCHIME algorithm (UCHIME Algorithm, http://www.drive5.com/usearch/manual/uchime_algo.html) [8] to detect chimeric sequences (http://www.drive5.com/usearch/manual/chimera_formation.html). The chimeric sequences were then removed [9], yielding the final effective tags. Sequence analysis was performed with the Uparse software (Uparse v7.0.1001, http://drive5.com/uparse/) [10] using all the effective tags. Sequences with ≥ 97% similarity were assigned to the same OTU. A representative sequence for each OTU was selected for further annotation. Each representative sequence was annotated with the Mothur software against the SSU rRNA database of SILVA (http://www.arb-silva.de/) [11] at each taxonomic rank (kingdom, phylum, class, order, family, genus, species) with a threshold of 0.8 to 1 [12]. To obtain the phylogenetic relationships among all OTU representative sequences, MUSCLE [13] (Version 3.8.31, http://www.drive5.com/muscle/) was used to perform rapid multiple sequence alignment. OTU abundance information was normalized to the sequence count of the sample with the fewest sequences.
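The normalization to the smallest sample can be illustrated with a minimal Python sketch. It assumes the normalization is a random subsampling without replacement (rarefaction), which the text does not state explicitly. The function names, the seed, and the toy table are hypothetical and are not part of the pipeline used here.

```python
import random
from collections import Counter


def rarefy(otu_counts, depth, seed=0):
    """Randomly subsample one sample's OTU counts, without replacement, to `depth` reads."""
    # Expand counts into one label per read, e.g. {"OTU_1": 2} -> ["OTU_1", "OTU_1"].
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the sketch reproducible
    return dict(Counter(rng.sample(pool, depth)))


def normalize_to_min_depth(table, seed=0):
    """Rarefy every sample to the read count of the shallowest sample."""
    depth = min(sum(counts.values()) for counts in table.values())
    return {name: rarefy(counts, depth, seed) for name, counts in table.items()}


# Hypothetical toy table: sample name -> {OTU id -> read count}
toy_table = {
    "S1": {"OTU_1": 50, "OTU_2": 30, "OTU_3": 20},
    "S2": {"OTU_1": 10, "OTU_2": 25, "OTU_4": 5},
}
normalized = normalize_to_min_depth(toy_table)
```

After this step every sample holds the same number of reads (40 in the toy table), so richness and diversity estimates are not biased by unequal sequencing depth.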
Subsequent alpha diversity analysis was based on the normalized data. Alpha diversity describes the complexity of species diversity within a sample and was assessed with six indices: Observed-species, Chao1, Shannon, Simpson, ACE, and Good’s-coverage. All indices were calculated with QIIME (Version 1.7.0) and displayed with R software (Version 2.15.3).
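For orientation, five of the six indices can be sketched in a few lines of Python. This is not the QIIME implementation. It assumes common textbook forms: the bias-corrected Chao1 estimator, Shannon entropy in base 2, Simpson expressed as 1 minus the sum of squared proportions, and Good's coverage as 1 minus the fraction of singleton reads. ACE is omitted because its estimator is longer. The function name and the toy counts are hypothetical.

```python
import math


def alpha_diversity(counts):
    """Compute simple alpha diversity indices from one sample's OTU read counts."""
    counts = [c for c in counts if c > 0]  # ignore OTUs absent from this sample
    n = sum(counts)                        # total reads in the sample
    if n == 0:
        raise ValueError("sample contains no reads")
    s_obs = len(counts)                    # observed species (OTUs)
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    props = [c / n for c in counts]        # relative abundances
    return {
        "observed_species": s_obs,
        # Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))
        "chao1": s_obs + f1 * (f1 - 1) / (2 * (f2 + 1)),
        # Shannon entropy in bits (log base 2 is an assumption)
        "shannon": -sum(p * math.log2(p) for p in props),
        # Simpson index in the form 1 - sum(p_i^2)
        "simpson": 1 - sum(p * p for p in props),
        # Good's coverage: 1 - F1 / N
        "goods_coverage": 1 - f1 / n,
    }


# Hypothetical read counts for one rarefied sample
toy_counts = [10, 5, 2, 2, 1, 1, 1]
indices = alpha_diversity(toy_counts)
```

In the toy sample, 7 OTUs are observed with 3 singletons and 2 doubletons, so the bias-corrected Chao1 estimate is 7 + 3 × 2 / (2 × 3) = 8 expected OTUs.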