OTU clustering and taxonomic assignments
The bacterial and fungal sequencing data were analysed using Quantitative Insights into Microbial Ecology (QIIME v1.9.1, http://qiime.org/) and the pipeline for analyses of fungal internal transcribed spacer (PIPITS v2.3, https://anaconda.org/bioconda/pipits) software, respectively. Paired-end reads of approximately 300 bp were obtained and assessed for quality using FastQC v0.11.7 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). The FastQC report provided a range of information on the quality profile of the reads, including basic statistics, GC content, over-abundance of adaptors and over-represented sequences. Raw sequencing reads containing adaptor and primer sequences were quality trimmed using Cutadapt (http://code.google.com/p/cutadapt/).
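The kind of per-read metrics and trimming described above can be illustrated with a minimal sketch. It is not the FastQC or Cutadapt implementation: the function names are hypothetical, quality strings are assumed to be Phred+33 encoded, the adaptor string in the usage note is an arbitrary example rather than the one used in this study, and adaptor matching is exact, whereas Cutadapt tolerates mismatches and partial adaptors.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a read (0.0 for an empty read)."""
    if not seq:
        return 0.0
    return sum(base in "GCgc" for base in seq) / len(seq)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def trim_adapter(seq: str, adapter: str) -> str:
    """Remove the first exact adaptor occurrence and everything after it."""
    idx = seq.find(adapter)
    return seq if idx == -1 else seq[:idx]
```

For example, `trim_adapter("ACGTACGT" + "AGATCGGAAGAGC", "AGATCGGAAGAGC")` keeps only the first eight bases, and a read with no adaptor match is returned unchanged.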
Fastq-Join (https://anaconda.org/bioconda/fastq-join) was used to merge the paired-end reads into longer contigs spanning the V3-V4 consensus sequence region. The tool finds the overlap between the two reads of each pair and combines them into a single read. The merged reads were de-replicated, collapsing identical tags into unique sequences, to construct consensus quality profiles, and the sequencing errors identified in the samples were removed. Singletons and chimeric sequences (the latter caused by the hybridization of DNA fragments from various species) were filtered out using the reference_chimera_detection parameter at its default setting, as implemented in QIIME.
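The overlap-based merging, de-replication and singleton removal described above can be sketched as follows. This is a simplified illustration and not the Fastq-Join or QIIME code: the function names are hypothetical, the `min_overlap` and `max_mismatch_frac` values are illustrative assumptions, the longest acceptable overlap is taken rather than a scored best overlap, and base qualities are ignored.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(forward: str, reverse: str,
               min_overlap: int = 6, max_mismatch_frac: float = 0.08):
    """Join a read pair on its longest acceptable overlap, or return None."""
    rc = reverse_complement(reverse)
    # Try the longest possible overlap first, then progressively shorter ones.
    for overlap in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        tail, head = forward[-overlap:], rc[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * overlap:
            return forward + rc[overlap:]
    return None


def dereplicate(seqs, min_count: int = 2) -> dict:
    """Collapse identical sequences; min_count=2 discards singletons."""
    counts = Counter(seqs)
    return {seq: n for seq, n in counts.items() if n >= min_count}
```

With an 18-base fragment sequenced as two 12-base reads, `merge_pair` recovers the full fragment from the 6-base overlap, and `dereplicate` then reports how many merged reads share each unique sequence.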
High-quality bacterial and fungal sequences were clustered into operational taxonomic units (OTUs) at a similarity threshold of 97% using the UCLUST and RDP classifier methods, with reference to the latest Greengenes database (http://greengenes.lbl.gov) and the UNITE database (https://unite.ut.ee), in QIIME and PIPITS, respectively. Finally, the non-redundant, representative OTUs obtained were classified to species level (wherever possible), followed by quantification in each individual sample.
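The idea of clustering at a 97% similarity threshold can be illustrated with a greedy, centroid-based sketch. It is a deliberately simplified stand-in and not the UCLUST algorithm: identity is assumed here to be one minus the edit distance divided by the longer sequence length, sequences are processed in order of decreasing abundance, and all function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def identity(a: str, b: str) -> float:
    """Assumed identity measure: 1 - edit distance / longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def cluster_otus(abundances: dict, threshold: float = 0.97) -> dict:
    """Greedy clustering: each sequence joins the first centroid it matches."""
    otus = {}
    # Most abundant sequences are considered first and seed the centroids.
    for seq in sorted(abundances, key=lambda s: (-abundances[s], s)):
        for centroid in otus:
            if identity(seq, centroid) >= threshold:
                otus[centroid].append(seq)
                break
        else:
            otus[seq] = [seq]
    return otus
```

Under these assumptions, two 40-base sequences that differ at a single position (97.5% identity) fall into the same OTU, while an unrelated sequence seeds its own. The quadratic edit-distance step makes this suitable only for small illustrative inputs.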