OTU clustering and taxonomic assignments
The bacterial and fungal sequencing data were analysed using the
Quantitative Insights into Microbial Ecology (QIIME v1.9.1,
http://qiime.org/) and Pipeline for Analyses of Fungal Internal
Transcribed Spacer (PIPITS v2.3,
https://anaconda.org/bioconda/pipits) software packages, respectively.
Paired-end reads of approximately 300 bp were obtained and assessed
for quality using FastQC v0.11.7
(https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). The
FastQC report provided a wide range of information on the quality
profile of the reads, including basic statistics, GC content, adaptor
over-abundance and over-represented sequences. Raw
sequencing reads containing adaptor and primer sequences were
quality-trimmed using Cutadapt
(http://code.google.com/p/cutadapt/).
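
Two of the per-read statistics summarised in a quality report, GC content and mean base quality, can be illustrated with a minimal sketch. This is not FastQC's implementation; the function names are hypothetical, and the quality string is assumed to use the Phred+33 encoding.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G/C among called bases (A, C, G, T); N and other symbols are ignored."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string.

    Assumption: Phred+33 encoding (offset=33); older data may use a different offset.
    """
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)
```

Reads whose mean quality falls well below that of the rest of the run are the typical candidates for trimming or removal in the subsequent step.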
Fastq-Join (https://anaconda.org/bioconda/fastq-join) was used to
merge the paired-end reads into longer contigs spanning the V3-V4
consensus sequence region. This tool finds the overlap between each
pair of reads and combines them into a single read. These reads were
de-replicated to combine identical tags into unique sequences for the
construction of consensus quality profiles, and the sequencing errors
identified were removed from the samples. Singletons and chimeric
sequences (the latter caused by the hybridization of DNA fragments from
different species) were filtered out using the default
reference_chimera_detection setting implemented in QIIME.
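
The joining, de-replication and singleton-removal steps described above can be sketched conceptually as follows. This is an illustration only, not the Fastq-Join or QIIME code: the function names, the minimum overlap of 6 bases and the 8% mismatch tolerance are assumptions chosen for the example, and the sketch takes the longest acceptable overlap rather than scoring all candidates.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def join_pair(forward, reverse, min_overlap=6, max_mismatch_frac=0.08):
    """Join a read pair on the overlap between the forward read's 3' end
    and the reverse-complemented reverse read's 5' end.

    Assumption: min_overlap and max_mismatch_frac are illustrative values.
    Returns the joined sequence, or None when no acceptable overlap exists.
    """
    rc = reverse_complement(reverse)
    # Try the longest possible overlap first, then progressively shorter ones.
    for overlap in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        tail, head = forward[-overlap:], rc[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * overlap:
            return forward + rc[overlap:]
    return None


def dereplicate(reads, min_count=2):
    """Collapse identical reads into unique sequences with abundances.

    Sequences seen fewer than min_count times are dropped; the default of 2
    removes singletons.
    """
    counts = Counter(reads)
    return {seq: n for seq, n in counts.items() if n >= min_count}
```

For example, `join_pair("ACGTACGTTTGACC", "TAATCCGGTCAA")` reconstructs the 20-base fragment `ACGTACGTTTGACCGGATTA` from a 6-base overlap. Reference-based chimera detection is not covered by this sketch.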
High-quality bacterial and fungal sequences were clustered into
operational taxonomic units (OTUs) at a similarity threshold of 97%
using the UCLUST and RDP classifier methods, with reference to the
latest Greengenes database (http://greengenes.lbl.gov) and the UNITE
database (https://unite.ut.ee), in QIIME and PIPITS respectively.
Finally, the non-redundant, representative OTUs obtained were
classified to species level wherever possible, followed by
quantification in each individual sample.
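
The idea of clustering at a 97% similarity threshold can be illustrated with a simplified greedy, centroid-based sketch. It is not the UCLUST implementation: identity is approximated here as one minus the edit distance divided by the longer sequence length, which is an assumption made for the example, and all function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # match or substitution
        prev = curr
    return prev[-1]


def identity(a: str, b: str) -> float:
    """Approximate identity: 1 - edit distance / longer length (an assumption)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def greedy_cluster(abundances, threshold=0.97):
    """Greedy centroid clustering of unique sequences.

    abundances maps each unique sequence to its read count. Sequences are
    visited from most to least abundant; each joins the first centroid it
    matches at or above the threshold, otherwise it seeds a new OTU.
    Returns a dict mapping each centroid to the total count of its OTU.
    """
    otus = {}
    ordered = sorted(abundances.items(), key=lambda kv: (-kv[1], kv[0]))
    for seq, count in ordered:
        for centroid in otus:
            if identity(seq, centroid) >= threshold:
                otus[centroid] += count
                break
        else:
            otus[seq] = count
    return otus
```

Under this definition, a 100-base sequence with two substitutions relative to a centroid (98% identity) is absorbed into that OTU, whereas one with five substitutions (95% identity) seeds a new OTU. The per-OTU totals correspond to the quantification step when the clustering is run on each sample separately.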