Bioinformatics analysis
Raw sequence data underwent quality control (QC), trimming, and removal of the synthetic DNA template using FastQC (21), Trimmomatic (22) and bbmap (BBTools v37.28) (23), respectively. The sequence data that passed QC were imported into the QIIME2 package (v2019.10) (24) for analysis using the deblur pipeline (25) with the SILVA database (v128 at 97%) (26). Following filtering of chimeric sequences and taxonomic assignment, OTUs that were not assigned as bacteria were removed. Because the samples were of low biomass, additional filtering of potential contaminants was conducted using decontam (27). Finally, the samples were rarefied to a sampling depth of 4,000 sequences per sample using the diversity plugin (diversity core-metrics-phylogenetic).
Differences in the relative abundance of microbial taxa between vaginal and fallopian tube samples were assessed using the R package microeco (28), with statistical significance assessed using the Kruskal-Wallis test (Benjamini-Hochberg adjusted p-values < 0.05 were considered significant). Alpha diversity (Shannon index) and beta diversity (Bray-Curtis dissimilarity index) metrics were calculated using the QIIME2 diversity plugin (diversity alpha and diversity beta, respectively). Comparisons of alpha diversity (Shannon index) between groups of interest (i.e. case vs. control, or vaginal vs. fallopian tube) were conducted using the QIIME2 diversity plugin (diversity alpha-group-significance) with the Kruskal-Wallis test (Benjamini-Hochberg adjusted p-values < 0.05 were considered significant). Comparisons of beta diversity (Bray-Curtis dissimilarity) between the same groups were conducted using the QIIME2 diversity plugin (diversity beta-group-significance) with the PERMANOVA test (999 permutations; Benjamini-Hochberg adjusted p-values < 0.05 were considered significant).
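For readers unfamiliar with the quantities named above, the following minimal Python sketch illustrates rarefaction to a fixed depth, the Shannon index, Bray-Curtis dissimilarity and the Benjamini-Hochberg adjustment. It is not the QIIME2 or microeco implementation; the function names, the use of log base 2 for the Shannon index and the fixed random seed are assumptions made purely for illustration.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample a {taxon: count} mapping to `depth` sequences without replacement.

    Returns None when the sample has fewer than `depth` sequences, mirroring
    the usual practice of dropping such samples. The seed is an assumption.
    """
    if sum(counts.values()) < depth:
        return None
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


def shannon(counts, base=2.0):
    """Shannon index H = -sum(p_i * log(p_i)); base 2 is an assumption here."""
    total = sum(counts.values())
    props = [n / total for n in counts.values() if n > 0]
    return -sum(p * math.log(p, base) for p in props)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / (sum(a) + sum(b))."""
    taxa = set(a) | set(b)
    numerator = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    denominator = sum(a.values()) + sum(b.values())
    return numerator / denominator if denominator else 0.0


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Under these assumptions, a sample would first be passed through `rarefy(sample, 4000)`, and the resulting counts would then be given to `shannon` and `bray_curtis`. Raw p-values from the group comparisons would be passed to `benjamini_hochberg` and compared against 0.05.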
The linear discriminant analysis effect size (LEfSe) biomarker discovery tool (29), as implemented in the R package microeco, was used to identify differentially abundant OTUs. Among taxa with a relative abundance greater than 0.01%, variation in abundance was considered significant where the LDA score was greater than 3 and/or the p-value from the LEfSe test was < 0.05.
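The selection rule above can be sketched as a simple filter over per-taxon results. This is a hedged illustration only: the `TaxonResult` record, its field names and the `require_both` switch are hypothetical and are not part of microeco or LEfSe. The switch exists because "and/or" can be read either way, and the default below (either criterion suffices) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TaxonResult:
    """Hypothetical container for one taxon's LEfSe output."""
    name: str
    rel_abundance_pct: float  # relative abundance, in percent
    lda_score: float
    p_value: float


def select_biomarkers(results, min_abundance_pct=0.01, min_lda=3.0,
                      alpha=0.05, require_both=False):
    """Return names of taxa passing the abundance, LDA and p-value thresholds.

    With require_both=False a taxon passes if either the LDA score or the
    p-value criterion is met (the "or" reading); with require_both=True
    both criteria must be met (the "and" reading).
    """
    selected = []
    for r in results:
        # Taxa at or below the abundance floor are excluded outright.
        if r.rel_abundance_pct <= min_abundance_pct:
            continue
        passes_lda = r.lda_score > min_lda
        passes_p = r.p_value < alpha
        if require_both:
            keep = passes_lda and passes_p
        else:
            keep = passes_lda or passes_p
        if keep:
            selected.append(r.name)
    return selected
```

Running the filter twice, once with each setting of `require_both`, would show how many taxa depend on which reading of the criterion is applied.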