RNA Sequencing and Transcriptomic Analysis
Nucleic acids were isolated from frozen cell pellets using a modified
CTAB protocol (Possmayer et al. 2011). RNA concentration was
determined using a NanoDrop 2000 (Thermo Fisher Scientific) and integrity
was assessed with a 2100 Bioanalyzer (Agilent Technologies, USA). RNA
library preparation and sequencing were performed by Genome Quebec
(Montreal, QC, Canada). Libraries were generated from 250 ng of total
RNA. Poly-A mRNA was isolated with the NEBNext Poly(A) mRNA Magnetic
Isolation kit (NEB, USA). Reverse transcription was performed with the
NEBNext RNA First Strand Synthesis kit (NEB), and second strand
synthesis with the NEBNext Ultra Directional RNA Second Strand Synthesis
kit (NEB). Libraries were prepared using the NEBNext Ultra II Library
Prep Kit for Illumina (NEB) and sequenced as 100-bp paired-end
reads on an Illumina HiSeq 4000 platform (Illumina, San Diego, USA).
For gene expression analysis, the RNA-Seq reads were mapped to the
UWO241 assembled genome (Zhang et al. 2021a) (Accession number
GCA_016618255.1) using HISAT2 (Kim, Langmead & Salzberg 2015) and
counted against the predicted gene models using HTSeq-count v0.11.3
(Anders, Pyl & Huber 2015). StringTie v2.1.5 was used to generate
expression estimates
from the SAM/BAM files created by HISAT2 (Pertea, Kim, Pertea, Leek &
Salzberg 2016). Samtools v1.11 was used to read and write the Illumina
RNA-Seq alignments in SAM and BAM formats. Aligned read counts were
normalized by gene length and sequencing depth and expressed as
Fragments Per Kilobase of transcript per Million mapped reads (FPKM)
as a measure of the expression level of each gene. Differentially
expressed genes (DEGs) were identified with Ballgown v2.22.0 (Pertea
et al. 2016) and edgeR v2.22.0 (Robinson, McCarthy & Smyth 2010).
Genes were sorted according to their log2(read
counts)-transformed values. The Generally Applicable Gene-set Enrichment
(GAGE v2.40.1) package in R (Luo, Friedman, Shedden, Hankenson & Woolf
2009) was used to perform pathway analysis based on genes that were
assigned Chlamydomonas Entrez IDs. The parameter “same.dir” in GAGE
was set to TRUE, and significantly regulated pathways were defined as
those enriched gene sets with a p-value < 0.05. To generate
the heatmap expression profiles of HSPs, hierarchical clustering using
the Euclidean distance method was performed within each subfamily using
the ComplexHeatmap R package. Venn diagrams were constructed using an
online tool (http://bioinformatics.psb.ugent.be/webtools/Venn/).
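
As an illustration of the FPKM normalization described above, the following minimal Python sketch computes FPKM from per-gene fragment counts and gene lengths. It is not the pipeline used in this study, where expression estimates were generated by StringTie. The function name, the gene identifiers, and all numbers are hypothetical, and the sketch assumes that the total number of mapped fragments equals the sum of the per-gene counts.

```python
def fpkm(fragment_counts, gene_lengths_bp):
    """Return FPKM per gene.

    FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)

    Assumption: the total number of mapped fragments is taken as the sum
    of the per-gene counts supplied here.
    """
    total_fragments = sum(fragment_counts.values())
    if total_fragments == 0:
        raise ValueError("no mapped fragments")
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total_fragments)
        for gene, count in fragment_counts.items()
    }


# Hypothetical toy data (not from the study).
toy_counts = {"geneA": 200, "geneB": 300, "geneC": 500}
toy_lengths = {"geneA": 2000, "geneB": 1500, "geneC": 1000}

toy_fpkm = fpkm(toy_counts, toy_lengths)
for gene, value in sorted(toy_fpkm.items()):
    print(gene, round(value, 1))
```

In this toy example geneC has the highest FPKM because it has both the most fragments and the shortest length; dividing by gene length and by sequencing depth is what makes values comparable across genes and across libraries.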