3.1 Genome assembly and annotation
We used PacBio sequencing and Hi-C (Burton et al., 2013) to create a 3.2 Gb (2n = 14) chromosome-level A. flavipes nuclear genome assembly with a contig N50 of 51.8 Mb and a scaffold N50 of 636.7 Mb (Table 1 and Figure 1). The seven chromosomes ranged in size from 86.6 to 727.3 Mb (Table 2) and covered ~99.7% of the assembly. We also generated A. flavipes reference-based assemblies of A. arktos, A. argentus, and M. melanurus from short-insert whole-genome sequencing reads. In the GC-depth analysis, most A. flavipes genome regions had a sequencing depth of about 50×, while a small peak at half that depth (25×) revealed heterozygous sex chromosomes (Figure S5). Hi-C scaffolding recovered a ~86.5 Mb X chromosome (chromosome 7), but the Y chromosome could not be obtained (see Supplemental Methods). This is unsurprising, as marsupial Y chromosomes are small (~10–12 Mb; akin to the microchromosomes of birds), repeat-rich, and refractory to assembly (Deakin & O’Neill, 2020). The chromosomes of A. flavipes were highly homologous to those of the related Tasmanian devil (Sarcophilus harrisii) (Figure 1C), supporting the quality of the assemblies.
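For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of how it is conventionally computed from a list of contig or scaffold lengths: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the assembly pipeline used in this study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Sequences are sorted from longest to shortest and accumulated until
    the running sum reaches at least half of the total length; the length
    of the sequence that crosses that threshold is the N50.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if running * 2 >= total:
            return length
    raise ValueError("lengths must contain at least one positive value")


# Toy example with hypothetical contig lengths (not data from this study).
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))
```

The same function applies to contigs and scaffolds alike; only the input list of lengths differs.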
The GC content of A. flavipes (~36%) (Table 1) is similar to that of the brown antechinus (A. stuartii) and S. harrisii. As indicated by 17-mer frequency analysis (Figure S1), the A. flavipes genome is repeat-rich. Repetitive elements account for 51.8% (~1.7 Gb) of the assembly, with long interspersed elements (LINEs; ~45%), long terminal repeat (LTR) retrotransposons (15.4%), and short interspersed elements (SINEs; 6.6%) being the major classes of transposable elements (TEs) (Tables S5 and S6). We annotated 24,708 A. flavipes protein-coding genes (82.2% supported by transcriptome data from 13 tissues). The A. flavipes reference assembly obtained a BUSCO (Benchmarking Universal Single-Copy Orthologs) (Seppey et al., 2019) genome completeness score of 92.4%, comparable to A. stuartii (92.2% and 92.3% for a female and male assembly, respectively) (Brandies et al., 2020a), the koala (Phascolarctos cinereus; 93.9%) (Johnson et al., 2018), and S. harrisii (91.6%; 2019 assembly mSarHar1.1) (Table S7). The three dasyurid reference-based assemblies, used here for phylogenetic and demographic analyses, recovered 74.2% to 90.0% of BUSCO genes as complete (Table S8).
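As an illustration of the GC content statistic quoted above, the following is a minimal sketch that computes the GC fraction of a nucleotide sequence. It assumes that ambiguous or masked bases (for example N) are excluded from the denominator; that convention, the function name `gc_content`, and the toy sequence are assumptions made for illustration and may differ from the tool used to produce Table 1.

```python
def gc_content(sequence):
    """Return the GC fraction of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T) are counted; lowercase
    (soft-masked) bases are treated the same as uppercase ones.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)


# Toy example (hypothetical sequence, not data from this study).
print(f"{gc_content('ATGCNNgc'):.3f}")
```

For a whole assembly, the G+C and A+T counts would be summed across all sequences before dividing, so that long chromosomes are weighted by their length.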