Klaus Leitner

and 3 more

The integration of a transgene expression construct into the host genome is the initial step in generating recombinant cell lines for biopharmaceutical production. The stability and level of recombinant gene expression in Chinese hamster ovary (CHO) cells can be correlated with the copy number, integration site and epigenetic context of the transgene vector. Undesired integration events, such as concatemers as well as truncated and inverted vector repeats, also affect the stability of recombinant cell lines. Thus, to characterize cell clones and isolate the most promising candidates, it is crucial to obtain information on the site of integration, the structure of the integrated sequence and its epigenetic status. Current sequencing techniques allow this information to be gathered separately but do not offer comprehensive, simultaneous resolution. In this study, we present a fast and robust nanopore Cas9-targeted sequencing (nCats) pipeline that identifies integration sites, the composition of the integrated sequence and its DNA methylation status in CHO cells, all from the same sequencing run. A Cas9-enrichment step during library preparation enables targeted and directional nanopore sequencing with up to 724x median on-target coverage and reads up to 153 kb long. The data generated by nCats provide sensitive, detailed and accurate information on the transgene integration sites and the expression vector structure, which could only partly be obtained from traditional Targeted Locus Amplification-Seq data. Moreover, with nCats the DNA methylation status can be analyzed from the same raw data without prior DNA amplification.
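To make the coverage metric concrete, the following is a minimal sketch of how a median on-target coverage value could be computed from aligned read intervals. It is an illustration only and not the pipeline used in this study: the function name `median_target_coverage`, the half-open coordinate convention and the toy read coordinates are all assumptions introduced here.

```python
from statistics import median


def median_target_coverage(reads, target_start, target_end):
    """Median per-base depth over the half-open target [target_start, target_end).

    `reads` is an iterable of (start, end) alignment intervals, also half-open.
    Hypothetical helper for illustration; not part of any published pipeline.
    """
    length = target_end - target_start
    if length <= 0:
        raise ValueError("target region must be non-empty")

    # Difference array: +1 where a read begins covering the target,
    # -1 at the first position it no longer covers.
    diff = [0] * (length + 1)
    for read_start, read_end in reads:
        lo = max(read_start, target_start)
        hi = min(read_end, target_end)
        if lo < hi:  # the read overlaps the target
            diff[lo - target_start] += 1
            diff[hi - target_start] -= 1

    # A running sum of the difference array gives the depth at each base.
    depths = []
    depth = 0
    for delta in diff[:length]:
        depth += delta
        depths.append(depth)

    return median(depths)


# Toy example: three reads overlapping a 10 bp target region.
toy_reads = [(0, 10), (5, 15), (8, 20)]
toy_median = median_target_coverage(toy_reads, 5, 15)
```

In practice, per-base depth would typically be derived from an alignment file with an established tool rather than from hand-built intervals; the sketch only shows what the reported median summarizes.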