Introduction
Long-distance migratory species pose distinct challenges to studies of ecology, evolution, and conservation because they occupy different geographical regions throughout the year that can be separated by thousands of kilometers. At each stage in the migratory annual cycle, migrant populations are subject to various stressors that can influence their fitness. As a result, effective conservation efforts require understanding migratory connectivity, defined as the links between different geographic regions used across the annual cycle. In the past 20 years, population genetics has become a well-established means for tracking migratory populations, especially for studies involving large sample sizes or small-bodied individuals. However, the value of genetic markers is often limited by the amount of genetic differentiation in a species and the availability of genetic data from individuals across the annual cycle.
Population assignment methods originated in the early 1980s and 1990s as a means of tracing the breeding origins of migratory individuals back to distinct tributaries (in the case of fish) or geographic regions (in the case of bears). Early methods relied on genetic markers that could identify only deep phylogeographic breaks within species. In recent years, next-generation sequencing has made it possible to screen a much larger number of genetic markers, allowing breeding populations to be delineated at finer spatial scales. Cost-effective delineation of patterns of migratory connectivity was made possible by designing single nucleotide polymorphism (SNP) assays for a subset of these markers that were particularly useful for population assignment. While recent reductions in the cost of whole genome sequencing have made it possible to screen migrant samples directly with low-coverage whole genome sequencing (lcWGS) data, the lack of software capable of handling the increase in marker number has prevented this method from being used for population assignment (DeSaix et al. in review).
Low-coverage WGS has made sequencing more affordable for non-model organisms by reducing the sequencing effort per individual; however, it poses distinct challenges. One of these challenges is the low sequencing read depth per individual, which necessitates the use of probabilistic frameworks for genotype calling to account for the uncertainty inherent in the data. Accurate estimates of parameters such as allele frequency can be obtained by prioritizing larger sample sizes of individuals with lower sequencing depth. Guidelines for achieving accurate allele frequency estimation with lcWGS include sequencing individuals at a minimum of 1X coverage or having at least 10 individuals sequenced with a total sequencing depth of at least 10X. To take advantage of lcWGS data for population assignment, DeSaix et al. (in review) recently developed a software package, WGSassign, that accounts for the uncertainty inherent to lcWGS data in population assignment tests. Here, for the first time, we use lcWGS data to assign migrants to their population of origin.
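As an illustration of what accounting for genotype uncertainty can look like in an assignment test, the following minimal Python sketch scores one individual against candidate source populations without ever calling a genotype. It is not the WGSassign implementation. The function names, the binomial read-sampling model, the symmetric per-read error rate of 0.01, the Hardy-Weinberg genotype proportions, and the toy allele frequencies are all assumptions made for this example.

```python
import math


def genotype_likelihoods(n_ref, n_alt, error=0.01):
    """P(reads | genotype) for genotypes carrying 0, 1, or 2 alternate alleles.

    Assumes a binomial read-sampling model with a symmetric per-read
    error rate (illustrative; keep error > 0 to avoid log(0) later).
    """
    n = n_ref + n_alt
    likes = []
    for g in (0, 1, 2):
        # Chance that a single read shows the alternate allele given genotype g
        p_alt = (g / 2) * (1 - error) + (1 - g / 2) * error
        likes.append(math.comb(n, n_alt) * p_alt ** n_alt * (1 - p_alt) ** n_ref)
    return likes


def assignment_log_likelihood(read_counts, alt_freqs, error=0.01):
    """Log-likelihood of an individual's reads given one population's
    alternate-allele frequencies.

    At each locus the unknown genotype is summed out, weighting each
    genotype likelihood by its Hardy-Weinberg proportion.
    """
    total = 0.0
    for (n_ref, n_alt), f in zip(read_counts, alt_freqs):
        hw = ((1 - f) ** 2, 2 * f * (1 - f), f ** 2)
        gl = genotype_likelihoods(n_ref, n_alt, error)
        total += math.log(sum(p * l for p, l in zip(hw, gl)))
    return total


def assign(read_counts, pop_freqs, error=0.01):
    """Return the best-scoring population and all per-population scores."""
    scores = {
        pop: assignment_log_likelihood(read_counts, freqs, error)
        for pop, freqs in pop_freqs.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical alternate-allele frequencies at three loci in two populations
pop_freqs = {"east": [0.9, 0.8, 0.1], "west": [0.1, 0.2, 0.9]}
# (reference reads, alternate reads) per locus for one low-coverage individual
reads = [(0, 1), (0, 2), (1, 0)]
best, scores = assign(reads, pop_freqs)
```

In this sketch a locus with no reads contributes zero to the log-likelihood, so missing data neither favors nor penalizes any population. That property is one reason a likelihood-based treatment suits data with read depths near 1X.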
The American Redstart (Setophaga ruticilla) is an ideal system for evaluating the potential gains in effectiveness achievable by using lcWGS data for population assignment because previous studies using a variety of methods provide a strong foundation for comparisons. The American Redstart is a widely distributed migratory songbird with a breeding distribution across North America and a stationary nonbreeding distribution throughout the Caribbean, northern South America, Central America, and Mexico. For several decades, the American Redstart has been a model species for understanding migratory ecology and has been used to elucidate territoriality on the wintering grounds, foraging behavior, habitat selection, and carry-over effects of stressors across the annual cycle. Phylogeographic structure has previously been detected between a small region in the Maritime Provinces, specifically in Newfoundland and New Brunswick in the northeastern portion of the range, and the rest of the continental breeding range using mtDNA. Subsequent analysis of migratory connectivity using mtDNA revealed that Newfoundland breeders overwintered on the islands of Puerto Rico and the Dominican Republic, while continental breeding birds overwintered across the entire nonbreeding range. Stable isotope studies have shown strong migratory connectivity, with eastern breeding birds overwintering in the Caribbean and western breeding birds overwintering in Central America and Mexico, but whether these migratory differences correspond to genetic differentiation has not been tested.
Here we aim to demonstrate the effectiveness of using lcWGS data for population assignment of nonbreeding individuals, using the American Redstart as a model species. Our main objectives were to: 1) identify population-specific migratory connectivity in the American Redstart using lcWGS data, 2) assess the conservation implications of migratory connectivity by identifying relative abundance and trends in population size, and 3) provide study design recommendations to facilitate the use of lcWGS data in other population assignment studies. Our results have broad implications for improving our understanding of the ecology and evolution of migratory species through conservation genomics approaches.