Genotyping and population assignment
A total of 3253 adult individuals captured and sampled for blood during
the period 1998–2013 were successfully genotyped with our custom
house sparrow Affymetrix Axiom 200,000 SNP array (Lundregan et al.,
2018). Based on the MonoHigh and PolyHigh quality criteria of
Affymetrix, 185,587 SNPs were passed on to further quality control,
where potential duplicates (identity by state above 0.98) and
low-quality samples (genotyping rate < 0.90) were removed from the
data set. Moreover, loci with a potentially high level of genotyping
errors (SNP call rate < 95%; Mendelian error rate based on
parental relationships > 5%) or low minor allele frequency
(MAF < 0.01) were also excluded. In total, 3116 individuals
and 183,145 SNPs passed the overall quality check (Lundregan et al.,
2018). In this data set, missing genotypes (0.76% of the 570,679,820
genotypes in total) were imputed using LinkImpute (Money et al.,
2015) to improve statistical power in our GWAS. Finally, a
metapopulation-level pedigree was constructed based on parentage
analyses using individual high-density SNP-genotype data (Niskanen et
al., 2020). Both parents were known for 52.7% of the individuals in the
pedigree, one parent was known for 25.0% of the individuals, and the
rest of the individuals did not have any parental information in the
pedigree.
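The sample- and SNP-level thresholds described above can be illustrated with a minimal Python sketch. It is not the pipeline of Lundregan et al. (2018). The function name `qc_filter`, the 0/1/2 genotype coding with NaN for missing calls, and the choice to compute SNP statistics after removing low-quality samples are all assumptions made for illustration. The identity-by-state duplicate filter and the Mendelian error filter are omitted.

```python
import numpy as np

def qc_filter(genotypes, min_sample_rate=0.90, min_snp_rate=0.95, min_maf=0.01):
    """Return boolean masks (keep_samples, keep_snps) for a genotype matrix.

    genotypes: 2-D array, rows = individuals, columns = SNPs, coded as
    0/1/2 copies of one allele, with np.nan for missing calls (assumed coding).
    """
    g = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(g)

    # Sample filter: fraction of SNPs successfully called per individual.
    keep_samples = called.mean(axis=1) >= min_sample_rate

    # SNP statistics use retained individuals only (an assumption about order).
    g_kept = g[keep_samples]
    n_called = called[keep_samples].sum(axis=0)
    snp_rate = n_called / max(g_kept.shape[0], 1)

    # Allele frequency from called genotypes; SNPs with no calls yield NaN
    # and fail the comparison below, so they are excluded.
    with np.errstate(invalid="ignore", divide="ignore"):
        freq = np.nansum(g_kept, axis=0) / (2.0 * n_called)
    maf = np.minimum(freq, 1.0 - freq)

    keep_snps = (snp_rate >= min_snp_rate) & (maf >= min_maf)
    return keep_samples, keep_snps
```

Applying both masks, as in `genotypes[keep_samples][:, keep_snps]`, gives the filtered matrix. As a consistency check on the reported figures, 3116 individuals × 183,145 SNPs = 570,679,820 genotypes, which matches the total given above.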
High-quality information on natal dispersal was available for 2741 adult
birds present on one of the eight main study islands during the years
1998–2013, either from mark-recapture or genetic assignment
information (Saatoglu et al., 2021). For the remaining 375 individuals
that were successfully genotyped, we had information on the island where
they were first recorded (either as a fledged juvenile in the autumn or
as a 1-year-old recruit during summer). These individuals were removed
from the phenotype data set, as were individuals whose natal island was
not among our eight main study islands (N = 98) and individuals that
could be assigned only to a group of natal islands, rather than to a
specific one of our eight main study islands, because the SNP genotyping
of birds from the farm and non-farm islands had been initiated in
different years (N = 41; see Saatoglu et al., 2021). Thus, phenotypic
data on dispersal for a total of 2602 individuals were used in the
animal model analyses and GWAS.
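The sample accounting above can be checked with a short Python sketch. The counts are those reported in the text. The function name `retained_individuals` and the dictionary keys are hypothetical labels, and the calculation assumes that the three excluded groups do not overlap, which the reported totals imply.

```python
def retained_individuals(genotyped, excluded_groups):
    """Subtract exclusion groups (assumed non-overlapping) from the genotyped total."""
    return genotyped - sum(excluded_groups.values())

genotyped_after_qc = 3116   # individuals passing genotype quality control
with_dispersal_info = 2741  # high-quality natal dispersal information available

excluded = {
    "first_record_only": 375,            # only island of first record known
    "natal_island_outside_system": 98,   # natal island not among the eight main islands
    "natal_island_group_only": 41,       # assignable only to a group of islands
}

# The first two groups in the text partition the genotyped individuals.
assert with_dispersal_info + excluded["first_record_only"] == genotyped_after_qc

n_phenotyped = retained_individuals(genotyped_after_qc, excluded)
print(n_phenotyped)  # 2602
```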