
The genome sequence of the European turtle dove, Streptopelia turtur Linnaeus

Wellcome Open Research 2021
Dunn J. et al.



















We present a genome assembly from an individual female Streptopelia turtur (the European turtle dove; Chordata; Aves; Columbidae). The genome sequence is 1.18 gigabases in span. The majority of the assembly is scaffolded into 35 chromosomal pseudomolecules, with the W and Z sex chromosomes assembled.


Streptopelia turtur, European turtle dove, genome sequence, chromosomal

Species taxonomy

Eukaryota; Metazoa; Chordata; Aves; Columbiformes; Columbidae; Streptopelia; Streptopelia turtur Linnaeus 1758 (NCBI:txid177155).


The European turtle dove, Streptopelia turtur, breeds throughout Europe, Central Asia, the Middle East and North Africa, overwintering in north Sub-Saharan Africa. Populations in the Atlantic archipelago of Britain and Ireland are primarily located in southern and eastern England. S. turtur populations are in rapid decline in the UK, having fallen by 98% between 1970 and 2018, making them critically endangered; they are also vulnerable to global extinction (Burns et al., 2020). Several causes have been put forward for this collapse in population. Changes in farming practices and agricultural intensification in the UK have reduced the availability of wild plant seeds, increasing the reliance of S. turtur on anthropogenic seed sources (Browne & Aebischer, 2003); a negative association between nestling condition and consumption of seeds from anthropogenic sources has been reported, although this association was positive for adult birds (Dunn et al., 2018). Additionally, infection with the protozoan parasite Trichomonas gallinae has been identified as a cause of death in adults and nestlings (Stockdale et al., 2015). The length of breeding seasons and the number of breeding attempts of S. turtur have markedly reduced, meaning that fewer young are hatched each year (Browne & Aebischer, 2004). Large populations of migrating birds are also hunted in Mediterranean countries, such as France, Spain and Morocco, compounding this decline in numbers. The genome sequence described here will be of utility to researchers assessing the vulnerability of S. turtur to parasitic infections, and to those interested in population genomics and supporting the numbers of this declining species.

Genome sequence report

The genome was sequenced from a blood sample collected from a single live female S. turtur during routine population health checks. A total of 34-fold coverage in Pacific Biosciences single-molecule long reads (N50 22 kb) and 45-fold coverage in 10X Genomics read clouds (from molecules with an estimated N50 of 34 kb) were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. The Hi-C scaffolds were validated using BioNano Genomics long range restriction maps (106-fold effective coverage). Manual assembly curation corrected 54 missing joins or misjoins and removed 1 haplotypic duplication, reducing the scaffold number by 23.59%, increasing the scaffold N50 by 19.08% and decreasing the assembly length by 0.01%. The final assembly has a total length of 1.18 Gb in 357 sequence scaffolds with a scaffold N50 of 81.4 Mb (Table 1). The majority, 98.3%, of the assembly sequence was assigned to 35 chromosomal-level scaffolds representing 33 autosomes (numbered by synteny to the chicken, Gallus gallus domesticus: GCA_000002315.5), and the W and Z sex chromosomes (Figure 1 to Figure 4; Table 2). The assembly has a BUSCO v5.1.2 (Simão et al., 2015) completeness of 95.7% using the aves_odb10 reference set. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
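For readers unfamiliar with the N50 statistic quoted above, the following minimal Python sketch shows how it is conventionally computed from a list of sequence lengths. The function name `n50` and the toy lengths are illustrative assumptions, not data from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    The N50 is the length L such that sequences of length >= L
    together contain at least half of the total bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths (arbitrary units), NOT the bStrTur1.2 scaffolds.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

The same calculation applies to read N50 and scaffold N50; only the input list of lengths differs.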

Project accession d…
Assembly …bStrTur1.2

Table 1. Genome data for Streptopelia turtur, bStrTur1.2.


Figure 1. Genome assembly of Streptopelia turtur, bStrTur1.2: metrics.

The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/Streptopelia%20turtur/dataset/CABFKC02/snail.


Figure 2. Genome assembly of Streptopelia turtur, bStrTur1.2: GC coverage.

BlobToolKit GC-coverage plot. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/Streptopelia%20turtur/dataset/CABFKC02/blob.


Figure 3. Genome assembly of Streptopelia turtur, bStrTur1.2: cumulative sequence.

BlobToolKit cumulative sequence plot. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/Streptopelia%20turtur/dataset/CABFKC02/cumulative.


Figure 4. Genome assembly of Streptopelia turtur, bStrTur1.2: Hi-C contact map.

Hi-C contact map of the bStrTur1 assembly, visualized in HiGlass (Kerpedjiev et al., 2018).


Table 2. Chromosomal pseudomolecules in the genome assembly of Streptopelia turtur, bStrTur1.2.


The European turtle dove sample was blood collected from a live bird during routine health checks of populations in Marks Tey, Essex, UK (latitude 51.874N, longitude 0.729E; grid reference TL8823). The sample was taken under Home Office (Animals Scientific Procedures Act, ASPA) licence number PPL 7007641; the bird was caught and handled under a British Trust for Ornithology ringing licence.

DNA was extracted using an agarose plug extraction from a blood sample following the BioNano Genomics Prep Blood and Cell Culture DNA Isolation Protocol. Pacific Biosciences (PacBio) CLR long read and 10X Genomics read cloud sequencing libraries were constructed according to manufacturers’ instructions. Sequencing was performed by the Scientific Operations core at the Wellcome Sanger Institute on Pacific Biosciences SEQUEL I and Illumina HiSeq X instruments. Ultra-high molecular weight DNA was extracted using the BioNano Genomics Prep Animal Tissue DNA Isolation Soft Tissue Protocol and assessed by pulsed field gel and Qubit 2 fluorimetry. DNA was labeled for BioNano Genomics optical mapping following the BioNano Genomics Prep Direct Label and Stain (DLS) Protocol, and run on one Saphyr Optical Instrument chip flowcell (BioNano Genomics). Hi-C data were generated using the Arima Hi-C kit v1 by Arima Genomics, San Diego, USA, using the Illumina HiSeqX sequencing instrument.

Assembly was carried out following the Vertebrate Genome Project pipeline v1.6 (Rhie et al., 2020) with Falcon-unzip (Chin et al., 2016); haplotypic duplication was identified and removed with purge_haplotigs (Roach et al., 2018) and a first round of scaffolding carried out with 10X Genomics read clouds using scaff10x. Hybrid scaffolding was performed using the BioNano Genomics DLE-1 data and BioNano Solve. Scaffolding with Hi-C data (Rao et al., 2014) was carried out with SALSA2 (Ghurye et al., 2019). The Hi-C scaffolded assembly was polished with arrow using the PacBio data, then polished with the 10X Genomics Illumina data by aligning to the assembly with longranger align, calling variants with freebayes (Garrison & Marth, 2012) and applying homozygous non-reference edits using bcftools consensus. Two rounds of the Illumina polishing were applied. The assembly was checked for contamination and corrected using the gEVAL system (Chow et al., 2016) as described previously (Howe et al., 2021). Manual curation was performed using evidence from BioNano Genomics (using the BioNano Access viewer), and using HiGlass and Pretext. Figure 1 to Figure 3 and BUSCO v5.1.2 scores were generated using BlobToolKit (Challis et al., 2020). Table 3 gives version numbers of the software tools used in this work.
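The Illumina polishing step relies on selecting variant calls where the sample is homozygous for a non-reference allele, since these most likely mark errors in the assembly rather than true heterozygosity. The sketch below illustrates that selection idea in plain Python on toy VCF-style records. It is not the pipeline's actual command: the filter expression used with bcftools is not given in the text, and the function names and records here are hypothetical.

```python
def is_hom_alt(genotype):
    """True for a diploid genotype carrying two copies of the same
    non-reference allele (for example 1/1 or 1|1)."""
    alleles = genotype.replace("|", "/").split("/")
    return (
        len(alleles) == 2
        and alleles[0] == alleles[1]
        and alleles[0] not in ("0", ".")
    )


def select_hom_alt(vcf_lines):
    """Return (chrom, pos, ref, alt) for single-sample VCF records
    whose genotype is homozygous non-reference."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        format_keys = fields[8].split(":")
        sample_values = fields[9].split(":")
        genotype = sample_values[format_keys.index("GT")]
        if is_hom_alt(genotype):
            kept.append((fields[0], int(fields[1]), fields[3], fields[4]))
    return kept


# Hypothetical records for illustration only.
toy_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample",
    "scaffold_1\t100\t.\tA\tG\t50\t.\t.\tGT\t1/1",
    "scaffold_1\t200\t.\tC\tT\t50\t.\t.\tGT\t0/1",
    "scaffold_2\t300\t.\tG\tGA\t50\t.\t.\tGT:DP\t1|1:30",
    "scaffold_2\t400\t.\tT\tC\t50\t.\t.\tGT\t0/0",
]
for edit in select_hom_alt(toy_vcf):
    print(edit)
```

In a real workflow the selected edits would be written back into the assembly sequence by a dedicated tool; the sketch only covers the genotype selection.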


Table 3. Software tools used.

Data availability

European Nucleotide Archive: Streptopelia turtur (European turtle dove) genome assembly, bStrTur1. Accession number PRJEB32724.

The genome sequence is released openly for reuse. The S. turtur genome sequencing initiative is part of the Wellcome Sanger Institute’s “25 genomes for 25 years” project. It is also part of the Vertebrate Genome Project (VGP) ordinal references programme and the Darwin Tree of Life (DToL) project. All raw data and the assembly have been deposited in the ENA. The genome will be annotated and presented through the Ensembl pipeline at the European Bioinformatics Institute. Raw data and assembly accession identifiers are reported in Table 1.
