The human genome is the most studied and best characterized of all genomes by far, and much of this research was possible thanks to the excellent reference genome that’s freely available to all scientists. A good reference provides a generally accepted standard against which the genetic code of every individual human being can be compared.
The most recent build of this reference is poetically called GRCh38, and it is greatly improved over its predecessor, GRCh37 (also known as hg19), in part thanks to Bionano data. Bionano’s de novo genome maps allowed for the correction of errors in order and orientation in hg19. Yet the reference is still unfinished, and it doesn’t completely reflect the diversity found in human genomes across the globe. For example, 12.8 megabase pairs (Mbp) of novel sequence not present in GRCh38 were detected in the Chinese HX1 assembly, which was built using Bionano maps as well.
In a new preprint on bioRxiv, a team from Uppsala University in Sweden used long-read sequencing and Bionano mapping to de novo assemble two Swedish genomes. Lo and behold, these Swedish genomes also contain over 10 Mbp of novel sequences, 6 Mbp of which are shared with the Chinese HX1 assembly.
Now, how do you find out where in the genome these novel sequences map if they aren’t present in the reference? That’s where Bionano mapping came in. The team created two independent Bionano assemblies, each using a different one of our nicking endonucleases, and used those maps to anchor the novel sequence contigs to the reference genome. Because Bionano images molecules up to a megabase in size, Bionano maps can span both the novel sequences and the neighboring parts of the genome that are already in the reference. (This process would be even easier with our new DLS chemistry, which assembles entire chromosome arms in single maps!)
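To make the anchoring idea concrete, here is a minimal Python sketch. It is not the pipeline used in the preprint; the `Segment` record, the `anchor_novel_contig` function, and all coordinates are hypothetical and chosen only for illustration. It assumes a single genome map, forward-orientation alignments, and that some parts of the map align to the reference while another part aligns to a novel contig. If reference alignments flank the contig on both sides and agree on the chromosome, the contig can be placed between them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Segment:
    """Part of one genome map aligned to a target (hypothetical record layout)."""
    map_start: int     # start of the aligned stretch on the genome map, in bp
    map_end: int       # end of the aligned stretch on the genome map, in bp
    target: str        # reference chromosome or novel contig name
    target_start: int  # start on the target, in bp
    target_end: int    # end on the target, in bp


def anchor_novel_contig(
    ref_segments: List[Segment], contig_segment: Segment
) -> Optional[Tuple[str, int, int]]:
    """Return (chromosome, left_pos, right_pos) bracketing the novel contig.

    Assumes forward-orientation alignments on a single map. Returns None
    when the contig is not flanked on both sides by the same chromosome.
    """
    left = [s for s in ref_segments if s.map_end <= contig_segment.map_start]
    right = [s for s in ref_segments if s.map_start >= contig_segment.map_end]
    if not left or not right:
        return None  # the map does not span the contig on both sides

    nearest_left = max(left, key=lambda s: s.map_end)
    nearest_right = min(right, key=lambda s: s.map_start)
    if nearest_left.target != nearest_right.target:
        return None  # flanks disagree on the chromosome, so no clean anchor

    return (nearest_left.target, nearest_left.target_end, nearest_right.target_start)


# Made-up example: a 1 Mbp map whose middle 210 kbp matches a novel contig.
example_ref = [
    Segment(0, 400_000, "chr17", 1_000_000, 1_400_000),
    Segment(650_000, 1_000_000, "chr17", 1_400_500, 1_750_500),
]
example_contig = Segment(420_000, 630_000, "novel_contig_1", 0, 210_000)
example_anchor = anchor_novel_contig(example_ref, example_contig)
print(example_anchor)
```

In this toy case the contig lands on chr17 between reference positions 1,400,000 and 1,400,500, which is the kind of placement a map that spans both known and novel sequence makes possible. A real analysis would also have to handle reverse-orientation alignments, alignment confidence, and agreement between the two independent assemblies.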
The novel sequences that overlap with the Chinese genome mapped largely to chromosomes 13, 14, 21, and 22, and are mainly localized to centromeric and telomeric regions – repetitive areas that can’t easily be analyzed by sequencing. Chromosome 17 seems to be the Swedish chromosome, in that many novel Swedish sequences that aren’t found in HX1 map specifically there.
The figure above this post shows examples of chromosomal regions where a large amount of novel sequence was detected. The two plots on the left show the localization of 3-way overlap sequences (found in both Swedish genomes and HX1) near the centromeric regions of chr14 and chr21. The top right panel displays a region on chr17 where an excess of novel sequences found only in the two Swedish genomes could be anchored. The bottom right panel shows novel sequences detected only in the two males (one Swedish genome and HX1) that could be anchored to regions close to the telomere of chromosome Y.
To find out more about how Swedes differ from many of us, check out the publication in bioRxiv.