Improve Genome Assemblies
When assembling a genome de novo, neither short-read nor long-read sequencing alone is sufficient to produce a contiguous and accurate assembly. Next-generation sequencing (NGS) technologies are essential for nucleotide-level information but are limited beyond that resolution: the fragmented reads are too short to retain the contiguity needed to build a complete map and to span repetitive regions within and between genes.
To detect variation and architecture on a larger scale, use Bionano genome mapping, which directly measures the genome at a resolution matching the variation in question: hundreds of kilobases, not hundreds of base pairs. Adding Bionano maps to sequencing data gives a view of the whole genome across megabases, with all its features in context and their functional relationships described.
De novo Bionano genome maps can be integrated with a sequence assembly to:
- order and orient sequence fragments
- identify and correct potential chimeric joins in the sequence assembly
- estimate the gap size between adjacent sequences
To do so, the Bionano Solve software imports the sequence assembly and identifies putative nick sites in each sequence contig by searching for the recognition site of the nicking endonuclease. These in silico maps of the sequence contigs are then aligned to the de novo Bionano genome maps. Genome maps orient contigs and size gaps by bridging across repeats and other complex elements that break NGS assemblies.
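The in silico digestion step can be illustrated with a minimal Python sketch. This is not the Bionano Solve implementation; it only shows the idea of locating a recognition site on both strands of a contig and turning the hits into a pattern of inter-site distances. The function names are hypothetical, and the default motif GCTCTTC is assumed here for illustration only; substitute the recognition site of the enzyme actually used.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    comp = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(comp)[::-1]


def in_silico_sites(seq: str, motif: str = "GCTCTTC") -> list[int]:
    """Return sorted 0-based start positions of the motif on either strand.

    The default motif is an illustrative assumption, not a value taken
    from the source text.
    """
    seq = seq.upper()
    motif = motif.upper()
    # A site on the reverse strand appears as the reverse complement
    # when scanning the forward strand.
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] in targets]


def site_intervals(sites: list[int]) -> list[int]:
    """Distances between consecutive sites: the pattern compared against a genome map."""
    return [b - a for a, b in zip(sites, sites[1:])]


if __name__ == "__main__":
    contig = "AA" + "GCTCTTC" + "AAAA" + "GAAGAGC" + "TT"
    sites = in_silico_sites(contig)
    print(sites, site_intervals(sites))
```

Real contigs are far longer, and an aligner compares the interval pattern to the genome map with tolerance for sizing error and for missing or extra sites; that alignment step is not shown here.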
Conflicts between the two are identified and resolved, and hybrid scaffolds are generated in which sequence maps are used to bridge Bionano maps and vice versa. Finally, the sequence assembly corresponding to this hybrid scaffold is generated and exported as FASTA and AGP files.
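As a rough illustration of how ordered, oriented contigs and map-derived gap sizes translate into an AGP file, the sketch below emits AGP 2.x-style lines for one scaffold. It assumes each contig has already been anchored to a genome map at a known start coordinate. The `Placement` type, the `agp_lines` function, and the `min_gap` placeholder for overlapping or abutting contigs are hypothetical and do not describe Bionano Solve's actual behavior.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    contig_id: str
    length: int        # contig length in bp
    map_start: int     # where the contig begins on the genome map, in bp
    orientation: str   # "+" or "-"


def agp_lines(scaffold_id: str, placements: list[Placement], min_gap: int = 1) -> list[str]:
    """Build tab-separated AGP component (W) and gap (N) lines for one scaffold."""
    lines = []
    pos = 1    # 1-based coordinate on the scaffold
    part = 1   # AGP part number
    prev = None
    for p in sorted(placements, key=lambda x: x.map_start):
        if prev is not None:
            # Gap size is the map distance between the end of the
            # previous contig and the start of the next one.
            gap = max(p.map_start - (prev.map_start + prev.length), min_gap)
            fields = [scaffold_id, pos, pos + gap - 1, part,
                      "N", gap, "scaffold", "yes", "map"]
            lines.append("\t".join(map(str, fields)))
            pos += gap
            part += 1
        fields = [scaffold_id, pos, pos + p.length - 1, part,
                  "W", p.contig_id, 1, p.length, p.orientation]
        lines.append("\t".join(map(str, fields)))
        pos += p.length
        part += 1
        prev = p
    return lines


if __name__ == "__main__":
    demo = [Placement("ctgB", 500, 2000, "-"), Placement("ctgA", 1000, 0, "+")]
    print("\n".join(agp_lines("scf1", demo)))
```

The gap lines use `map` as linkage evidence, the AGP value for joins asserted from a non-sequence-based map. The matching FASTA would be built by concatenating the contigs in the same order, reverse-complementing those marked "-", and filling each gap with N characters.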
Bionano hybrid scaffolding is agnostic to the sequencing technology used. Recent publications have featured scaffolding of assemblies based on Illumina sequencing alone, PacBio sequencing, 10x Genomics assemblies, NRGene assemblies, Oxford Nanopore sequencing, and combinations of those.
Assembly Conflicts and Resolution
Bionano’s hybrid scaffolding pipeline detects and resolves chimeric joins, which are typically formed when short reads, molecules, or paired-end inserts are unable to span across long DNA repeats. The errors appear as conflicting junctions in the alignment between the Bionano map and NGS assemblies.
When Bionano’s hybrid scaffolding pipeline detects a conflict, it analyzes the single-molecule data that underlies the Bionano map and assesses which assembly is incorrectly formed. If the Bionano map has long molecule support at the conflict junction, the sequence contig is automatically cut, removing the putative chimeric join.
If the Bionano map does not have strong molecule support at the junction, the Bionano map is automatically cut instead. Both assemblies must have coverage spanning both sides of a chimeric join for the conflict to be detected and resolved.
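The decision rule described above can be sketched in a few lines of Python. This is an illustrative simplification, not the pipeline's actual logic: molecules are modeled as (start, end) intervals on the genome map, and the `flank` and `min_support` values are hypothetical parameters chosen for the example.

```python
def spanning_molecules(molecules: list[tuple[int, int]], junction: int, flank: int) -> int:
    """Count molecules that cover the junction with at least `flank` bp on each side."""
    return sum(
        1 for start, end in molecules
        if start <= junction - flank and end >= junction + flank
    )


def resolve_conflict(molecules: list[tuple[int, int]], junction: int,
                     flank: int = 50_000, min_support: int = 10) -> str:
    """Decide which assembly to cut at a conflict junction.

    Strong single-molecule support for the genome map implies that the
    sequence contig holds the chimeric join; otherwise the map is cut.
    The flank and threshold defaults are assumptions for illustration.
    """
    support = spanning_molecules(molecules, junction, flank)
    return "cut_sequence_contig" if support >= min_support else "cut_bionano_map"


if __name__ == "__main__":
    mols = [(0, 300_000), (50_000, 250_000), (140_000, 160_000)]
    print(resolve_conflict(mols, junction=150_000, min_support=2))
```

Requiring a flank on both sides of the junction means a molecule that merely touches the breakpoint does not count as support.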
Validating NGS-Based Assemblies
Bionano genome maps represent a powerful orthogonal validation method for genome contigs and scaffolds in regions simple enough to be assembled by fragment reads. Scaffolding technologies that use synthetic long reads or DNA cross-linking provide some error correction compared to short-read assemblies alone. However, since they are NGS-based, they suffer from most of the same issues plaguing short-read-only assemblies. Only Bionano Next-Generation Mapping provides non-sequencing-based, orthogonal genome structure data in a high-throughput way, allowing for completely independent error correction.