SSPACE-LongRead: scaffolding with long reads
Introduction
We are happy to say that SSPACE can now use PacBio long reads for scaffolding [1].
The authors of SSPACE-LongRead propose a hybrid assembly methodology that scaffolds pre-assembled contigs iteratively, using PacBio RS long reads as a backbone.
The SSPACE-LongRead software is designed to upgrade incomplete draft genomes using single-molecule sequences. The authors conclude that recent advances in PacBio sequencing technology and chemistry, combined with the limited computational resources the program requires, make it possible to scaffold genomes quickly and reliably.
Algorithm
The algorithm proceeds in three steps:
- First, align the long reads against the pre-assembled contigs (or scaffolds);
- Then, compute contig linkage from the order of the alignments along each read;
- Last, join the linked contigs into scaffolds.
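The three steps above can be illustrated with a small Python sketch. This is a toy model, not the SSPACE-LongRead implementation: it assumes the alignment step has already been done by an external aligner and that its hits are available as hypothetical `(read_id, contig_id, read_start)` tuples. The function names, the `min_links` threshold, and the greedy chaining rule are all assumptions made for illustration, and contig orientation and gap sizes are ignored.

```python
from collections import Counter, defaultdict


def contig_links(alignments):
    """Count how often two contigs are neighbours along the same long read.

    alignments: iterable of (read_id, contig_id, read_start) tuples
    (a hypothetical, simplified alignment format).
    """
    hits_by_read = defaultdict(list)
    for read_id, contig_id, read_start in alignments:
        hits_by_read[read_id].append((read_start, contig_id))

    links = Counter()
    for hits in hits_by_read.values():
        # Order the contigs by where they align on the read.
        ordered = [contig for _, contig in sorted(hits)]
        for left, right in zip(ordered, ordered[1:]):
            if left != right:
                links[(left, right)] += 1
    return links


def build_scaffolds(contigs, links, min_links=2):
    """Greedily chain contigs along links supported by >= min_links reads."""
    successor, predecessor = {}, {}
    # Consider the best-supported links first; ties are broken by name.
    for (left, right), support in sorted(links.items(), key=lambda kv: (-kv[1], kv[0])):
        if support >= min_links and left not in successor and right not in predecessor:
            successor[left] = right
            predecessor[right] = left

    scaffolds, seen = [], set()

    def walk(start):
        chain = [start]
        seen.add(start)
        while chain[-1] in successor and successor[chain[-1]] not in seen:
            chain.append(successor[chain[-1]])
            seen.add(chain[-1])
        scaffolds.append(chain)

    # Start from contigs that have no predecessor, then pick up any cycles.
    for contig in contigs:
        if contig not in predecessor and contig not in seen:
            walk(contig)
    for contig in contigs:
        if contig not in seen:
            walk(contig)
    return scaffolds


# Made-up example: four contigs, four long reads.
example_alignments = [
    ("r1", "c1", 0), ("r1", "c2", 5000), ("r1", "c3", 9000),
    ("r2", "c2", 100), ("r2", "c3", 4000),
    ("r3", "c1", 0), ("r3", "c2", 3000),
    ("r4", "c4", 0),
]
example_links = contig_links(example_alignments)
print(build_scaffolds(["c1", "c2", "c3", "c4"], example_links))
# prints [['c1', 'c2', 'c3'], ['c4']]
```

In this example the links c1-c2 and c2-c3 are each supported by two reads, so the three contigs end up in one scaffold, while c4 has no link and stays on its own. Requiring a minimum number of supporting reads is one simple way to keep a single spurious alignment from joining unrelated contigs.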
Results
The authors summarize their results as follows [1]:
Results: … On a test set comprising six bacterial draft genomes, assembled using either a single Illumina MiSeq or Roche 454 library, we show that even a 50× coverage of uncorrected PacBio RS long reads is sufficient to drastically reduce the number of contigs. Comparisons to the AHA scaffolder indicate our strategy is better capable of producing (nearly) complete bacterial genomes.
References:
1: Marten Boetzer and Walter Pirovano. SSPACE-LongRead: scaffolding bacterial draft genomes using long read sequence information. BMC Bioinformatics 2014, 15:211.