The Read-Optimized Burrows-Wheeler Transform
The advent of high-throughput sequencing has resulted in massive genomic datasets, some consisting of assembled genomes and others of raw reads. We consider how to reduce the amount of space needed to index a set of reads, in particular how to reduce the number of runs in the Burrows-Wheeler Transform (BWT) that is the basis of FM-indexing. The best current fully-functional index for repetitive collections (Gagie et al., SODA 2018) uses space proportional to the number of runs in the BWT.
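To make the quantity being minimized concrete, the following is a minimal sketch of what "the number of runs in the BWT" means. It is an illustration only, not the method of the paper: it builds the BWT of a single string by sorting all rotations (quadratic space, fine for toy inputs), and it assumes a single `$` sentinel that sorts before every other character. The function names `bwt` and `count_runs` are hypothetical and chosen for this example.

```python
from itertools import groupby


def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT: sort all rotations of text + sentinel, keep the last column.

    Assumption: the sentinel does not occur in the text and compares
    smaller than every character of the text.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_runs(s: str) -> int:
    """Count maximal runs of equal consecutive characters in s."""
    return sum(1 for _ in groupby(s))


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)              # annb$aa
    print(count_runs(transformed))  # 5
```

For `banana` the transform is `annb$aa`, which has 5 runs (`a`, `nn`, `b`, `$`, `aa`). An index whose size is proportional to the number of runs gets smaller whenever equal characters are grouped into fewer, longer runs.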