Abstract
Although hash-based approaches to sequence alignment and genome assembly are long established, their utility is predicated on the rapid identification of exact k-mers from a hash-map or similar data structure. We describe how a fuzzy hash-map can be applied to quickly and accurately align a prokaryotic genome to the reference genome of a related species. Using this technique, a draft genome of Mycoplasma genitalium, sampled at 1X coverage, was accurately anchored against the genome of Mycoplasma pneumoniae. The fuzzy approach to alignment ordered and orientated more than 65% of the reads from the draft genome in under 10 seconds, with an error rate of 1.5%. Without sacrificing execution speed, fuzzy hash-maps also provide a mechanism for tolerating errors and variability in k-mer-centric sequence alignment and assembly applications.
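The abstract does not specify how the fuzzy hash-map is built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that "fuzzy" means tolerating a single substitution per k-mer, and it achieves this by querying an ordinary exact k-mer index with every Hamming-distance-1 neighbour of the query. The function names (`build_index`, `fuzzy_lookup`, `anchor_read`) and the offset-voting step used to place a read are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield (position, k-mer) for every k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def build_index(reference, k):
    """Exact k-mer index: k-mer -> list of start positions in the reference."""
    index = defaultdict(list)
    for pos, kmer in kmers(reference, k):
        index[kmer].append(pos)
    return index


def neighbours(kmer, alphabet="ACGT"):
    """Yield the k-mer itself and every k-mer at Hamming distance 1."""
    yield kmer
    for i, base in enumerate(kmer):
        for alt in alphabet:
            if alt != base:
                yield kmer[:i] + alt + kmer[i + 1:]


def fuzzy_lookup(index, kmer):
    """Reference positions whose k-mer differs from `kmer` by at most one base."""
    hits = set()
    for candidate in neighbours(kmer):
        # .get avoids creating empty entries in the defaultdict
        hits.update(index.get(candidate, ()))
    return sorted(hits)


def anchor_read(index, read, k):
    """Return the reference offset supported by the most fuzzy k-mer hits.

    Each hit votes for (reference position - read position); returns None
    if no k-mer of the read matches the reference.
    """
    votes = defaultdict(int)
    for read_pos, kmer in kmers(read, k):
        for ref_pos in fuzzy_lookup(index, kmer):
            votes[ref_pos - read_pos] += 1
    if not votes:
        return None
    # Highest vote count wins; ties go to the smallest offset.
    return min(votes, key=lambda offset: (-votes[offset], offset))
```

Under these assumptions, a read carrying a single substitution can still be placed even when most of its k-mers have no exact match in the index. Each fuzzy query costs 3k + 1 exact lookups, so lookup time remains independent of reference length. Orientation could be handled in the same way by also querying the reverse complement of each read, which this sketch omits.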
| Original language | English (Ireland) |
|---|---|
| Title of host publication | 5TH INTERNATIONAL CONFERENCE ON PRACTICAL APPLICATIONS OF COMPUTATIONAL BIOLOGY & BIOINFORMATICS (PACBB 2011) |
| Publisher | SPRINGER-VERLAG BERLIN |
| Volume | 93 |
| ISBN (Electronic) | 1867-5662 |
| ISBN (Print) | 1867-5662 |
| Publication status | Published - 1 Jan 2011 |
Authors
- Healy, J; Chambers, D
Paper title: Fast and Accurate Genome Anchoring Using Fuzzy Hash Maps