Date of Award

Winter 2017

Project Type

Thesis

Program or Major

Computer Science

Degree Name

Master of Science

First Advisor

R. Daniel Bergeron

Second Advisor

Philip J Hatcher

Third Advisor

W. Kelley Thomas

Abstract

Whole metagenome shotgun sequencing is a powerful approach for assaying many aspects of microbial communities, including the functional and symbiotic potential of each contributing community member. The research community currently lacks tools that efficiently align DNA reads against protein references, the technique necessary for constructing functional profiles. This thesis details the creation of PALADIN, a novel modification of the Burrows-Wheeler Aligner that improves efficiency by orders of magnitude by mapping directly in protein space. Beyond performance, using PALADIN and its associated tools as the foundation of metagenomic pipelines also enables novel characterization and downstream analysis.

The accuracy and efficiency of PALADIN were compared against existing applications that employ nucleotide or protein alignment algorithms. Using both simulated and empirically obtained reads, PALADIN consistently outperformed all compared alignment tools across a variety of metrics, mapping reads nearly 8,000 times faster than the widely used protein aligner BLAST. A variety of analysis techniques were demonstrated using these data, including detecting horizontal gene transfer, performing taxonomic grouping, and generating declustered references.
