Date of Award

Fall 2012

Project Type


Program or Major

Computer Science

Degree Name

Master of Science

First Advisor

R Daniel Bergeron


In the past, genotyping (determining a set of alleles in an organism) has been an extremely challenging process. The time, monetary, and technology demands of the task have limited genotype data to a small variety of scientific model organisms with the capacity to conduct genetic crosses. New sequencing technology from companies such as NimbleGen, however, can generate custom organism-specific microarrays at relatively low cost. The combination of these arrays and the knowledge of a species' genome-wide SNPs allows genotype experiments, such as generation maps, QTL studies, and natural population variation studies, to be conducted on virtually any organism. Although the NimbleGen technology can create appropriate DNA information, there has been no software that can use this data for custom array-based genotyping.

This thesis describes a data pipeline that uses custom DNA microarrays to genotype organisms. The pipeline simplifies the genotyping process, and users can easily customize and run the tool. The pipeline's performance is improved by exploiting parallel aspects of the microarray data, which reduces the running time of the genotyping process from days or weeks to minutes or hours. We demonstrate that the pipeline is an effective tool for genotyping custom microarrays across a large number of loci, and describe the effects of user-controlled parameters.