
Type of Document: Dissertation
Author: He, Jingwu
Author's Email Address: jingwu@cs.gsu.edu
URN: etd-08292006-163900
Title: Algorithms for Computational Genetics Epidemiology
Degree: Ph.D.
Department: Computer Science
Advisory Committee:
- Dr. Alex Zelikovsky, Committee Chair
- Dr. Anu Bourgeois, Committee Member
- Dr. Ion Mandoiu, Committee Member
- Dr. Yi Pan, Committee Member
Keywords:
- Tagging
- Phasing
- Haplotype
- Genotype
- SNP
Date of Defense: 2006-05-15
Availability: unrestricted
Abstract
The most intriguing problems in genetics epidemiology are to predict genetic disease susceptibility and to associate single nucleotide polymorphisms (SNPs) with
diseases. In such studies, it is necessary to resolve the ambiguities in genetic
data. The primary obstacle to ambiguity resolution is that the physical methods for
separating two haplotypes from an individual genotype (phasing) are too expensive.
Although computational haplotype inference is a well-explored problem, high error
rates continue to degrade association accuracy. Second, it is essential to use a
small subset of informative SNPs (tag SNPs) accurately representing the rest of the
SNPs (tagging). Tagging can achieve budget savings by genotyping only a limited
number of SNPs and computationally inferring all other SNPs. Recent successes in
high-throughput genotyping technologies have drastically increased the length of available
SNP sequences. This raises the importance of informative SNP selection for
compacting huge genetic data sets so that fine genotype analysis becomes feasible.
Finally, even if complete and accurate data is available, it is unclear if common
statistical methods can determine the susceptibility of complex diseases.
The dissertation explores the above computational problems with a variety of
methods, including linear algebra, graph theory, linear programming, and greedy
methods. The contributions include (1) a significant speed-up of popular phasing tools
without compromising their quality, (2) state-of-the-art tagging tools applied to
disease association, and (3) a graph-based method for disease tagging and predicting
disease susceptibility.
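To make the phasing ambiguity mentioned in the abstract concrete, the following minimal Python sketch enumerates every unordered pair of haplotypes that could explain a single genotype. It is an illustration only and is not one of the dissertation's algorithms. The genotype encoding (0 and 1 for the two homozygous states, 2 for a heterozygous site) and the function name compatible_haplotype_pairs are assumptions introduced here.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs that explain a genotype.

    Assumed encoding for this sketch: 0 or 1 marks a homozygous site
    (both haplotypes carry that allele), 2 marks a heterozygous site
    (the haplotypes carry different alleles).
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 2]
    if not het_sites:
        # No heterozygous site: the genotype is unambiguous.
        h = tuple(genotype)
        return [(h, h)]

    # Fix the first heterozygous site to 0 in the first haplotype so that
    # each unordered pair is produced exactly once.
    first, rest = het_sites[0], het_sites[1:]
    pairs = []
    for bits in product((0, 1), repeat=len(rest)):
        h1 = list(genotype)
        h2 = list(genotype)
        h1[first], h2[first] = 0, 1
        for site, b in zip(rest, bits):
            h1[site], h2[site] = b, 1 - b
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


if __name__ == "__main__":
    for h1, h2 in compatible_haplotype_pairs([2, 0, 2, 1, 2]):
        print(h1, h2)
```

Under this encoding, a genotype with k heterozygous sites (k at least 1) has 2^(k-1) compatible haplotype pairs, which is why phasing cannot be resolved from a single genotype alone once two or more sites are heterozygous.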
Files
Filename: he_jingwu_200612_phd.pdf
Size: 932.05 Kb
Approximate Download Time (Hours:Minutes:Seconds):
- 28.8 Modem: 00:04:18
- 56K Modem: 00:02:13
- ISDN (64 Kb): 00:01:56
- ISDN (128 Kb): 00:00:58
- Higher-speed Access: 00:00:04