
Type of Document: Dissertation
Author: Mao, Weidong
Author's Email Address: wmao@cs.gsu.edu
URN: etd-07172006-174446
Title: Prediction of Genetic Susceptibility to Complex Diseases
Degree: Ph.D.
Department: Computer Science
Advisory Committee
- Alex Zelikovsky, Committee Chair
- Andrey Perelygin, Committee Member
- Anu Bourgeois, Committee Member
- Robert Harrison, Committee Member
Keywords
- Complex diseases
- Susceptibility prediction
- Disease association
- Genetic susceptibility
- Phasing
Date of Defense: 2006-06-06
Availability: restricted
Abstract
The accessibility of high-throughput biology data has brought a great deal of attention to disease association studies. High-density maps of single nucleotide polymorphisms (SNPs), as well as massive genotype data sets covering large numbers of individuals and SNPs, have become publicly available. So far, most analysis of the new data has been undertaken by the statistics community. In this dissertation, we pursue a different line of attack on genetic susceptibility to complex diseases, one rooted in computer science, with an emphasis on design rather than analytical methodology.
The main goal of disease association analysis is to identify gene variations contributing to the risk of and/or susceptibility to a particular disease. Susceptibility analysis involves two main steps: (i) haplotyping of the population and (ii) predicting the genetic susceptibility to diseases. Although many phasing methods exist for step (i), phasing and missing data recovery for family trio data lag behind, even though most disease association studies are based on family trios. This study is devoted to the problem of using accumulated information to predict genotype susceptibility to complex diseases with high accuracy and statistical power. The dissertation proposes two new solution methods for step (i), based on greedy and integer linear programming approaches. It also proposes several universal and ad hoc methods for step (ii).
The quality of the susceptibility prediction algorithms has been assessed using leave-one-out and leave-many-out tests and shown to be statistically significant based on randomization tests. The prediction of disease status can also be viewed as an integrated risk factor. A combinatorial prediction complexity measure has been proposed for case/control studies. The best prediction rates achieved by the proposed algorithms are 69.5% for Crohn's disease and 61.3% for autoimmune disorder, which are significantly higher than those achieved by universal prediction methods such as Support Vector Machine (SVM) and by known statistical methods.
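The evaluation protocol named in the abstract (a leave-one-out test followed by a randomization test) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the toy genotypes, the 0/1/2 SNP encoding, the Hamming-distance nearest-neighbor predictor, and all function names are hypothetical stand-ins for the proposed algorithms.

```python
import random

# Hypothetical toy data: each genotype is a tuple of SNP values (0, 1, or 2);
# labels are 0 for control and 1 for case. Not data from the dissertation.
GENOTYPES = [(0, 0, 1, 2), (0, 0, 1, 1), (0, 1, 1, 2),
             (2, 2, 0, 0), (2, 1, 0, 0), (2, 2, 1, 0)]
LABELS = [0, 0, 0, 1, 1, 1]


def hamming(a, b):
    """Number of SNP positions at which two genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def predict_nearest(train, query):
    """Stand-in predictor: return the label of the closest training genotype."""
    return min(train, key=lambda item: hamming(item[0], query))[1]


def leave_one_out_accuracy(genotypes, labels):
    """Hold out each individual in turn and predict it from all the others."""
    correct = 0
    for i, (held_out, true_label) in enumerate(zip(genotypes, labels)):
        train = [(g, l) for j, (g, l) in enumerate(zip(genotypes, labels))
                 if j != i]
        if predict_nearest(train, held_out) == true_label:
            correct += 1
    return correct / len(genotypes)


def randomization_p_value(genotypes, labels, trials=200, seed=0):
    """Estimate how often shuffled labels reach the observed accuracy."""
    rng = random.Random(seed)
    observed = leave_one_out_accuracy(genotypes, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if leave_one_out_accuracy(genotypes, shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (at_least_as_good + 1) / (trials + 1)


if __name__ == "__main__":
    accuracy, p_value = randomization_p_value(GENOTYPES, LABELS)
    print(f"leave-one-out accuracy: {accuracy:.3f}, p-value: {p_value:.3f}")
```

A small p-value here would suggest that the leave-one-out accuracy is unlikely to arise when disease labels are unrelated to genotypes, which is the sense in which a randomization test supports statistical significance.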
Files
Filename: weidong_mao_200608_phd.pdf
Size: 642.50 Kb
Approximate Download Time (Hours:Minutes:Seconds)
- 28.8 Modem: 00:02:58
- 56K Modem: 00:01:31
- ISDN (64 Kb): 00:01:20
- ISDN (128 Kb): 00:00:40
- Higher-speed Access: 00:00:03
Note: a restricted-access marker indicates that a file or directory is accessible from the Georgia State University campus network only.