Electronic Theses and Dissertation Database

Title page for ETD etd-04202008-221608


Type of Document Dissertation
Author ALTUN, GULSAH
Author's Email Address galtun@student.gsu.edu
URN etd-04202008-221608
Title MACHINE LEARNING AND GRAPH THEORY APPROACHES FOR CLASSIFICATION AND PREDICTION OF PROTEIN STRUCTURE
Degree Ph.D.
Department Computer Science
Advisory Committee
  • Robert W. Harrison (Committee Chair)
  • Yi Pan (Committee Co-Chair)
  • Alexander Zelikovsky (Committee Member)
  • Phang C. Tai (Committee Member)
Keywords
  • protein structure prediction
  • feature selection
  • support vector machines
  • graph theory
  • machine learning
  • algorithm
Date of Defense 2008-03-28
Availability unrestricted
Abstract
Many methods have recently been proposed for classification and prediction problems in bioinformatics. One of these problems is protein structure prediction, for which both machine learning approaches and new algorithms have been proposed. Among the machine learning approaches, Support Vector Machines (SVMs) have attracted considerable attention because of their high prediction accuracy. Because protein data consists of sequence and structural information, another widely used approach is to model this structured data with graphs. Graph theory has been widely studied in computer science; however, it has only recently been applied to bioinformatics. In this work, we introduce new algorithms based on statistical methods, graph theory concepts, and machine learning for the protein structure prediction problem. We introduce a new statistical method based on z-scores for seed selection in proteins. We also introduce a new feature selection method based on finding common cliques in protein data, which reduces noise in the data. Finally, we introduce new binary classifiers for predicting structural transitions in proteins. These classifiers achieve much higher accuracy than current traditional binary classifiers.
Files
  Filename: altun_gulsah_200805_phd.pdf
  Size: 885.56 Kb
  Approximate Download Time (Hours:Minutes:Seconds):
    28.8 Modem: 00:04:05
    56K Modem: 00:02:06
    ISDN (64 Kb): 00:01:50
    ISDN (128 Kb): 00:00:55
    Higher-speed Access: 00:00:04
