Electronic Theses and Dissertation Database

Title page for ETD etd-12072004-120854


Type of Document Master's Thesis
Author Panaganti, Shilpa
URN etd-12072004-120854
Title Parallel SVM with Application to Protein Structure Prediction
Degree Master of Science
Department Computer Science
Advisory Committee
Advisor Name Title
Dr. Yi Pan Committee Chair
Dr. Michael Weeks Committee Member
Dr. Rajshekhar Sunderraman Committee Member
Keywords
  • SVMlight
  • OpenMP
  • Pthreads
  • VC dimension
  • SRM
  • ERM
Date of Defense 2004-11-09
Availability restricted
Abstract
Training a Support Vector Machine (SVM) on thousands of examples demands large amounts of memory and time. SVMlight, by Dr. Thorsten Joachims, is implemented in C and uses a fast optimization algorithm to handle thousands of support vectors. SVMlight addresses classification, pattern recognition, regression, and the learning of ranking functions. The C code also provides methods for XiAlpha estimation of the error rate and precision. Implementing these two methods leads to good generalization performance of the Support Vector Machine, even for computation-intensive text classification tasks. The SVMlight code allows users to define their own kernel functions. The SVMlight software employs an efficient algorithm and minimizes the cost, but it still takes a considerable amount of time to process thousands of support vectors and training examples. This time can be reduced further by parallelizing the code.

In our work we refined the SVMlight code by removing unnecessary iterations and rewriting it to be more cost-efficient. We then parallelized the code separately with two forms of shared-memory parallelism, OpenMP and POSIX Threads (Pthreads). Both parallel versions were built with Intel’s C compiler for Linux 7.1 and use Hyper-Threading Technology. The parallelized code was tested on protein structure prediction. Different types of protein sequences were tested with both methods while varying the number of training examples and support vectors. Execution time and speedup were calculated for both OpenMP and Pthreads. Implementing OpenMP and Pthreads together showed a good increase in speedup.
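
As an illustration of the shared-memory approach described above, the following is a minimal sketch of how one row of an SVM kernel matrix might be computed with an OpenMP parallel loop. It is not taken from the thesis or from SVMlight: the function names `linear_kernel`, `kernel_row`, and `speedup`, the dense row-major data layout, and the choice of a linear kernel are all assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical dense linear kernel: dot product of two feature vectors.
   (SVMlight itself uses sparse vectors; dense arrays are assumed here.) */
static double linear_kernel(const double *a, const double *b, size_t dim)
{
    double sum = 0.0;
    for (size_t k = 0; k < dim; k++)
        sum += a[k] * b[k];
    return sum;
}

/* Fill row[j] = K(x_i, x_j) for all n training examples.
   x is an n-by-dim matrix stored row-major. Each iteration writes a
   distinct element of row, so the loop can be shared among threads
   without locking. If OpenMP is not enabled at compile time, the
   pragma is ignored and the loop runs serially. */
void kernel_row(const double *x, size_t n, size_t dim, size_t i, double *row)
{
    long j;
    #pragma omp parallel for schedule(static)
    for (j = 0; j < (long)n; j++)
        row[j] = linear_kernel(x + i * dim, x + (size_t)j * dim, dim);
}

/* Speedup as conventionally defined: serial time over parallel time. */
double speedup(double t_serial, double t_parallel)
{
    return t_serial / t_parallel;
}
```

With three two-dimensional examples (1,2), (3,4), and (5,6), calling `kernel_row` for the first example fills the row with 5, 11, and 17. A Pthreads version of the same idea would split the index range by hand and give each thread its own slice of `row`.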

Files
Filename [GSU] panaganti_shilpa_200412_ms.pdf
Size 573.51 Kb
Approximate Download Time (Hours:Minutes:Seconds)
  • 28.8 Modem 00:02:39
  • 56K Modem 00:01:21
  • ISDN (64 Kb) 00:01:11
  • ISDN (128 Kb) 00:00:35
  • Higher-speed Access 00:00:03
[GSU] indicates that a file or directory is accessible from the Georgia State University campus network only.
