
Type of Document: Master's Thesis
Author: Panaganti, Shilpa
URN: etd-12072004-120854
Title: Parallel SVM with Application to Protein Structure Prediction
Degree: Master of Science
Department: Computer Science
Advisory Committee
- Dr. Yi Pan, Committee Chair
- Dr. Michael Weeks, Committee Member
- Dr. Rajshekhar Sunderraman, Committee Member
Keywords
- SVMlight
- OpenMP
- Pthreads
- VC dimension
- SRM
- ERM
Date of Defense: 2004-11-09
Availability: restricted
Abstract
A Support Vector Machine (SVM) learning task with thousands of training examples demands large amounts of memory and time. SVMlight, written in C by Dr. Thorsten Joachims, uses a fast optimization algorithm to handle thousands of support vectors. SVMlight solves classification, pattern recognition, and regression problems and learns ranking functions. The C code also provides methods for XiAlpha estimation of the error rate and precision. Implementing these two methods leads to generalized performance of the Support Vector Machine, even for computation-intensive text classification tasks. The SVMlight code allows users to define their own kernel functions. The software employs an efficient algorithm and minimizes the cost, but it still takes a considerable amount of time to process thousands of support vectors and training examples. This time can be reduced further by parallelizing the code.
In our work we refined the SVMlight code by removing unnecessary iterations and rewriting it to be more cost efficient. We then parallelized the code separately with two forms of shared-memory parallelism, OpenMP and POSIX Threads (Pthreads). Both versions were parallelized using Intel's C compiler for Linux 7.1 with Hyper-Threading technology. The parallelized code was tested on protein structure prediction. Different types of protein sequences were tested with both methods by varying the number of training examples and support vectors. The time consumption and speedup were calculated for both OpenMP and Pthreads. Implementing OpenMP and Pthreads together showed a good increase in speedup.
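The idea of parallelizing kernel evaluations with OpenMP and measuring speedup can be illustrated with a minimal sketch. It is not taken from SVMlight or from the thesis: the abstract does not say which loops were parallelized, and the function names (`linear_kernel`, `kernel_row`, `speedup`), the dense row-major data layout, and the choice of a linear kernel are all assumptions made for illustration. Speedup is taken here in the usual sense, serial time divided by parallel time.

```c
#include <stddef.h>

/* Hypothetical linear kernel: dot product of two dense vectors of length dim. */
static double linear_kernel(const double *a, const double *b, size_t dim)
{
    double sum = 0.0;
    for (size_t k = 0; k < dim; k++)
        sum += a[k] * b[k];
    return sum;
}

/*
 * Fill row[j] = K(x_i, x_j) for j = 0 .. n-1.
 * Assumption: examples holds n vectors of length dim in row-major order.
 * Each iteration writes a distinct row[j], so the loop can be shared
 * among threads without locking. Without OpenMP support enabled in the
 * compiler, the pragma is ignored and the loop runs serially.
 */
static void kernel_row(const double *examples, size_t n, size_t dim,
                       size_t i, double *row)
{
    long j; /* signed loop index for compatibility with older OpenMP */
    #pragma omp parallel for schedule(static)
    for (j = 0; j < (long)n; j++)
        row[j] = linear_kernel(&examples[i * dim],
                               &examples[(size_t)j * dim], dim);
}

/* Speedup as the ratio of serial to parallel wall-clock time. */
static double speedup(double t_serial, double t_parallel)
{
    return t_serial / t_parallel;
}
```

To enable the pragma, a compiler's OpenMP option is needed (for example `-fopenmp` with GCC). A Pthreads version of the same loop would instead split the range of `j` into one chunk per thread by hand.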
Files
Filename: panaganti_shilpa_200412_ms.pdf
Size: 573.51 Kb
Approximate Download Time (Hours:Minutes:Seconds)
- 28.8 Modem: 00:02:39
- 56K Modem: 00:01:21
- ISDN (64 Kb): 00:01:11
- ISDN (128 Kb): 00:00:35
- Higher-speed Access: 00:00:03
Note: a file or directory marked as restricted is accessible from the Georgia State University campus network only.