EMPIRICAL TRANSITION PROBABILITY INDEXING GENOME SEQUENCE ALIGNMENT BASED ON CUDA

Han, Dong

View/Open

2016_Han_Dong_Thesis.pdf (1.064Mb)

2016_Han_Dong_Thesis.docx (862.6Kb)

Date

2016-05

Author

Han, Dong


Abstract

After deoxyribonucleic acid (DNA) was discovered, finding similarities in proteins became a fundamental procedure. In recent years, alignment technologies have developed rapidly. Alignment is the basic operation used to compare biological sequences and to determine similarities that may indicate structural, functional, or biological process relationships. These technologies produce data on the order of many gigabase pairs per day. A Graphics Processing Unit (GPU) can process data at this scale: because a GPU consists of multiple pipelines, it can serve as a massively parallel processor for computation. This hardware creates new opportunities to study and improve the algorithms currently used in DNA alignment research. In this thesis, we propose a new algorithm to tackle this problem. We match blocks of reference and target sequences based on the similarities between their empirical transition probability matrices. The computations were conducted on an NVIDIA GTX 760 equipped with 2 GB of memory, running Microsoft Windows 8.1 Professional. Our experimental results show robustness in nucleotide sequence alignment, and the parallelized transition probability indexing on a GPU runs faster than the sequential CPU method proposed in a former study.
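The block-matching idea in the abstract can be illustrated with a minimal CPU sketch in Python. It is not the thesis's CUDA implementation. The function names, the sliding-window search, and the sum-of-absolute-differences distance are all assumptions made for illustration, since the abstract does not specify the similarity measure or the block layout.

```python
ALPHABET = "ACGT"
INDEX = {base: i for i, base in enumerate(ALPHABET)}


def transition_matrix(block):
    """Return the 4x4 row-normalized matrix of adjacent-base transitions."""
    counts = [[0.0] * 4 for _ in range(4)]
    for a, b in zip(block, block[1:]):
        # Skip pairs containing symbols outside A, C, G, T (e.g. N).
        if a in INDEX and b in INDEX:
            counts[INDEX[a]][INDEX[b]] += 1.0
    for row in counts:
        total = sum(row)
        if total > 0:
            for j in range(4):
                row[j] /= total
    return counts


def distance(p, q):
    """Sum of absolute entry differences (an assumed similarity measure)."""
    return sum(abs(p[i][j] - q[i][j]) for i in range(4) for j in range(4))


def best_match(reference, target_block):
    """Slide a window over the reference and return (offset, distance)
    for the first window whose matrix is closest to the target block's."""
    size = len(target_block)
    target = transition_matrix(target_block)
    best_offset, best_dist = -1, float("inf")
    for offset in range(len(reference) - size + 1):
        window = transition_matrix(reference[offset:offset + size])
        d = distance(window, target)
        if d < best_dist:
            best_offset, best_dist = offset, d
    return best_offset, best_dist


reference = "AAAAAAAA" + "ACGGTCAT" + "CCCCCCCC"
offset, dist = best_match(reference, "ACGGTCAT")
print(offset, dist)
```

Each window is scored independently of the others, so the loop in `best_match` is the kind of work that could be distributed across GPU threads. Two different blocks can share the same transition matrix, so a match found this way is a candidate location rather than a verified alignment.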

URI

https://hdl.handle.net/11244/34622

Collections

OU - Theses [2188]
