dc.contributor.advisor: Cheng, Samuel
dc.contributor.advisor: Verma, Pramode
dc.contributor.author: Roozgard, Aminmohammad
dc.date.accessioned: 2014-12-11T13:48:06Z
dc.date.available: 2014-12-11T13:48:06Z
dc.date.issued: 2014-11-10
dc.identifier.uri: https://hdl.handle.net/11244/13865
dc.description.abstract: Advances in human genome sequencing technology have significantly reduced the cost of data generation, and the resulting data volume overwhelms the computing capacity available for sequence analysis. Efficiency, efficacy and scalability remain challenging in sequence alignment, which is an important and foundational operation for genome data analysis. In this dissertation, I propose a two-stage approach to tackle this problem. In the preprocessing step, I match blocks of reference and target genome sequences based on the similarities between their empirical transition probability distributions using belief propagation. I then conduct a refined match using our recently published SCoBeP technique. I extract features from the neighbors of an input nucleotide (a sequence of neighboring nucleotides with the input nucleotide at its center) and leverage sparse coding to find a set of candidate nucleotides, followed by Belief Propagation (BP) to rank these candidates. Our experimental results demonstrated robustness in nucleotide sequence alignment, and our results are competitive with those of the SOAP aligner and the BWA algorithm. In addition, most genomic datasets are not publicly accessible due to privacy concerns. Patients' genomic data contain identifiable markers and can be used to determine the presence of an individual in a dataset. Prior research shows that re-identification is possible even when only a very small set of genomic data is released. To protect patients, data owners impose an application and evaluation procedure that often takes months to complete and limits researchers. One solution to this problem is to let each data owner publish a set of pilot data to help data users choose the right datasets for their needs. The data owners release these pilot data together with the noise parameters and the mechanism that they used. A data user can run any kind of association test and compare the outcomes with the outputs of other datasets to get an idea of which datasets can be useful. I present a privacy-preserving genomic data dissemination algorithm based on compressed sensing. In my proposed method, I add noise to the sparse representation of the input vector to make it differentially private. Specifically, I find the sparse representation using Subspace Pursuit and then perturb it with sufficient Laplacian noise. I compare my method with a state-of-the-art compressed sensing privacy protection method. (en_US)
dc.language: en_US (en_US)
dc.subject: MEDICAL SIGNALS ALIGNMENT (en_US)
dc.subject: PRIVACY PROTECTION (en_US)
dc.title: MEDICAL SIGNALS ALIGNMENT AND PRIVACY PROTECTION USING BELIEF PROPAGATION AND COMPRESSED SENSING (en_US)
dc.contributor.committeeMember: Ray, William
dc.contributor.committeeMember: Sluss, James
dc.contributor.committeeMember: Liu, Hong
dc.date.manuscript: 2014-11-10
dc.thesis.degree: Ph.D. (en_US)
ou.group: College of Engineering::School of Electrical and Computer Engineering (en_US)
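
The noise step named in the abstract (perturbing a sparse representation with Laplacian noise) can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the function names, the `sensitivity` and `epsilon` parameters, and the choice to add noise to every coefficient are assumptions made for illustration, and the sparse vector is taken as already computed (the Subspace Pursuit step is not shown). The noise scale `sensitivity / epsilon` is the standard calibration of the Laplace mechanism.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_sparse_vector(coeffs, sensitivity, epsilon, seed=None):
    """Add Laplace noise to each coefficient of a sparse representation.

    `sensitivity` (L1 sensitivity of the coefficients) and `epsilon`
    (privacy budget) are hypothetical inputs; the values used in the
    dissertation are not stated in this record.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon  # Laplace mechanism noise scale
    return [c + laplace_noise(scale, rng) for c in coeffs]


if __name__ == "__main__":
    sparse = [0.0, 2.5, 0.0, -1.0]  # assumed, already-computed sparse coefficients
    print(perturb_sparse_vector(sparse, sensitivity=1.0, epsilon=0.5, seed=42))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy; the `seed` argument exists only to make the sketch reproducible.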

