dc.contributor.advisor: Johnson, Thomas P.
dc.contributor.author: Maddinani, Sarath Kumar
dc.date.accessioned: 2018-06-29T14:41:03Z
dc.date.available: 2018-06-29T14:41:03Z
dc.date.issued: 2016-12-01
dc.identifier.uri: https://hdl.handle.net/11244/300359
dc.description.abstract: Data cleaning, also called data cleansing, is the process of detecting and then correcting or removing errors and inconsistencies in data in order to improve data quality. Data is classified into two types: structured and unstructured. Standard techniques and tools are available to handle structured data, but most of the data generated in today's world is unstructured. A graph database stores data as nodes and the relationships between those nodes. Neo4j is a graph database tool that is queried using the Cypher query language. We define and implement in Neo4j an extensive set of cleaning algorithms for ontology-based spelling correction and for other inconsistencies in data. We conclude by validating these cleaning algorithms using data visualization techniques. Visualizations of the data before and after the cleaning algorithms are applied demonstrate the effectiveness of the proposed algorithms.
dc.format: application/pdf
dc.language: en_US
dc.rights: Copyright is held by the author, who has granted the Oklahoma State University Library the non-exclusive right to share this material in its institutional repository. Contact Digital Library Services at lib-dls@okstate.edu or 405-744-9161 for the permission policy on the use, reproduction, or distribution of this material.
dc.title: Data Cleaning on Graph Databases Using NEO4J: Spelling Correction Using Ontology and Visualization
dc.contributor.committeeMember: Christopher, Crick
dc.contributor.committeeMember: David, Cline
osu.filename: Maddinani_okstate_0664M_15001.pdf
osu.accesstype: Open Access
dc.description.department: Computer Science
dc.type.genre: Thesis
dc.type.material: text

