Show simple item record

dc.contributor.advisor          Cheng, Samuel
dc.contributor.author           Kadiyala, Vishnu Priyatamkumar
dc.date.accessioned             2022-05-11T20:27:00Z
dc.date.available               2022-05-11T20:27:00Z
dc.date.issued                  2022-05-13
dc.identifier.uri               https://hdl.handle.net/11244/335695
dc.description.abstract         The number of scientific publications published every day has grown immensely, making it increasingly difficult to keep up with new results. In this research, we localized and detected plots and tables in documents using deep neural networks. We generated a custom document dataset and manually annotated it to train and evaluate object detection models and their customizability. We used two single-shot multibox detector (SSD) models, one with a MobileNet backbone and one RetinaNet variant, along with a CenterNet model, and trained them for 10000 epochs on the custom dataset. All three models localized and detected plots and tables with accurately predicted bounding boxes. CenterNet achieved the highest mAP score of 92 and the highest AR of 93.88, followed by RetinaNet with an mAP of 91.1 and an AR of 93.76, and finally the MobileNet-based SSD with an mAP of 89.04 and an AR of 91.54.    en_US
dc.language                     en_US    en_US
dc.rights                       Attribution 4.0 International    *
dc.rights.uri                   https://creativecommons.org/licenses/by/4.0/    *
dc.subject                      Object Detection    en_US
dc.subject                      Custom dataset    en_US
dc.subject                      Localization    en_US
dc.subject                      Deep Neural Networks    en_US
dc.title                        Localization of tables and plots in documents using deep neural networks    en_US
dc.contributor.committeeMember  Zheng, Bin
dc.contributor.committeeMember  Metcalf, Justin
dc.date.manuscript              2022
dc.thesis.degree                Master of Science    en_US
ou.group                        Gallogly College of Engineering::School of Electrical and Computer Engineering    en_US
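The abstract describes a detection pipeline in which an SSD with a MobileNet backbone, a RetinaNet variant, and a CenterNet model are trained on a custom annotated document dataset and then used to localize plots and tables with bounding boxes. The thesis text itself is not part of this record, but the minimal sketch below illustrates the kind of inference step such a pipeline implies, assuming a detector trained with the TensorFlow Object Detection API and exported as a SavedModel; the model path, label map, and score threshold are illustrative assumptions, not values taken from the thesis.

# Hypothetical sketch: run an exported detection model (e.g. an SSD-MobileNet,
# RetinaNet, or CenterNet checkpoint saved as a TensorFlow SavedModel) over a
# document page image and report predicted plot/table bounding boxes.
# MODEL_DIR, LABELS, and SCORE_THRESHOLD are assumptions for illustration only.
import numpy as np
import tensorflow as tf
from PIL import Image

MODEL_DIR = "exported_model/saved_model"   # assumed export location
LABELS = {1: "plot", 2: "table"}           # assumed class ids
SCORE_THRESHOLD = 0.5                      # assumed confidence cutoff

# Load the exported detection model once; the loaded object is callable.
detect_fn = tf.saved_model.load(MODEL_DIR)

def detect_page(image_path):
    """Return (box, label, score) triples for one document page image."""
    image = np.array(Image.open(image_path).convert("RGB"))
    # Exported Object Detection API models expect a uint8 batch [1, H, W, 3].
    input_tensor = tf.convert_to_tensor(image)[tf.newaxis, ...]
    detections = detect_fn(input_tensor)

    boxes = detections["detection_boxes"][0].numpy()    # normalized [ymin, xmin, ymax, xmax]
    classes = detections["detection_classes"][0].numpy().astype(int)
    scores = detections["detection_scores"][0].numpy()

    results = []
    for box, cls, score in zip(boxes, classes, scores):
        if score >= SCORE_THRESHOLD and cls in LABELS:
            results.append((box, LABELS[cls], float(score)))
    return results

if __name__ == "__main__":
    for box, label, score in detect_page("page_001.png"):
        print(f"{label}: score={score:.2f}, box={box}")

In this sketch the reported mAP and AR figures from the abstract would come from a separate COCO-style evaluation over the annotated test split, not from this inference loop.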



