Localization of tables and plots in documents using deep neural networks

dc.contributor.advisorCheng, Samuel
dc.contributor.authorKadiyala, Vishnu Priyatamkumar
dc.contributor.committeeMemberZheng, Bin
dc.contributor.committeeMemberMetcalf, Justin
dc.date.accessioned2022-05-11T20:27:00Z
dc.date.available2022-05-11T20:27:00Z
dc.date.issued2022-05-13
dc.date.manuscript2022
dc.description.abstractThe number of scientific publications released every day has grown immensely, making it increasingly difficult to keep up with new results. In this research, we localized and detected the plots and tables in documents using deep neural networks. We generated a custom document dataset and manually annotated it to train and evaluate object detection models and to assess their customizability. We used two Single Shot MultiBox Detector (SSD) models, a MobileNet-based SSD and RetinaNet, and a CenterNet model. We trained these models for 10,000 epochs on the custom dataset. All three models were able to localize and detect plots and tables with accurately predicted bounding boxes. CenterNet achieved the highest mAP score of 92 and the highest AR of 93.88, followed by RetinaNet with an mAP score of 91.1 and an AR of 93.76, and lastly the MobileNet-based SSD with an mAP score of 89.04 and an AR of 91.54.en_US
dc.identifier.urihttps://hdl.handle.net/11244/335695
dc.languageen_USen_US
dc.rightsAttribution 4.0 International*
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/*
dc.subjectObject Detectionen_US
dc.subjectCustom dataseten_US
dc.subjectLocalizationen_US
dc.subjectDeep Neural Networksen_US
dc.thesis.degreeMaster of Scienceen_US
dc.titleLocalization of tables and plots in documents using deep neural networksen_US
ou.groupGallogly College of Engineering::School of Electrical and Computer Engineeringen_US

Files

Original bundle
Name:
2022_Kadiyala_Vishnu Priyatamkumar_Thesis.pdf
Size:
1.47 MB
Format:
Adobe Portable Document Format
Description:
Name:
2022_Kadiyala_Vishnu Priyatamkumar_Thesis.docx
Size:
2.2 MB
Format:
Microsoft Word XML
Description:
License bundle
Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission
Description:
