Show simple item record

dc.contributor.advisor: Fan, Guoliang
dc.contributor.author: Guo, Lin
dc.date.accessioned: 2021-05-25T20:32:03Z
dc.date.available: 2021-05-25T20:32:03Z
dc.date.issued: 2020-12
dc.identifier.uri: https://hdl.handle.net/11244/329911
dc.description.abstract: Intelligent robots require advanced vision capabilities to perceive and interact with the real physical world. While computer vision has made great strides in recent years, its predominant paradigm still builds deep-learning networks or handcrafted features to perform semantic labeling and instance segmentation separately and independently. However, the two tasks should be synergistically unified in the recognition flow, since they are complementary in scene understanding.
dc.description.abstract: This dissertation presents instance detection at multiple levels of scene understanding, with representations that enable intelligent systems not only to recognize what is seen (e.g., does that pixel represent a chair?) but also to predict contextual information about the complete 3D scene as a whole (e.g., how big is the chair? Is the chair placed next to a table?). More specifically, it presents a flow of understanding from local information to global fitness. First, we investigate the 3D geometry of instances and present a new approach for generating tight cuboids for objects. Then, we take advantage of trained semantic labeling networks by using intermediate layer outputs as per-category local detectors; instance hypotheses are generated to help traditional optimization methods achieve higher instance segmentation accuracy. After that, to bring the local detection results to holistic scene understanding, our method optimizes object instance segmentation considering both spatial fitness and relational compatibility. The context information is encoded using graphical models that represent scene-level object placement through three relation types: horizontal, vertical, and non-placement (hanging) relations. Finally, the context information is incorporated into a network structure: a deep learning-based re-inferencing framework is proposed to boost any pixel-level labeling output using our local collaborative object presence (LoCOP) feature as global-to-local guidance.
dc.description.abstract: This dissertation demonstrates that uniting pixel-level detection and instance segmentation not only significantly improves the overall performance of localized and individualized analysis, but also paves the way for holistic scene understanding.
dc.format: application/pdf
dc.language: en_US
dc.rights: Copyright is held by the author, who has granted the Oklahoma State University Library the non-exclusive right to share this material in its institutional repository. Contact Digital Library Services at lib-dls@okstate.edu or 405-744-9161 for the permission policy on the use, reproduction, or distribution of this material.
dc.title: Holistic indoor scene understanding by context supported instance segmentation
dc.contributor.committeeMember: Hagan, Martin
dc.contributor.committeeMember: Sheng, Weihua
dc.contributor.committeeMember: Jacob, Jamey
osu.filename: Guo_okstate_0664D_17039.pdf
osu.accesstype: Open Access
dc.type.genre: Dissertation
dc.type.material: Text
dc.subject.keywords: context
dc.subject.keywords: deep learning
dc.subject.keywords: graphical model
dc.subject.keywords: instance segmentation
dc.subject.keywords: scene understanding
thesis.degree.discipline: Electrical Engineering
thesis.degree.grantor: Oklahoma State University

