dc.contributor.advisor: Cheng, Samuel
dc.contributor.author: Zhao, Zhihao
dc.date.accessioned: 2022-12-09T21:29:32Z
dc.date.available: 2022-12-09T21:29:32Z
dc.date.issued: 2022-12
dc.identifier.uri: https://hdl.handle.net/11244/336912
dc.description.abstract: Current deep models have achieved human-like accuracy in many computer vision tasks and sometimes even surpass humans. However, these models still suffer from significant weaknesses: for example, it is hard to interpret how they reach decisions, and they are easy to attack with tiny perturbations. A capsule, usually implemented as a vector, represents an object or an object part. Capsule networks and GLOM consist of classic and generalized capsules respectively; the difference is whether a capsule is limited to representing one fixed entity. Both models are designed to parse their input into a part-whole hierarchy, as humans do, where each capsule corresponds to an entity in the hierarchy. That is, the first layer finds the lowest-level visual patterns, and the following layers assemble progressively larger patterns up to the entire object, e.g., from nostril to nose, face, and person. This design gives capsule networks and GLOM the potential to solve the above problems of current deep models by mimicking how humans overcome them with the part-whole hierarchy. However, their current implementations do not fully realize this potential and require further improvement in areas including intrinsic interpretability, guaranteed equivariance, robustness to adversarial attacks, routing efficiency, and compatibility with other models. In this dissertation, I first briefly introduce the motivations, essential ideas, and existing implementations of capsule networks and GLOM, then focus on addressing some limitations of these implementations. The improvements are summarized as follows. First, a fast non-iterative routing algorithm is proposed for capsule networks, which facilitates their application to tasks such as image classification and segmentation. Second, a new architecture based on GLOM, named Twin-Islands, is proposed, which achieves many of the desired properties, such as equivariance, model interpretability, and adversarial robustness. Lastly, the essential idea of capsule networks and GLOM is re-implemented in a small group ensemble block, which can also be used together with other types of neural networks, e.g., CNNs, on various tasks such as image classification, segmentation, and retrieval. [en_US]
dc.language: en_US [en_US]
dc.rights: Attribution-NonCommercial-ShareAlike 4.0 International [*]
dc.rights.uri: https://creativecommons.org/licenses/by-nc-sa/4.0/ [*]
dc.subject: Capsule Networks and GLOM [en_US]
dc.subject: Part-whole Relationships [en_US]
dc.subject: Routing and Propagation Algorithms [en_US]
dc.subject: Random Subspace Ensemble [en_US]
dc.title: Enhanced Capsule-based Networks and Their Applications [en_US]
dc.contributor.committeeMember: MacDonald, Gregory
dc.contributor.committeeMember: Zhao, Shangqing
dc.contributor.committeeMember: Zheng, Bin
dc.date.manuscript: 2022-12
dc.thesis.degree: Ph.D. [en_US]
ou.group: Gallogly College of Engineering::School of Electrical and Computer Engineering [en_US]
shareok.orcid: 0000-0001-6135-3510 [en_US]
shareok.nativefileaccess: restricted [en_US]



