Enhanced Capsule-based Networks and Their Applications

Zhao, Zhihao

dc.contributor.advisor	Cheng, Samuel
dc.contributor.author	Zhao, Zhihao
dc.date.accessioned	2022-12-09T21:29:32Z
dc.date.available	2022-12-09T21:29:32Z
dc.date.issued	2022-12
dc.identifier.uri	https://hdl.handle.net/11244/336912
dc.description.abstract	Current deep models have achieved human-level accuracy in many computer vision tasks, sometimes even surpassing humans. However, these deep models still suffer from significant weaknesses. To name a few, it is hard to interpret how they reach decisions, and they are easily attacked with tiny perturbations. A capsule, usually implemented as a vector, represents an object or object part. Capsule networks and GLOM consist of classic and generalized capsules respectively; the difference is whether a capsule is limited to representing a fixed thing. Both models are designed to parse their input into a part-whole hierarchy as humans do, where each capsule corresponds to an entity in the hierarchy. That is, the first layer finds the lowest-level visual patterns, and the following layers assemble them into larger patterns until the entire object is formed, e.g., from nostril to nose, face, and person. This design gives capsule networks and GLOM the potential to solve the above problems of current deep models by mimicking how humans overcome them with the part-whole hierarchy. However, their current implementations do not yet fulfill this potential and require further improvements, including intrinsic interpretability, guaranteed equivariance, robustness to adversarial attacks, a more efficient routing algorithm, compatibility with other models, etc. In this dissertation, I first briefly introduce the motivations, essential ideas, and existing implementations of capsule networks and GLOM, then focus on addressing some limitations of these implementations. The improvements are briefly summarized as follows. First, a fast non-iterative routing algorithm is proposed for capsule networks, which facilitates their application to many tasks such as image classification and segmentation.
Second, a new architecture, named Twin-Islands, is proposed based on GLOM, which achieves many desired properties such as equivariance, model interpretability, and adversarial robustness. Lastly, the essential idea of capsule networks and GLOM is re-implemented in a small group ensemble block, which can also be used along with other types of neural networks, e.g., CNNs, on various tasks such as image classification, segmentation, and retrieval.	en_US
dc.language	en_US	en_US
dc.rights	Attribution-NonCommercial-ShareAlike 4.0 International	*
dc.rights.uri	https://creativecommons.org/licenses/by-nc-sa/4.0/	*
dc.subject	Capsule Networks and GLOM	en_US
dc.subject	Part-whole Relationships	en_US
dc.subject	Routing and Propagation Algorithms	en_US
dc.subject	Random Subspace Ensemble	en_US
dc.title	Enhanced Capsule-based Networks and Their Applications	en_US
dc.contributor.committeeMember	MacDonald, Gregory
dc.contributor.committeeMember	Zhao, Shangqing
dc.contributor.committeeMember	Zheng, Bin
dc.date.manuscript	2022-12
dc.thesis.degree	Ph.D.	en_US
ou.group	Gallogly College of Engineering::School of Electrical and Computer Engineering	en_US
shareok.orcid	0000-0001-6135-3510	en_US
shareok.nativefileaccess	restricted	en_US