Show simple item record

dc.contributor.advisorYen, Gary G.
dc.contributor.authorWu, Zheng
dc.date.accessioned2013-12-10T18:05:53Z
dc.date.available2013-12-10T18:05:53Z
dc.date.issued2006-07
dc.identifier.urihttps://hdl.handle.net/11244/7879
dc.description.abstractThe Self-Organizing Map (SOM) is an unsupervised neural network model that provides topology-preserving mapping from high-dimensional input spaces onto a commonly two-dimensional output space. In this study, the clustering and visualization capabilities of the SOM, especially in the analysis of textual data, i.e. document collections, are reviewed and further developed. A novel clustering and visualization approach based on the SOM is proposed for the task of text data mining. The proposed approach first transforms the document space into a multi-dimensional vector space by means of document encoding. Then a growing hierarchical SOM (GHSOM) is trained and used as a baseline framework, which automatically produces maps with various levels of details. Following the training of the GHSOM, a novel projection method, namely the Ranked Centroid Projection (RCP), is applied to project the input vectors onto a hierarchy of two-dimensional output maps. The projection of the input vectors is treated as a vector interpolation into a two-dimensional regular map grid. A ranking scheme is introduced to select the nearest R units around the input vector in the original data space, the positions of which will be taken into account in computing the projection coordinates.
dc.description.abstractThe proposed approach can be used both as a data analysis tool and as a direct interface to the data. Its applicability has been demonstrated in this study using an illustrative data set and two real-world document clustering tasks, i.e. the SOM paper collection and the Anthrax paper collection. Based on the proposed approach, a software toolbox is designed for analyzing and visualizing document collections, which provides a user-friendly interface and several exploration and analysis functions.
dc.description.abstractThe presented SOM-based approach incorporates several unique features, such as the adaptive structure, the hierarchical training, the automatic parameter adjustment and the incremental clustering. Its advantages include the ability to convey a large amount of information in a limited space with comparatively low computation load, the potential to reveal conceptual relationships among documents, and the facilitation of perceptual inferences on both inter-cluster and within-cluster relationships.
dc.formatapplication/pdf
dc.languageen_US
dc.rightsCopyright is held by the author who has granted the Oklahoma State University Library the non-exclusive right to share this material in its institutional repository. Contact Digital Library Services at lib-dls@okstate.edu or 405-744-9161 for the permission policy on the use, reproduction or distribution of this material.
dc.titleRanked centroid projection: A data visualization approach based on self-organizing maps
dc.contributor.committeeMemberTeague, Keith
dc.contributor.committeeMemberYarlagadda, Radha K. Rao
dc.contributor.committeeMemberDeYong, Camille F.
osu.filenameWu_okstate_0664D_1941.pdf
osu.accesstypeOpen Access
dc.type.genreDissertation
dc.type.materialText
thesis.degree.disciplineElectrical and Computer Engineering
thesis.degree.grantorOklahoma State University


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record