Show simple item record

dc.contributor.advisor: Thomas, Johnson P.
dc.contributor.author: Puram, Varun Teja
dc.date.accessioned: 2023-04-12T19:38:36Z
dc.date.available: 2023-04-12T19:38:36Z
dc.date.issued: 2022-07
dc.identifier.uri: https://hdl.handle.net/11244/337374
dc.description.abstract: In machine learning, a critical assumption is that the training and testing datasets have similar distributions. A model is effective when new test data resembles the past data on which it was trained. If the training and testing data differ substantially, the machine learning algorithm will produce less accurate results. In many applications, the data has dynamic periodicity; that is, it changes with time. As the distribution of the data keeps changing, the model will at some point have to be retrained.
dc.description.abstract: In this research, I look at the dynamic behavior of graph data. As the data changes, nodes and edges are added to or deleted from the graph. Because we are dealing with large sets of graph data, we use graph embedding vector spaces for training and testing. The embedding vector space differs at each timestamp, and retraining the model every time the data changes is expensive. To address these challenges, we use the dfs_dynode2vec algorithm, in which the graph embedding vectors for the current timestamp are initialized from the embedding vectors of the previous timestamp. At each timestamp, the data may change significantly or insignificantly. We propose a statistical model, ‘Significant testing’, that determines whether the model should be retrained. If the change is insignificant, the model need not be retrained, and embedding vectors for that timestamp are not generated. We consider several aspects in determining the statistical significance of the change, including edge centrality, betweenness centrality, and norm calculations.
dc.format: application/pdf
dc.language: en_US
dc.rights: Copyright is held by the author who has granted the Oklahoma State University Library the non-exclusive right to share this material in its institutional repository. Contact Digital Library Services at lib-dls@okstate.edu or 405-744-9161 for the permission policy on the use, reproduction or distribution of this material.
dc.title: Model re-training for dynamic graphs
dc.contributor.committeeMember: George, K. M.
dc.contributor.committeeMember: Mayfield, Blayne
osu.filename: puram_okstate_0664m_17843.pdf
osu.accesstype: Open Access
dc.type.genre: Thesis
dc.type.material: Text
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Oklahoma State University

