Show simple item record

dc.contributor.advisorRadhakrishnan, Sridhar
dc.contributor.authorGopal Krishna, Sudhindra
dc.date.accessioned2023-08-14T18:53:28Z
dc.date.available2023-08-14T18:53:28Z
dc.date.issued2023-08-04
dc.identifier.urihttps://hdl.handle.net/11244/338855
dc.description.abstractA graph G = (V, E) is an ordered tuple where V is a non-empty set of elements called vertices (nodes), and E is a set of an unordered pair of elements called links (edges), and a time-evolving graph is a change in the states of the edges over time. With the growing popularity of social networks and the massive influx of users, it is becoming a challenging task to store the network/graph and process them as fast as possible before the property of the graph changes with the graph evolution. Graphs or networks are a collection of entities (individuals in a social network) and their relationships (friends, followers); ways to represent a graph can help how the information could be extracted. The increase in the number of users increases the relationship the user has, which makes the graphs massive and nearly impossible to store them in friendly structures such as a matrix or an adjacency list. Therefore, an exciting area of research is storing these massive graphs with a smaller memory footprint and processing with very little extra memory. But there is always a trade-off with time and space; to get a small memory footprint, one has to remove the redundancy rigorously, which consumes time. In the same way, when traversing these tight spaces, the time required to query also increases compared to a matrix or an adjacency list. In this dissertation, we provide the encoding technique to store the arrays in the Compressed Sparse Row (CSR) data structure and extend the encoding to store time-evolving graphs in the form of a CSR. We also propose combinations of two structures (CSR + CBT) to store the time-evolving graphs and to improve the time and space trade-off. Encoding also enables one to access a node without decompressing the entire structure, which means that the data structure can be accessed. We then provide four ways to store multi-dimensional data, which represents intricate relations within the social network. Once the data are stored in compressed format, it is important to provide algorithms that support the structures. One such computation which is the basis for any graph algorithm is matrix-multiplication. We now extend our work to perform value-based matrix multiplication on compressed structures. We test our algorithm on extremely large matrices, in the order of 100s of millions with various levels of sparsity. Using matrix-matrix multiplication and keeping the theme of storing the data in small spaces, we propose another way of compression is through the dimensionality reduction, which is referred to as Matrix Factorization. Performing any of these operations on a compressed structure without decompressing would be time consuming. Therefore, in this dissertation, we introduce a parallel technique to construct the graph and also run a list of queries using the querying algorithms, such as fetching neighbor or edge existence in parallel. We also extend our work to propose parallel time-evolving differential compression of CSR using the prefix sum approach.en_US
dc.languageen_USen_US
dc.subjectData Compressionen_US
dc.subjectTensorsen_US
dc.subjectTime-Evolving Graphsen_US
dc.subjectMatrix Operationsen_US
dc.subjectParallel Computationen_US
dc.titleCompression techniques for extreme-scale graphs and matrices: sequential and parallel algorithmsen_US
dc.contributor.committeeMemberNicholson, Charles
dc.contributor.committeeMemberAntonio, John
dc.contributor.committeeMemberPan, Chongle
dc.date.manuscript2023-07-28
dc.thesis.degreePh.D.en_US
ou.groupGallogly College of Engineering::School of Computer Scienceen_US
shareok.orcid0000-0002-2433-5855en_US


Files in this item

Thumbnail
Thumbnail

This item appears in the following Collection(s)

Show simple item record