Veras, RichardAbdelaal, Khaled2024-04-252024-04-252024-05-10https://hdl.handle.net/11244/340238Sparsity manifests itself in a multitude of modern High-Performance Computing applications including graph neural networks, network analytics, and scientific computing. In sparse matrices, the majority of values are zeros. Traditional methods of storing and processing dense data are unsuitable for the new nature of sparse data, as they end up wasting storage and compute on zeros. Hence, a variety of sparse data formats that store only the non-zero elements were proposed in literature to provide a compact representation of sparse data. Performance of operations on sparse data mainly depends on the sparse data format used for storing the data, as the algorithm needs to closely match the sparse data format. However, choosing the optimal sparse data format for the input sparse matrix is non-trivial, as the optimal format depends on the sparsity pattern of the input sparse matrix. For example, in sparse matrix-vector multiplication (SpMV), for the same input sparse matrix, using different sparse data formats can yield highly variant performance. The best format being the one that closely matches how the non-zeros are arranged within the matrix. Additionally, performance prediction for operations involving sparse matrices is not as straightforward as it used to be for the dense case. For dense computations, dimensions and strides suffice for performance predictions as they provide a sense of the number of floating point operations (FLOPs) to be performed, and how this number compares to the architecture properties (peak FLOPs, number of processing elements, etc.). On the other hand, sparse matrix dimensions do not directly convey useful information about the total number of operations to be performed, since the majority of elements are zeros and do not contribute to the total number of FLOPs. Moreover, existing work on sparse operations optimizations mainly depends on a discrete set of matrices, limiting the ability to generalize observations. To address these challenges, we identify the sparsity pattern as the main driving factor for performance. First, we propose an extensible classifier framework to automatically identify the sparsity pattern of the input sparse matrix. This framework uses graph neural networks (GNNs) by representing the input sparse matrix as a graph, and then learning the structural relationship between nodes in the matrix (graph). Our framework achieves up to 98% classification accuracy on full graphs, the same accuracy for scrambled matrices, and 92% accuracy for small random subsamples taken out of original graphs. Second, we use graph models as a proxy to generate large-scale synthetic sparse matrices. We propose another modular framework to study the correlation between the graph model parameters, and the structure of the resulting graph and its tolerance to noise during the generation step. Third, we also use graph models for a performance evaluation framework that can assist in finding the best sparse format for a given graph model on a given architecture, utilizing the graph model parameters as a representative set of features to predict performance, and providing a more robust way of visualizing sparse matrix operations performance. This framework also takes into consideration noise in the matrix generation step, and evaluates the extent of noise to which performance can still be predictable based on the graph model parameters. Our results show that sparse computations need richer models to categorize sparse matrices in terms of structure, study the sensitivity of such models even for one structure, and tie performance to a more descriptive set of parameters.Computer ScienceHigh-Performance ComputingSparse Linear AlgebraPerformance Analysis and OptimizationGraph WorkloadsBridging the Gap between Sparse Matrix Computation and Graph Models