Distributed randomized block stochastic gradient tracking methods: Rate analysis and numerical experiments
Abstract
Distributed optimization has been an active area of research over the past few decades, driven largely by recent advances in wireless sensor technology and by emerging applications in machine learning. Traditionally, optimization problems were addressed using centralized schemes in which all of the data is assumed to be available in one place. The main reasons motivating distributed implementations include: (i) the collected data may not be available at a centralized location, (ii) the privacy of the agents' data must be preserved, and (iii) data processors have limited memory and computational power. To address these challenges, distributed optimization provides a framework in which agents (e.g., data processors or sensors) communicate their local information with one another over a network and cooperatively seek to minimize a global objective function. In some applications, the data may have a huge sample size or a large number of attributes; problems involving such data are often known as big data problems. In this thesis, our goal is to address such high-dimensional distributed optimization problems, where the computation of the local gradient mappings may become expensive. Recently, a distributed optimization algorithm, called Distributed Stochastic Gradient Tracking (DSGT), has been developed for addressing possibly large-scale problems by considering stochasticity. We develop a novel iterative method called Distributed Randomized Block Stochastic Gradient Tracking (DRBSGT), a randomized block variant of the existing DSGT method. We derive new non-asymptotic convergence rates of the order 1/k and 1/k^2 in terms of an optimality metric and a consensus violation metric, respectively.
Importantly, while block-coordinate schemes have been studied for distributed optimization problems before, the proposed algorithm appears to be the first randomized block-coordinate gradient tracking method equipped with such non-asymptotic convergence rate statements. We validate the performance of the proposed method on the MNIST data set and a synthetic data set under different network settings. A potential future research direction is to extend the results of this thesis to an asynchronous variant of the proposed method, which would allow for the consideration of communication delays.
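To make the idea concrete, the following is a minimal sketch of one iteration of a randomized block variant of gradient tracking. It follows the standard DSGT recursion, with each agent refreshing its local gradient estimate only on a randomly sampled block of coordinates; all names and parameters (W, alpha, grad_fn, block_size) are illustrative assumptions, and the thesis's exact DRBSGT update may differ.

```python
import numpy as np

def drbsgt_step(x, y, g_prev, W, alpha, grad_fn, rng, block_size):
    """One synchronous iteration across all agents (illustrative sketch).

    x, y, g_prev : (n, d) arrays of iterates, gradient trackers, and the
                   gradient estimates used at the previous iteration.
    W            : (n, n) doubly stochastic mixing matrix of the network.
    """
    n, d = x.shape
    x_new = W @ (x - alpha * y)          # mix with neighbors, step along tracker
    g_new = g_prev.copy()
    for i in range(n):
        # each agent refreshes only a randomly sampled block of coordinates
        block = rng.choice(d, size=block_size, replace=False)
        g_new[i, block] = grad_fn(i, x_new[i])[block]
    y_new = W @ y + (g_new - g_prev)     # gradient tracking recursion
    return x_new, y_new, g_new

# Toy decentralized least-squares instance on a 4-agent ring (hypothetical demo).
rng = np.random.default_rng(0)
n, d, m = 4, 10, 30
A = rng.standard_normal((n, m, d))
b = rng.standard_normal((n, m))

def grad_fn(i, xi):
    return A[i].T @ (A[i] @ xi - b[i])

# Symmetric doubly stochastic ring: 1/2 self-weight, 1/4 to each neighbor.
W = np.eye(n) / 2
for i in range(n):
    W[i, (i - 1) % n] += 0.25
    W[i, (i + 1) % n] += 0.25

# Global minimizer of (1/2) * sum_i ||A_i x - b_i||^2, for reference.
x_star = np.linalg.solve(
    sum(A[i].T @ A[i] for i in range(n)),
    sum(A[i].T @ b[i] for i in range(n)),
)

x = rng.standard_normal((n, d))
g = np.stack([grad_fn(i, x[i]) for i in range(n)])
y = g.copy()                             # initialize trackers at local gradients

err_start = np.linalg.norm(x - x_star, axis=1).mean()
for _ in range(1500):
    x, y, g = drbsgt_step(x, y, g, W, 0.005, grad_fn, rng, block_size=5)
err_end = np.linalg.norm(x - x_star, axis=1).mean()

# With y initialized at g, the recursion preserves sum_i y_i = sum_i g_i exactly.
tracking_gap = np.linalg.norm(y.sum(axis=0) - g.sum(axis=0))
```

On this toy quadratic, the agents' iterates approach both consensus and the global minimizer, and the averaging property of the trackers holds by construction; the refresh cost per iteration scales with `block_size` rather than the full dimension, which is the intended saving in high-dimensional settings.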
Collections
- OSU Theses