Accelerating Federated Learning with a Global Biased Optimiser
Federated Learning (FL) is a recent development in the field of machine learning that collaboratively trains models without the training data leaving client devices, thereby preserving data privacy. In realistic FL settings, the training set is distributed over clients in a highly non-Independent and Identically Distributed (non-IID) fashion, which has been shown extensively to harm FL convergence speed and final model performance. To address this challenge, we propose a novel, generalised approach for incorporating adaptive optimisation techniques into FL with the Federated Global Biased Optimiser (FedGBO) algorithm. FedGBO accelerates FL by employing a set of global biased optimiser values during the client-training phase, which helps to reduce 'client-drift' from non-IID data whilst also benefiting from adaptive optimisation. We show that the FedGBO update with a generic optimiser can be reformulated as centralised training using biased gradients and optimiser updates, and apply this theoretical framework to prove the convergence of FedGBO using momentum-Stochastic Gradient Descent (SGDm). We also conduct extensive experiments using 4 realistic FL benchmark datasets (CIFAR100, Sent140, FEMNIST, Shakespeare) and 3 popular adaptive optimisers (RMSProp, SGDm, Adam) to compare the performance of state-of-the-art adaptive-FL algorithms. The results demonstrate that FedGBO has highly competitive performance whilst achieving lower communication and computation costs, and provide practical insights into the trade-offs associated with the different adaptive-FL algorithms and optimisers for real-world FL deployments.
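To illustrate the core idea described above, the following is a minimal sketch of a FedGBO-style client update with SGDm. The abstract does not give the exact update rule, so the function name, signature, and the specific way the fixed global momentum buffer biases each local step are assumptions made purely for illustration, not the authors' definitive algorithm.

```python
import numpy as np

def fedgbo_client_update(x_global, m_global, batches, grad_fn,
                         lr=0.1, beta=0.9, local_steps=10):
    """Hypothetical sketch of a FedGBO-style client update with SGDm.

    Assumption (not stated in the abstract): every client applies the
    *fixed* global momentum buffer `m_global` at each local step instead
    of maintaining its own momentum statistics, which is one way a set of
    global biased optimiser values could reduce client-drift.
    """
    x = x_global.copy()
    for step in range(local_steps):
        batch = batches[step % len(batches)]
        g = grad_fn(x, batch)            # local stochastic gradient
        update = beta * m_global + g     # bias the local step with global momentum
        x = x - lr * update
    return x - x_global                  # model delta returned to the server
```

Under this sketch, the server would aggregate the returned deltas to update both the global model and the global momentum buffer before the next communication round; because clients never adapt the optimiser statistics locally, they incur no extra per-round communication for them.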