Keynote Talk: Byzantine Robustness and Partial Participation Can Be Achieved Simultaneously: Just Clip Gradient Differences

Abstract:

 Distributed learning has emerged as a leading paradigm for training large machine learning models. However, in real-world scenarios, participants may be unreliable or malicious, posing a significant challenge to the integrity and accuracy of the trained models. Byzantine fault tolerance mechanisms have been proposed to address these issues, but they often assume full participation from all clients, which is not always practical due to the unavailability of some clients or communication constraints. In our work, we propose the first distributed method with client sampling and provable tolerance to Byzantine workers. The key idea behind the developed method is the use of gradient clipping to control stochastic gradient differences in recursive variance reduction. This allows us to bound the potential harm caused by Byzantine workers, even during iterations when all sampled clients are Byzantine. Furthermore, we incorporate communication compression into the method to enhance communication efficiency. Under quite general assumptions, we prove convergence rates for the proposed method that match the existing state-of-the-art (SOTA) theoretical results.
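To make the key idea concrete, below is a minimal, illustrative sketch of clipping gradient differences before folding them into a recursive variance-reduced estimate. It is not the speaker's exact algorithm: the worker interface, the clipping level `tau`, and the `robust_agg` aggregator are assumptions introduced only for illustration.

```python
import numpy as np

def clip(v, tau):
    """Scale v so its norm is at most tau (no-op if it is already small enough)."""
    norm = np.linalg.norm(v)
    return v if norm <= tau else v * (tau / norm)

def aggregate_round(g_prev, sampled_workers, w, w_prev, tau, robust_agg):
    """One illustrative round of a variance-reduced update with clipped differences.

    g_prev          -- server's current gradient estimate
    sampled_workers -- callables returning a stochastic gradient at a point
                       (honest or Byzantine; the server cannot tell which)
    tau             -- clipping level bounding the influence of any single message
    robust_agg      -- any robust aggregator over a list of vectors (hypothetical)
    """
    # Each sampled worker reports a (possibly adversarial) stochastic gradient difference.
    diffs = [worker(w) - worker(w_prev) for worker in sampled_workers]
    # Clipping bounds the norm of every reported difference, so even a round in which
    # all sampled workers are Byzantine can shift the estimate by at most tau
    # after aggregation.
    clipped = [clip(d, tau) for d in diffs]
    return g_prev + robust_agg(clipped)
```

For example, `robust_agg` could be a coordinate-wise median, `lambda vs: np.median(np.stack(vs), axis=0)`; the abstract's communication-compression component is omitted here for brevity.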

Dates

Abstract submission deadline: February 29, 2024 (extended to March 11, 2024)

Paper submission deadline: March 7, 2024 (extended to March 18, 2024)

Accept/Reject notification: April 22, 2024

Camera-ready copy due: May 12, 2024

METIS Spring School: May 27-28, 2024

NETYS Conference: May 29-31, 2024

Proceedings

Revised selected papers will be published as post-proceedings in Springer's Lecture Notes in Computer Science (LNCS) series.

Partners & Sponsors