TU Darmstadt / ULB / TUprints

Learning Mean-Field Control for Delayed Information Load Balancing in Large Queuing Systems

Tahir, Anam ; Cui, Kai ; Koeppl, Heinz (2024)
Learning Mean-Field Control for Delayed Information Load Balancing in Large Queuing Systems.
51st International Conference on Parallel Processing (ICPP ’22). Bordeaux, France (29.08. - 01.09.2022)
doi: 10.26083/tuprints-00026518
Conference or Workshop Item, Secondary publication, Publisher's Version

Copyright Information: CC BY 4.0 International - Creative Commons, Attribution.

Item Type: Conference or Workshop Item
Type of entry: Secondary publication
Title: Learning Mean-Field Control for Delayed Information Load Balancing in Large Queuing Systems
Language: English
Date: 5 February 2024
Place of Publication: Darmstadt
Year of primary publication: 2023
Place of primary publication: New York, NY, USA
Publisher: Association for Computing Machinery
Book Title: Proceedings of the 51st International Conference on Parallel Processing
Collation: 11 unnumbered pages
Event Title: 51st International Conference on Parallel Processing (ICPP ’22)
Event Location: Bordeaux, France
Event Dates: 29.08. - 01.09.2022
DOI: 10.26083/tuprints-00026518
Corresponding Links:
Origin: Secondary publication service

Recent years have seen a great increase in the capacity and parallel processing power of data centers and cloud services. To fully utilize such distributed systems, optimal load balancing across parallel queuing architectures must be realized. Existing state-of-the-art solutions fail to consider the effect of communication delays on the behaviour of very large systems with many clients. In this work, we consider a multi-agent load balancing system with delayed information, consisting of many clients (load balancers) and many parallel queues. To obtain a tractable solution, we model this system as a mean-field control problem with an enlarged state-action space in discrete time through exact discretization. Subsequently, we apply policy gradient reinforcement learning algorithms to find an optimal load balancing solution. Here, the discrete-time system model incorporates a synchronization delay under which queue state information is synchronously broadcast to and updated at all clients. We then provide theoretical performance guarantees for our methodology in large systems. Finally, our experiments demonstrate that our approach is not only scalable but also performs well compared to the state-of-the-art power-of-d variant of Join-the-Shortest-Queue (JSQ) and other policies in the presence of synchronization delays.
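To illustrate the baseline setting the abstract refers to, the following is a minimal sketch of a power-of-d (JSQ(d)) load balancer operating on stale queue information that is synchronously refreshed every few time steps. All names, parameters, and the Bernoulli arrival/service dynamics here are illustrative assumptions, not the paper's actual model or learned policy:

```python
import random

def sample_d_shortest(stale_lengths, d, rng=random):
    """Power-of-d choice: sample d queues uniformly at random and join the
    one whose *stale* (delayed) length is smallest."""
    candidates = rng.sample(range(len(stale_lengths)), d)
    return min(candidates, key=lambda i: stale_lengths[i])

def simulate(num_queues=10, d=2, steps=1000, sync_period=5,
             arrival_p=0.7, service_p=0.9, seed=0):
    """Toy discrete-time simulation: one client routes Bernoulli arrivals
    using queue-state snapshots broadcast every `sync_period` steps."""
    rng = random.Random(seed)
    queues = [0] * num_queues
    stale = list(queues)               # last broadcast snapshot
    total_backlog = 0
    for t in range(steps):
        if t % sync_period == 0:       # synchronous broadcast to all clients
            stale = list(queues)
        if rng.random() < arrival_p:   # a job arrives this step
            q = sample_d_shortest(stale, d, rng)
            queues[q] += 1             # routed using stale information
        for i in range(num_queues):    # Bernoulli service at each queue
            if queues[i] > 0 and rng.random() < service_p:
                queues[i] -= 1
        total_backlog += sum(queues)
    return total_backlog / steps       # average total backlog
```

Under this toy model, increasing `sync_period` makes the snapshot staler, which is the kind of degradation the paper's mean-field control formulation is designed to handle.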

Uncontrolled Keywords: load balancing, parallel systems, mean-field control, reinforcement learning
Identification Number: Article no. 42
Status: Publisher's Version
URN: urn:nbn:de:tuda-tuprints-265180
Classification DDC: 000 Generalities, computers, information > 004 Computer science
600 Technology, medicine, applied sciences > 621.3 Electrical engineering, electronics
Divisions: 18 Department of Electrical Engineering and Information Technology > Self-Organizing Systems Lab
Date Deposited: 05 Feb 2024 10:55
Last Modified: 09 Feb 2024 07:33
URI: https://tuprints.ulb.tu-darmstadt.de/id/eprint/26518
PPN: 515356379