2020
Secondary publication
Article
Postprint

Generalized Cost-Based Job Scheduling in Very Large Heterogeneous Cluster Systems

File(s)
Main publication
cbsTpds.pdf
Protected by copyright
Format: Adobe PDF
Size: 4.83 MB
TUDa URI
tuda/8945
URN
urn:nbn:de:tuda-tuprints-216679
DOI
10.26083/tuprints-00021667
Authors
Khuda Bukhsh, Wasiur R. ORCID 0000-0003-1803-0470
Kar, Sounak ORCID 0000-0002-0989-8596
Alt, Bastian ORCID 0000-0002-1522-5400
Rizk, Amr
Koeppl, Heinz ORCID 0000-0002-8305-9379
Abstract

We study job assignment in large, heterogeneous resource-sharing clusters of servers with finite buffers. This load balancing problem arises naturally in today's communication and big data systems, such as Amazon Web Services, Network Service Function Chains, and Stream Processing. Arriving jobs are dispatched to a server, following a load balancing policy that optimizes a performance criterion such as job completion time. Our contribution is a randomized Cost-Based Scheduling (CBS) policy in which the job assignment is driven by general cost functions of the server queue lengths. Beyond existing schemes, such as the Join the Shortest Queue (JSQ), the power of d or the SQ(d) and the capacity-weighted JSQ, the notion of CBS yields new application-specific policies such as hybrid locally uniform JSQ. As today's data center clusters have thousands of servers, exact analysis of CBS policies is tedious. In this article, we derive a scaling limit when the number of servers grows large, facilitating a comparison of various CBS policies with respect to their transient as well as steady state behavior. A byproduct of our derivations is the relationship between the queue filling proportions and the server buffer sizes, which cannot be obtained from infinite buffer models. Finally, we provide extensive numerical evaluations and discuss several applications including multi-stage systems.
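The dispatching idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' definition of CBS: the function names (`dispatch`, `queue_length_cost`, `capacity_weighted_cost`), the rule of sampling `d` servers uniformly, the uniform tie-breaking, and the choice to drop a job when every sampled server has a full buffer are all assumptions made for illustration. Under these assumptions, a cost equal to the queue length with `d` equal to the number of servers behaves like JSQ, the same cost with a smaller `d` behaves like SQ(d), and dividing the queue length by a hypothetical per-server rate gives a capacity-weighted variant.

```python
import random
from typing import Callable, List, Optional, Sequence

# Cost function signature (assumed): (server index, queue length) -> cost.
CostFn = Callable[[int, int], float]


def queue_length_cost(server: int, queue_length: int) -> float:
    """Cost equal to the queue length (JSQ / SQ(d) style)."""
    return float(queue_length)


def capacity_weighted_cost(rates: Sequence[float]) -> CostFn:
    """Cost equal to queue length divided by a hypothetical service rate."""
    return lambda server, queue_length: queue_length / rates[server]


def dispatch(
    queues: List[int],
    buffers: Sequence[int],
    cost: CostFn,
    d: int,
    rng: random.Random,
) -> Optional[int]:
    """Assign one arriving job; return the chosen server or None if dropped."""
    # Sample d distinct servers uniformly at random (assumption).
    candidates = rng.sample(range(len(queues)), d)
    # Only servers with free buffer space can accept the job.
    feasible = [i for i in candidates if queues[i] < buffers[i]]
    if not feasible:
        return None  # all sampled buffers are full: the job is dropped (assumption)
    costs = {i: cost(i, queues[i]) for i in feasible}
    best = min(costs.values())
    # Break ties uniformly at random among the minimum-cost servers.
    choice = rng.choice([i for i in feasible if costs[i] == best])
    queues[choice] += 1
    return choice
```

Calling `dispatch` once per arriving job with a shared `queues` list simulates the arrival side only; service completions would have to decrement the queue lengths separately.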

Keywords

Job Scheduling

performance evaluatio...

mean-field limit

Language
English
Department / Field
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Bioinspired Communication Systems
18 Department of Electrical Engineering and Information Technology > Self-Organizing Systems Lab
DDC
000 General works, computer science, information science > 004 Computer science
500 Natural sciences and mathematics > 530 Physics
Institution
Universitäts- und Landesbibliothek Darmstadt
Place
Darmstadt
Journal / series title
IEEE Transactions on Parallel and Distributed Systems
Journal volume
31
Journal issue number
11
ISSN
1045-9219
Publisher
IEEE
Year of first publication
2020
Publisher DOI
10.1109/TPDS.2020.2997771
PPN
506774325
