Approximately Solving Mean Field Games via Entropy-Regularized Deep Reinforcement Learning

Cui, Kai; Koeppl, Heinz (2022)
Approximately Solving Mean Field Games via Entropy-Regularized Deep Reinforcement Learning.
24th International Conference on Artificial Intelligence and Statistics (AISTATS) 2021. Virtual (13.04.2021-15.04.2021)
doi: 10.26083/tuprints-00021511
Conference or Workshop Item, Secondary publication, Publisher's Version

Files:
cui21a.pdf (750 kB), License: CC BY 4.0 International (Creative Commons, Attribution)
cui21a-supp.pdf (1 MB), License: CC BY 4.0 International (Creative Commons, Attribution)

Item Type: Conference or Workshop Item
Type of entry: Secondary publication
Title: Approximately Solving Mean Field Games via Entropy-Regularized Deep Reinforcement Learning
Language: English
Date: 2022
Place of Publication: Darmstadt
Year of primary publication: 2021
Publisher: PMLR
Book Title: Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
Series: Proceedings of Machine Learning Research
Series Volume: 130
Event Title: 24th International Conference on Artificial Intelligence and Statistics (AISTATS) 2021
Event Location: Virtual
Event Dates: 13.04.2021-15.04.2021
DOI: 10.26083/tuprints-00021511
Origin: Secondary publication service
Abstract:

The recent mean field game (MFG) formalism facilitates otherwise intractable computation of approximate Nash equilibria in many-agent settings. In this paper, we consider discrete-time finite MFGs subject to finite-horizon objectives. We show that all discrete-time finite MFGs with non-constant fixed point operators fail to be contractive, contrary to what is typically assumed in the existing MFG literature, barring convergence via fixed point iteration. Instead, we incorporate entropy regularization and Boltzmann policies into the fixed point iteration. As a result, we obtain provable convergence to approximate fixed points where existing methods fail, thereby reaching the original goal of approximate Nash equilibria. All proposed methods are evaluated with respect to their exploitability, both on instructive examples with tractable exact solutions and on high-dimensional problems where exact methods become intractable. In the high-dimensional scenarios, we apply established deep reinforcement learning methods and empirically combine fictitious play with our approximations.
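
To make the approach concrete, the following is a minimal Python sketch of the entropy-regularized fixed point iteration on a hypothetical two-state crowd-aversion game. The game, its dynamics, the reward, and the temperature are illustrative assumptions, not taken from the paper: a Boltzmann policy is computed against the current mean field by soft backward induction, the induced mean field is propagated forward, and the two steps are alternated.

    import numpy as np

    # Hypothetical toy MFG for illustration: 2 states, 2 actions
    # ("move to state a"), finite horizon T. Not a benchmark from the paper.
    n_states, n_actions, T = 2, 2, 10
    tau = 1.0  # entropy-regularization temperature (large enough to contract here)

    def transition(a):
        # Deterministic dynamics: action a moves the agent to state a.
        p = np.zeros(n_states)
        p[a] = 1.0
        return p

    def reward(s, a, mu_t):
        # Crowd aversion: moving into a crowded state is penalized.
        return -mu_t[a]

    def soft_best_response(mu):
        # Entropy-regularized backward induction yielding a Boltzmann policy.
        pi = np.zeros((T, n_states, n_actions))
        V = np.zeros(n_states)  # terminal value
        for t in reversed(range(T)):
            Q = np.zeros((n_states, n_actions))
            for s in range(n_states):
                for a in range(n_actions):
                    Q[s, a] = reward(s, a, mu[t]) + transition(a) @ V
            # Boltzmann (softmax) policy; max-shift for numerical stability.
            pi[t] = np.exp((Q - Q.max(axis=1, keepdims=True)) / tau)
            pi[t] /= pi[t].sum(axis=1, keepdims=True)
            # Soft value: temperature-scaled log-sum-exp of the Q-values.
            V = tau * np.log(np.exp(Q / tau).sum(axis=1))
        return pi

    def mean_field(pi, mu0):
        # Forward propagation of the state marginals under policy pi.
        mu = np.zeros((T + 1, n_states))
        mu[0] = mu0
        for t in range(T):
            for s in range(n_states):
                for a in range(n_actions):
                    mu[t + 1] += mu[t, s] * pi[t, s, a] * transition(a)
        return mu

    # Entropy-regularized fixed point iteration: policy and mean field
    # are updated alternately until (approximate) convergence.
    mu = np.tile([1.0, 0.0], (T + 1, 1))  # all agents start in state 0
    for _ in range(50):
        pi = soft_best_response(mu[:T])
        mu = mean_field(pi, mu[0])
    print("mean field at final time:", mu[-1])  # close to [0.5, 0.5] by symmetry

In this toy example, removing the regularization (tau -> 0) makes the same loop alternate the mean field between the two states from one iteration to the next instead of converging, while a sufficiently large temperature renders the composed operator contractive at the cost of some bias. This mirrors the trade-off the abstract describes: provable convergence to approximate, rather than exact, fixed points.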

Status: Publisher's Version
URN: urn:nbn:de:tuda-tuprints-215111
Classification DDC: 000 Generalities, computers, information > 004 Computer science
500 Science and mathematics > 510 Mathematics
600 Technology, medicine, applied sciences > 620 Engineering and machine engineering
Divisions: 18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Bioinspired Communication Systems
18 Department of Electrical Engineering and Information Technology > Self-Organizing Systems Lab
TU-Projects: HMWK|III L6-519/03/05.001-(0016)|emergenCity TP Bock
Date Deposited: 20 Jul 2022 13:34
Last Modified: 12 Apr 2023 07:25
URI: https://tuprints.ulb.tu-darmstadt.de/id/eprint/21511
PPN: 497909375