Distributed Artificial Intelligence: Second International Conference, DAI 2020, Nanjing, China, October 24–27, 2020, Proceedings

This book constitutes the refereed proceedings of the Second International Conference on Distributed Artificial Intelligence, DAI 2020, held in Nanjing, China, in October 2020.

The 9 full papers presented in this book were carefully reviewed and selected from 22 submissions. DAI aims to bring together international researchers and practitioners from related areas, including general AI, multiagent systems, distributed learning, and computational game theory, to provide a single, high-profile, internationally renowned forum for research on the theory and practice of distributed AI.

Due to the COVID-19 pandemic, the conference was held virtually.

Editor(s): Matthew E. Taylor, Yang Yu, Edith Elkind, Yang Gao
Series: Lecture Notes in Computer Science, 12547
Publisher: Springer
Year: 2020

Language: English
Pages: 141
City: Cham

Preface
Organization
Contents
Parallel Algorithm for Nash Equilibrium in Multiplayer Stochastic Games with Application to Naval Strategic Planning
1 Introduction
2 Hostility Game
3 Algorithm
4 Experiments
5 Conclusion
References
LAC-Nav: Collision-Free Multiagent Navigation Based on the Local Action Cells
1 Introduction
2 The Local Action Cells
3 Collision-Free Navigation
4 Experiments
5 Discussions
References
MGHRL: Meta Goal-Generation for Hierarchical Reinforcement Learning
1 Introduction
2 Related Work
3 Preliminaries
4 Algorithm
4.1 Two-Level Hierarchy
4.2 Meta Goal-Generation for Hierarchical Reinforcement Learning
5 Experiments
5.1 Environmental Setup
5.2 Results
6 Discussion and Future Work
References
D3PG: Decomposed Deep Deterministic Policy Gradient for Continuous Control
1 Introduction
2 Background
2.1 Reinforcement Learning (RL)
2.2 Deep Deterministic Policy Gradient (DDPG)
3 The D3PG Algorithm for Robotic Control
3.1 Structural Decomposition
3.2 The PCG Method
3.3 The D3PG Algorithm
4 Experiment
5 Related Work
6 Conclusions
A Appendix
A.1 Appendix
A.2 MuJoCo Platform
References
Lyapunov-Based Reinforcement Learning for Decentralized Multi-agent Control
1 Introduction
2 Preliminaries
2.1 Networked Markov Game
2.2 Soft Actor-Critic Algorithm
2.3 Lyapunov Stability in Control Theory
3 Multi-agent Reinforcement Learning with Lyapunov Stability Constraint
3.1 Multi-agent Soft Actor-Critic Algorithm
3.2 Lyapunov Stability Constraint
4 Experiment
5 Conclusion
References
Hybrid Independent Learning in Cooperative Markov Games
1 Introduction
2 Theoretical Framework
2.1 Markov Games
2.2 Policies and Nash Equilibria
2.3 Q-Learning
3 Hybrid Q-Learning
4 Pathologies in Multi-Agent RL
4.1 Relative Overgeneralization
4.2 The Stochasticity Problem
4.3 Miscoordination
4.4 The Alter-Exploration Problem
5 Independent Learner Baselines
5.1 Independent Q-Learning
5.2 Distributed Q-Learning
5.3 Hysteretic Q-Learning
5.4 LMRL2
5.5 Parameters
6 Experiments
6.1 Climb Games
6.2 Heaven and Hell Game
6.3 Common Interest Game
6.4 Meeting in a Grid
7 Conclusions
References
Efficient Exploration by Novelty-Pursuit
1 Introduction
2 Related Work
3 Background
4 Method
4.1 Selecting Goals from the Experience Buffer
4.2 Training Goal-Conditioned Policy Efficiently
4.3 Exploiting Experience Collected by Exploration Policy
5 Experiment
5.1 Comparison of Exploration Efficiency
5.2 Ablation Study of Training Techniques
5.3 Evaluation on Complicated Environments
6 Conclusion
A Appendix
A.1 Reward Shaping for Training Goal-Conditioned Policy
A.2 Additional Results
A.3 Environment Preprocessing
A.4 Network Architecture
A.5 Hyperparameters
References
Context-Aware Multi-agent Coordination with Loose Couplings and Repeated Interaction
1 Introduction
2 Motivation Scenario
3 Problem Description
4 Algorithms
4.1 Description of MACUCB
4.2 Description of VE
4.3 Extensions
5 Regret Analysis
6 Experiment
6.1 Experiment Setting
6.2 Experimental Results
7 Conclusion
References
Battery Management for Automated Warehouses via Deep Reinforcement Learning
1 Introduction
2 Related Work
3 Motivation Scenario
4 Problem Statement and MDP Formulation
4.1 Battery Management Problem
4.2 MDP Formulation
5 Solving the MDP
5.1 TD3
5.2 Enforcing State Dependent Exploration via Action Regulation Loss
6 Simulator Design
7 Empirical Evaluation
7.1 Experimental Configurations
7.2 Experimental Results
8 Conclusion
References
Author Index