ShanghaiTech University Knowledge Management System
Stable and Efficient Shapley Value-Based Reward Reallocation for Multi-Agent Reinforcement Learning of Autonomous Vehicles | |
2022 | |
Proceedings title | PROCEEDINGS - IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION |
ISSN | 1050-4729 |
Pages | 8765-8771 |
Publication status | Published |
DOI | 10.1109/ICRA46639.2022.9811626 |
Abstract | With the development of sensing and communication technologies in networked cyber-physical systems (CPSs), multi-agent reinforcement learning (MARL)-based methodologies are integrated into the control process of physical systems and demonstrate prominent performance in a wide array of CPS domains, such as connected autonomous vehicles (CAVs). However, it remains challenging to mathematically characterize the performance improvement that communication and cooperation capabilities bring to CAVs. When each individual autonomous vehicle is originally self-interested, we cannot assume that all agents will cooperate naturally during the training process. In this work, we propose to reallocate the system's total reward efficiently to motivate stable cooperation among autonomous vehicles. We formally define and quantify how to reallocate the system's total reward to each agent under the proposed transferable utility game, such that communication-based cooperation among multiple agents increases the system's total reward. We prove that the Shapley value-based reward reallocation of MARL lies in the core if the transferable utility game is a convex game. Hence, the cooperation is stable and efficient, and the agents should stay in the coalition (the cooperating group). We then propose a cooperative policy learning algorithm with Shapley value reward reallocation. In experiments, we show that our proposed algorithm improves the mean episode system reward of CAV systems compared with several algorithms from the literature. © 2022 IEEE. |
Proceedings editor / Conference organizer | IEEE ; IEEE Robotics and Automation Society (RA) |
Keywords | Autonomous agents ; Autonomous vehicles ; Embedded systems ; Fertilizers ; Game theory ; Learning algorithms ; Learning systems ; Multi agent systems ; Networked control systems ; Vehicle to vehicle communications ; Autonomous Vehicles ; Communication technology ; Control process ; Multi-agent reinforcement learning ; Networked cyber-physical systems ; Performance ; Sensing technology ; Shapley value ; Transferable utility games ; Value-based |
Conference name | 39th IEEE International Conference on Robotics and Automation, ICRA 2022 |
Conference location | Philadelphia, PA, United States |
Conference date | May 23, 2022 - May 27, 2022 |
Indexed by | EI |
Language | English |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
EI accession number | 20223312572868 |
EI main heading | Reinforcement learning |
EI classification codes | 432 Highway Transportation ; 716.3 Radio Systems and Equipment ; 723.4 Artificial Intelligence ; 723.4.2 Machine Learning ; 731.1 Control Systems ; 731.2 Control System Applications ; 731.6 Robot Applications ; 804 Chemical Products Generally ; 821.2 Agricultural Chemicals ; 922.1 Probability Theory |
Original document type | Conference article (CA) |
Source database | IEEE |
Document type | Conference paper |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/223061 |
Collection | School of Information Science and Technology ; School of Information Science and Technology_Master's Students |
Author affiliations | 1. Department of Computer Science and Engineering, University of Connecticut, Storrs Mansfield, CT, USA 2. School of Information Science and Technology, ShanghaiTech University, Shanghai, China 3. Electrical and Computer Engineering Department, University of California, San Diego, La Jolla, CA, USA |
Recommended citation (GB/T 7714) | Songyang Han, He Wang, Sanbao Su, et al. Stable and Efficient Shapley Value-Based Reward Reallocation for Multi-Agent Reinforcement Learning of Autonomous Vehicles[C]//IEEE, IEEE Robotics and Automation Society (RA): Institute of Electrical and Electronics Engineers Inc., 2022: 8765-8771. |
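The abstract's central claim (a Shapley value allocation lies in the core when the transferable utility game is convex) can be illustrated on a toy game. The following is a minimal sketch and is not the paper's algorithm: the player names, the `WEIGHTS` table, and the value function `toy_value` are hypothetical and chosen only because a squared sum of non-negative weights is known to give a convex game. It computes exact Shapley values by averaging marginal contributions over all player orderings, then checks convexity and core membership by brute force.

```python
from itertools import combinations, permutations
from math import factorial

# Hypothetical per-vehicle weights; not taken from the paper.
WEIGHTS = {"car_a": 1, "car_b": 2, "car_c": 3}


def toy_value(coalition):
    """Hypothetical coalition value: squared sum of member weights (convex game)."""
    return float(sum(WEIGHTS[p] for p in coalition) ** 2)


def subsets(players):
    """Yield every coalition (including the empty one) as a frozenset."""
    for size in range(len(players) + 1):
        for combo in combinations(players, size):
            yield frozenset(combo)


def shapley_values(players, v):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += v(joined) - v(coalition)
            coalition = joined
    orderings = factorial(len(players))
    return {p: total / orderings for p, total in totals.items()}


def is_convex(players, v, tol=1e-9):
    """Check v(S | T) + v(S & T) >= v(S) + v(T) for every pair of coalitions."""
    all_sets = list(subsets(players))
    return all(
        v(s | t) + v(s & t) >= v(s) + v(t) - tol
        for s in all_sets
        for t in all_sets
    )


def in_core(players, v, allocation, tol=1e-9):
    """Core test: the allocation is efficient and no coalition gains by leaving."""
    efficient = abs(sum(allocation.values()) - v(frozenset(players))) <= tol
    stable = all(
        sum(allocation[p] for p in s) >= v(s) - tol for s in subsets(players)
    )
    return efficient and stable


players = list(WEIGHTS)
phi = shapley_values(players, toy_value)
print(phi)
print("convex:", is_convex(players, toy_value))
print("in core:", in_core(players, toy_value, phi))
```

Enumerating all orderings costs factorial time in the number of agents, so this brute-force form is only practical for a handful of players; it is meant to make the definitions concrete rather than to suggest how rewards would be computed during training.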