ShanghaiTech University Knowledge Management System
Constrained Dueling Bandits for Edge Intelligence | |
2024 | |
发表期刊 | IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING (IF:6.7[JCR-2023],6.0[5-Year]) |
ISSN | 2334-329X |
EISSN | 2327-4697 |
卷号 | PP期号:99页码:1126-1136 |
发表状态 | 已发表 |
DOI | 10.1109/TNSE.2024.3524362 |
摘要 | Bandit is acknowledged as a classical analytic tool for the online decision-making problem under uncertainty, e.g., task assignment for crowdsourcing systems given the unknown reliability of workers. In the conventional setup, an agent selects from a set of arms across rounds to balance the exploitation-exploration tradeoff using quantitive reward feedback. Despite bandits' popularity, their practical implementation may run into concerns like 1) obtaining the quantitive reward is a non-trivial problem, e.g., evaluating workers' completion quality (reward) requires domain experts to set up metrics; 2) mismatch between the budgeted agent and costs for selecting arms, e.g., the crowdsourcing platform (agent) should offer payments (cost) to workers to complete tasks. To address such concerns, 1) we employ dueling bandits to learn the uncertainties via qualitative pairwise comparisons rather than quantitive rewards, e.g., whether a worker performs better on the assigned task than the other; 2) we utilize online control to guarantee a within-budget cost while selecting arms. By integrating online learning and online control, we propose a Constrained Two-Dueling Bandit (CTDB) algorithm. We prove that CTDB achieves a $O(1/V + \sqrt{\log T / T})$ round-averaged regret over the horizon $T$ while keeping a budgeted cost where $V$ is a constant parameter balancing the tradeoff between regret minimization and constraint satisfaction. We conduct extensive simulations with synthetic and real-world datasets to demonstrate the outperformance of CTDB over baselines. |
关键词 | Constrained optimization - Costs - Crowdsourcing - Decision making Bandit learning - Budget constraint - Dueling bandit - Long-term budget constraint - Lyapunov optimization - On-line controls - Optimisations - Quantitive - Uncertainty - Workers' |
URL | 查看原文 |
收录类别 | EI |
语种 | 英语 |
出版者 | IEEE Computer Society |
EI入藏号 | 20250217656615 |
EI主题词 | Budget control |
EI分类号 | 1201.7 Optimization Techniques - 911 Cost and Value Engineering ; Industrial Economics - 912.2 Management - 971 Social Sciences |
原始文献类型 | Journal article (JA) |
来源库 | IEEE |
文献类型 | 期刊论文 |
条目标识符 | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/467855 |
专题 | 信息科学与技术学院 信息科学与技术学院_PI研究组_邵子瑜组 |
作者单位 | 1.Viterbi School of Engineering, University of Southern California (USC), Los Angeles, CA, USA 2.School of Information Science and Technology, ShanghaiTech University, Shanghai, China 3.Hong Kong University of Science and Technology (Guangzhou), China |
推荐引用方式 GB/T 7714 | Shangshang Wang,Ziyu Shao,Yang Yang. Constrained Dueling Bandits for Edge Intelligence[J]. IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING,2024,PP(99):1126-1136. |
APA | Shangshang Wang,Ziyu Shao,&Yang Yang.(2024).Constrained Dueling Bandits for Edge Intelligence.IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING,PP(99),1126-1136. |
MLA | Shangshang Wang,et al."Constrained Dueling Bandits for Edge Intelligence".IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING PP.99(2024):1126-1136. |
条目包含的文件 | ||||||
文件名称/大小 | 文献类型 | 版本类型 | 开放类型 | 使用许可 |
修改评论
除非特别说明,本系统中所有内容都受版权保护,并保留所有权利。