Multi-Agent Reinforcement Learning for Dynamic Pricing: Balancing Profitability, Stability and Fairness

📄 Summary

Dynamic pricing in competitive retail markets requires strategies that adapt to fluctuating demand and competitor behavior. This research presents a systematic empirical evaluation of multi-agent reinforcement learning (MARL) approaches, specifically MAPPO and MADDPG, for dynamic price optimization under competition. The algorithms are benchmarked in a simulated marketplace environment derived from real-world retail data against an Independent DDPG (IDDPG) baseline, a widely used independent learner in the MARL literature. The evaluation covers profit performance, stability across random seeds, fairness, and training efficiency. Results indicate that MAPPO consistently achieves the highest average returns with low variance, offering a stable and reproducible solution.
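The summary names the evaluation axes but not how each is computed. The sketch below is one plausible way to score profit, seed stability, and fairness from per-agent profits. It is an illustration under assumptions, not the study's method: the function names, the input layout, and the use of Jain's fairness index are all hypothetical choices, and the numbers in the usage example are made up.

```python
import statistics


def jain_index(values):
    """Jain's fairness index: 1.0 when all agents earn the same,
    1/n when a single agent earns everything.
    Assumption: values are non-negative profits."""
    total = sum(values)
    squares = sum(v * v for v in values)
    if squares == 0:
        return 1.0  # all agents earned zero, treated here as equal
    return (total * total) / (len(values) * squares)


def summarize_runs(profits_by_seed):
    """profits_by_seed: one list per random seed, holding one profit per agent
    (hypothetical layout)."""
    totals = [sum(agent_profits) for agent_profits in profits_by_seed]
    return {
        # Profit performance: mean total profit across seeds.
        "mean_profit": statistics.mean(totals),
        # Stability: sample standard deviation across seeds (0 for one seed).
        "std_profit": statistics.stdev(totals) if len(totals) > 1 else 0.0,
        # Fairness: Jain's index averaged over seeds.
        "mean_fairness": statistics.mean(jain_index(p) for p in profits_by_seed),
    }


if __name__ == "__main__":
    # Made-up profits for two seeds and two agents.
    print(summarize_runs([[10.0, 10.0], [20.0, 0.0]]))
```

A lower `std_profit` at a comparable `mean_profit` corresponds to the "stable and reproducible" behavior the summary attributes to MAPPO. Training efficiency is not covered by this sketch.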
