Multi-Agent Reinforcement Learning for Dynamic Pricing: Balancing Profitability, Stability, and Fairness

Source: Multi-Agent Reinforcement Learning for Dynamic Pricing: Balancing Profitability, Stability and Fairness

Published: March 19, 2026

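📄 Summary (translated from Chinese)

Dynamic pricing in competitive retail markets requires strategies that adapt to demand fluctuations and competitor behavior. The study systematically evaluates multi-agent reinforcement learning (MARL) methods, specifically MAPPO and MADDPG, for dynamic price optimization in a competitive setting. Using a simulated market environment built from real retail data, it benchmarks these algorithms against an Independent DDPG (IDDPG) baseline, an independent learner widely used in the MARL literature. The evaluation metrics are profit performance, stability across random seeds, fairness, and training efficiency. The results show that MAPPO performs well on both average return and variance, giving stable and reproducible results.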

🏷️ Related Tags

#DynamicPricing #MultiAgentReinforcementLearning #MAPPO #MADDPG #CompetitiveMarkets

📄 English Summary

Multi-Agent Reinforcement Learning for Dynamic Pricing: Balancing Profitability, Stability and Fairness

Dynamic pricing in competitive retail markets requires strategies that adapt to fluctuating demand and competitor behavior. This research presents a systematic empirical evaluation of multi-agent reinforcement learning (MARL) approaches, specifically MAPPO and MADDPG, for dynamic price optimization under competition. A simulated marketplace environment derived from real-world retail data is used to benchmark these algorithms against an Independent DDPG (IDDPG) baseline, a widely used independent learner in the MARL literature. The evaluation covers profit performance, stability across random seeds, fairness, and training efficiency. Results indicate that MAPPO consistently achieves the highest average returns with low variance, offering a stable and reproducible solution.
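
The summary does not say how the stability and fairness criteria are computed. As a rough illustration only, the sketch below shows one common way such an evaluation could be aggregated: mean and sample standard deviation of total profit across random seeds for stability, and Jain's fairness index over per-agent profits for fairness. The function name `summarize_across_seeds`, the input layout, and the choice of Jain's index are all assumptions made for this example, not the paper's stated method, and the numbers in the usage lines are placeholders rather than reported results.

```python
import numpy as np


def summarize_across_seeds(profits_by_seed):
    """Aggregate per-agent profits collected over several training seeds.

    profits_by_seed: array-like of shape (n_seeds, n_agents), one row per
    random seed. Hypothetical layout chosen for this sketch. Profits are
    assumed non-negative with at least one positive entry per seed, as
    Jain's index requires.
    """
    profits = np.asarray(profits_by_seed, dtype=float)
    n_agents = profits.shape[1]

    # Total market profit for each seed.
    total_per_seed = profits.sum(axis=1)

    # Jain's fairness index per seed: (sum x)^2 / (n * sum x^2).
    # It equals 1.0 when all agents earn the same and 1/n when one agent
    # earns everything.
    jain_per_seed = total_per_seed ** 2 / (n_agents * (profits ** 2).sum(axis=1))

    return {
        "mean_return": float(total_per_seed.mean()),
        # Sample standard deviation across seeds (ddof=1); lower means
        # more reproducible training outcomes.
        "std_return": float(total_per_seed.std(ddof=1)),
        "mean_jain": float(jain_per_seed.mean()),
    }


if __name__ == "__main__":
    # Placeholder profits for 3 seeds and 2 agents, not data from the paper.
    example = [[10.0, 9.0], [11.0, 9.5], [10.5, 9.2]]
    print(summarize_across_seeds(example))
```

Running the same summary for each algorithm (for instance MAPPO, MADDPG, and IDDPG) on identical seeds would give directly comparable profit, stability, and fairness figures, under the assumptions above.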
