MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation

📄 Summary

Large Language Model (LLM) agents exhibit impressive capabilities across a wide range of tasks, yet they often struggle to adapt in non-stationary environments that provide feedback. While in-context learning and external memory offer some flexibility, they do not internalize the adaptive skills needed for long-term improvement. Meta-Reinforcement Learning (meta-RL) offers an alternative by embedding the learning process within the model itself. However, existing meta-RL approaches for LLMs focus primarily on exploration in single-agent settings, overlooking the strategic exploitation required in multi-agent scenarios. MAGE is a meta-RL framework designed to equip LLM agents with both strategic exploration and exploitation, leveraging multiple strategies to improve their performance in complex environments.
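The exploration–exploitation tradeoff at the core of the abstract can be made concrete with a classic multi-armed-bandit sketch. The UCB1 policy below is purely illustrative and is not the MAGE algorithm or any method from the paper: it shows how an agent can start by exploring (an uncertainty bonus dominates) and gradually shift to exploiting the arm with the best observed reward. All names (`ucb1_bandit`, `true_means`, `c`) are assumptions for this example.

```python
import math
import random

def ucb1_bandit(true_means, horizon, c=2.0, seed=0):
    """Run a UCB1 bandit: after trying each arm once, pick the arm
    maximizing (empirical mean + sqrt(c * ln t / n)). The bonus term
    shrinks as an arm is sampled more, so play drifts from
    exploration toward exploitation of the best arm."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k    # pulls per arm
    sums = [0.0] * k    # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initial round-robin: try every arm once
        else:
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(c * math.log(t) / counts[a]),
            )
        # Bernoulli reward drawn from the arm's (hidden) true mean
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts

counts = ucb1_bandit([0.2, 0.5, 0.8], horizon=2000)
```

Over 2000 steps the arm with true mean 0.8 accumulates the large majority of pulls, while the weaker arms are sampled only enough to rule them out; meta-RL methods aim to have the agent internalize this kind of adaptive behavior rather than hard-code it.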

Powered by Cloudflare Workers + Payload CMS + Claude 3.5

Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, etc.