InterPReT: Interactive Policy Restructuring and Training Enable Effective Imitation Learning from Laypersons

📄 Summary (translated from Chinese)

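Imitation learning holds great promise for robotics, but learning from layperson demonstrations is difficult, mainly because of poor demonstration quality, poorly structured policies, and a lack of effective feedback. This paper proposes the InterPReT framework, which addresses these difficulties through interactive policy restructuring and training. InterPReT lets users review and modify the policy structure after demonstrating, for example by adding or removing subtasks, and provide feedback to correct erroneous behavior. The framework uses large language models (LLMs) to translate user feedback into executable policy modifications and applies reward-based reinforcement learning (RL) to optimize the policy. Experiments show that InterPReT significantly improves the success rate and efficiency of learning from layperson demonstrations and performs well across a range of robotic manipulation tasks. A user study further confirms its usability and effectiveness: non-expert users can teach robots complex skills easily, which advances the real-world application of imitation learning.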

📄 English Summary

InterPReT: Interactive Policy Restructuring and Training Enable Effective Imitation Learning from Laypersons

Imitation learning holds great promise for robotics, yet learning from layperson demonstrations is hard for three reasons: suboptimal demonstration quality, poorly structured policies, and the absence of effective feedback mechanisms. This paper introduces InterPReT, an interactive policy restructuring and training framework designed to address these issues. InterPReT lets users review and modify the policy structure after giving demonstrations, so they can add or remove subtasks and provide corrective feedback on erroneous behaviors. The framework uses large language models (LLMs) to translate user feedback into actionable policy modifications and employs reward-based reinforcement learning (RL) for policy optimization. Experimental results show that InterPReT substantially improves the success rate and efficiency of learning from layperson demonstrations across a variety of robotic manipulation tasks. A user study further validates its usability and effectiveness: non-expert users can teach complex skills to robots with little effort. By making imitation learning more accessible and robust, the framework supports its broader use in real-world settings.
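To make the restructuring step more concrete, here is a minimal sketch of how structured edits (add or remove a subtask) might be applied to a policy structure. It is an illustration only, not the paper's implementation: `PolicyStructure`, `apply_edits`, the edit dictionary format, and the subtask names are all hypothetical, and both the LLM step that would produce the edits from user feedback and the RL optimization are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PolicyStructure:
    """Hypothetical policy structure: an ordered list of named subtasks."""
    subtasks: List[str] = field(default_factory=list)

    def add_subtask(self, name: str, index: Optional[int] = None) -> None:
        # Insert at the given position, or append when no position is given.
        if name in self.subtasks:
            raise ValueError(f"subtask already present: {name}")
        if index is None:
            self.subtasks.append(name)
        else:
            self.subtasks.insert(index, name)

    def remove_subtask(self, name: str) -> None:
        if name not in self.subtasks:
            raise ValueError(f"unknown subtask: {name}")
        self.subtasks.remove(name)


def apply_edits(policy: PolicyStructure, edits: List[Dict]) -> PolicyStructure:
    """Apply structured edits in order, modifying the policy in place.

    The edit format ({"op": ..., "name": ..., "index": ...}) is an assumption
    made for this sketch; in the framework described above, an LLM would
    translate free-form user feedback into some such structured form.
    """
    for edit in edits:
        op = edit["op"]
        if op == "add":
            policy.add_subtask(edit["name"], edit.get("index"))
        elif op == "remove":
            policy.remove_subtask(edit["name"])
        else:
            raise ValueError(f"unsupported edit op: {op}")
    return policy


# Usage: a user asks to align the gripper before grasping and to drop the lift step.
policy = PolicyStructure(["reach", "grasp", "lift"])
apply_edits(policy, [
    {"op": "add", "name": "align", "index": 1},
    {"op": "remove", "name": "lift"},
])
print(policy.subtasks)
```

Keeping the edits as plain data, separate from the code that applies them, would make it easy to show a proposed change to the user for confirmation before the policy is retrained.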
