Optimization Instability in Autonomous Agentic Workflows for Clinical Symptom Detection

Source: Optimization Instability in Autonomous Agentic Workflows for Clinical Symptom Detection

Published: February 19, 2026


🏷️ Related Tags

#AutonomousAgents #OptimizationInstability #ClinicalSymptoms #ClassifierPerformance #COVID19

📄 Summary

Autonomous agentic workflows that iteratively refine their own behavior show great promise, yet their failure modes are not well characterized. This research investigates optimization instability, a phenomenon in which continued autonomous improvement paradoxically degrades classifier performance. Using Pythia, an open-source framework for automated prompt optimization, the authors evaluated three clinical symptoms of varying prevalence: shortness of breath (23%), chest pain (12%), and Long COVID brain fog (3%). Validation sensitivity oscillated between 1.0 and 0.0 across iterations, with the severity of the oscillation inversely proportional to class prevalence. At 3% prevalence, the system reached 95% accuracy while detecting zero positive cases. Because roughly 97% of cases at that prevalence are negative, a classifier can score high accuracy without finding a single positive, which is why this failure mode highlights the risks inherent in the optimization process.
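The gap between accuracy and sensitivity described above can be checked with a little arithmetic. The sketch below is not from the paper or from Pythia: the validation-set size of 1,000 and the individual confusion-matrix counts are hypothetical, chosen only so that they reproduce the reported 3% prevalence, 95% accuracy, and zero detected positives.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary confusion matrix."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def accuracy(c: Confusion) -> float:
    """Fraction of all cases labeled correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c: Confusion) -> float:
    """Fraction of true positive cases that were detected (recall)."""
    positives = c.tp + c.fn
    return c.tp / positives if positives else float("nan")


# Hypothetical validation set: 1,000 cases at 3% prevalence
# gives 30 positives and 970 negatives.

# A classifier that misses every positive and raises 20 false alarms.
collapsed = Confusion(tp=0, fp=20, tn=950, fn=30)

# A trivial classifier that labels every case negative.
always_negative = Confusion(tp=0, fp=0, tn=970, fn=30)

for name, c in [("collapsed", collapsed), ("always_negative", always_negative)]:
    print(f"{name}: accuracy={accuracy(c):.2f}, sensitivity={sensitivity(c):.2f}")
```

Under these assumed counts, both classifiers report accuracy of 0.95 or higher with a sensitivity of 0.0, which suggests that an optimization loop scored mainly on accuracy could settle on such a state at low prevalence without the metric revealing it.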

