OptiML: An End-to-End Framework for Program Synthesis and CUDA Kernel Optimization

Source: OptiML: An End-to-End Framework for Program Synthesis and CUDA Kernel Optimization

Published: February 16, 2026

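📄 Summary (translated from Chinese)

Generating high-performance CUDA kernels is challenging because it requires navigating a combinatorial space of low-level transformations under noisy and expensive hardware feedback. Although large language models can synthesize functionally correct CUDA code, achieving competitive performance requires systematic exploration and verification of optimization choices. OptiML is proposed as an end-to-end framework that maps either natural-language intent or input CUDA code to performance-optimized CUDA kernels by formulating kernel optimization as search under verification. OptiML consists of two decoupled stages. When the input is natural language, a Mixture-of-Thoughts generator (OptiML-G) acts as a proposal policy over kernel implementations.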

🏷️ Related Tags

#CUDAKernels #ProgramSynthesis #PerformanceOptimization #MachineLearning

📄 English Summary

Generating high-performance CUDA kernels remains a challenge due to the need to navigate a combinatorial space of low-level transformations under noisy and expensive hardware feedback. While large language models can synthesize functionally correct CUDA code, achieving competitive performance requires systematic exploration and verification of optimization choices. This research presents OptiML, an end-to-end framework that maps either natural-language intent or input CUDA code to performance-optimized CUDA kernels by formulating kernel optimization as search under verification. OptiML consists of two decoupled stages. When the input is natural language, a Mixture-of-Thoughts generator (OptiML-G) acts as a proposal policy over kernel implementation strategies.
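To make the phrase "search under verification" concrete, here is a minimal sketch in Python. It is not OptiML's algorithm, which the summary does not spell out. It only illustrates the general pattern: a candidate is checked against a trusted reference before its cost is measured, and noisy measurements are summarized with a median. Every name here (`verify`, `make_noisy_timer`, `search_under_verification`, the candidate functions, and the `TRUE_COST` table) is hypothetical, plain Python functions stand in for CUDA kernels, and the timer is simulated rather than read from hardware.

```python
import math
import random
import statistics
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# A "kernel" here is just a Python function; a stand-in for a compiled CUDA kernel.
Kernel = Callable[[Sequence[float]], List[float]]


def reference_scale(xs: Sequence[float]) -> List[float]:
    """Trusted baseline: multiply every element by 2."""
    return [2.0 * x for x in xs]


def candidate_doubled(xs: Sequence[float]) -> List[float]:
    """Hypothetical optimized variant: same result via addition."""
    return [x + x for x in xs]


def candidate_buggy(xs: Sequence[float]) -> List[float]:
    """Hypothetical 'fast' variant that is wrong: it drops the last element."""
    return [2.0 * x for x in xs[:-1]]


def verify(candidate: Kernel, reference: Kernel,
           trials: int = 5, size: int = 64, seed: int = 0) -> bool:
    """Compare candidate and reference outputs on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(size)]
        got, want = candidate(xs), reference(xs)
        if len(got) != len(want):
            return False
        if not all(math.isclose(g, w, rel_tol=1e-9, abs_tol=1e-12)
                   for g, w in zip(got, want)):
            return False
    return True


def make_noisy_timer(true_cost: Dict[str, float],
                     noise: float = 0.03, seed: int = 1) -> Callable[[str], float]:
    """Simulated hardware feedback: the true cost plus multiplicative Gaussian noise."""
    rng = random.Random(seed)

    def measure(name: str) -> float:
        return true_cost[name] * (1.0 + rng.gauss(0.0, noise))

    return measure


def search_under_verification(candidates: Dict[str, Kernel], reference: Kernel,
                              measure: Callable[[str], float],
                              repeats: int = 7) -> Tuple[Optional[str], float]:
    """Return the cheapest candidate among those that pass verification."""
    best_name, best_cost = None, float("inf")
    for name, kernel in candidates.items():
        if not verify(kernel, reference):
            continue  # never time or accept an incorrect candidate
        # The median of repeated samples damps measurement noise.
        cost = statistics.median(measure(name) for _ in range(repeats))
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost


# Simulated costs (assumed values, arbitrary units); the buggy variant looks fastest.
TRUE_COST = {"reference": 1.0, "doubled": 0.6, "buggy": 0.1}
CANDIDATES = {
    "reference": reference_scale,
    "doubled": candidate_doubled,
    "buggy": candidate_buggy,
}

best, best_cost = search_under_verification(
    CANDIDATES, reference_scale, make_noisy_timer(TRUE_COST))
print(best)
```

In this toy setup the buggy candidate has the lowest simulated cost but is rejected by `verify`, so the search settles on the correct variant. A real system would replace the simulated timer with on-device benchmarking and the exhaustive loop with a guided proposal policy.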

Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, and others
