ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution

📄 English Summary

ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution

The transition from sequential to parallel computing is crucial for modern high-performance applications, yet it is impeded by the steep learning curve associated with concurrent programming. This issue is particularly pronounced for irregular data structures, such as sparse graphs, unbalanced trees, and non-uniform meshes, where static scheduling fails and data dependencies are unpredictable. Current Large Language Models (LLMs) often perform poorly on these tasks, producing code that is susceptible to subtle race conditions, deadlocks, and sub-optimal scaling. ParEVO is introduced as a framework designed to synthesize high-performance parallel algorithms specifically for irregular data. Key contributions include the Parlay-Instruct Corpus, a curated dataset comprising 1,382 examples aimed at supporting this objective.
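The claim that static scheduling fails on irregular data can be made concrete with a small load-balance calculation. The sketch below is illustrative only and is not taken from ParEVO: the degree sequence is hypothetical, the helper names `static_block_loads` and `greedy_loads` are invented for this example, and the greedy heaviest-first assignment is an offline stand-in for what a dynamic scheduler tries to achieve at run time.

```python
import heapq


def static_block_loads(costs, workers):
    """Split items into contiguous equal-count blocks, one block per worker."""
    n = len(costs)
    block = (n + workers - 1) // workers  # ceil(n / workers)
    return [sum(costs[i * block:(i + 1) * block]) for i in range(workers)]


def greedy_loads(costs, workers):
    """Assign items, heaviest first, to whichever worker is least loaded so far."""
    heap = [(0, w) for w in range(workers)]  # (current load, worker id)
    heapq.heapify(heap)
    for cost in sorted(costs, reverse=True):
        load, w = heapq.heappop(heap)
        heapq.heappush(heap, (load + cost, w))
    return sorted(load for load, _ in heap)


# Hypothetical skewed workload: four hub vertices, then many low-degree vertices.
# Per-vertex cost is assumed to be proportional to its degree.
degrees = [1000, 900, 800, 700] + [1] * 996

if __name__ == "__main__":
    print("static block loads:", static_block_loads(degrees, 4))
    print("greedy loads:      ", greedy_loads(degrees, 4))
```

With four workers, the equal-count split puts all four hubs in the first block, so one worker carries 3,646 units of work while the others carry 250 each. The greedy assignment ends with every worker at 1,099. Parallel running time is bounded by the most loaded worker, which is why equal-count partitioning scales poorly on skewed inputs.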
