AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression
📄 Summary
AngelSlim is a comprehensive and versatile toolkit for large model compression developed by the Tencent Hunyuan team. It consolidates cutting-edge algorithms such as quantization, speculative decoding, token pruning, and distillation into a unified pipeline that streamlines the transition from model compression to industrial-scale deployment. For efficient acceleration, AngelSlim integrates state-of-the-art FP8 and INT8 post-training quantization (PTQ) algorithms alongside pioneering research in ultra-low-bit regimes, featuring HY-1.8B-int2 as the first industrially viable 2-bit large model. The toolkit also proposes a training-aligned speculative decoding framework compatible with a range of model architectures, further enhancing the efficiency and practicality of model compression.
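To make the PTQ idea concrete, here is a minimal sketch of symmetric per-tensor INT8 post-training quantization: weights are mapped to 8-bit integers with a single scale and dequantized on the fly. This is a generic illustration of the technique, not AngelSlim's actual API; the function names and the max-abs scale rule are assumptions for demonstration.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float weights to int8 with one symmetric per-tensor scale."""
    scale = np.abs(w).max() / 127.0          # largest magnitude maps to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, float(scale)

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from int8 values."""
    return q.astype(np.float32) * scale

np.random.seed(0)
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = float(np.abs(w - w_hat).max())         # rounding error, at most scale/2
```

Real PTQ pipelines such as those in AngelSlim additionally calibrate activations on sample data and often use per-channel scales, but the quantize/dequantize round-trip above is the core operation.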
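Speculative decoding, the second acceleration technique named above, can be sketched as a draft-and-verify loop: a cheap draft model proposes several tokens, and the target model keeps the longest agreeing prefix plus one token of its own. The toy "models" below are stand-in functions chosen only to make the control flow runnable; they are not AngelSlim's training-aligned framework.

```python
def draft_model(ctx):
    # hypothetical cheap model: predicts last token + 1
    return ctx[-1] + 1

def target_model(ctx):
    # hypothetical target model: same rule, but wraps back to 0 after 4
    nxt = ctx[-1] + 1
    return 0 if nxt > 4 else nxt

def speculative_step(ctx, k=4):
    # 1) draft k tokens autoregressively with the cheap model
    proposal, tmp = [], list(ctx)
    for _ in range(k):
        t = draft_model(tmp)
        proposal.append(t)
        tmp.append(t)
    # 2) verify: accept the longest prefix the target agrees with
    accepted, tmp = [], list(ctx)
    for t in proposal:
        if target_model(tmp) != t:
            break
        accepted.append(t)
        tmp.append(t)
    # 3) append one corrected token from the target itself
    accepted.append(target_model(tmp))
    return ctx + accepted

seq = speculative_step([0, 1], k=4)
```

Because verification only accepts tokens the target would have produced anyway, greedy output is unchanged; the speed-up comes from the target validating several draft tokens in one pass instead of generating them one at a time.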