LatentAM: Real-Time, Large-Scale Latent Gaussian Attention Mapping via Online Dictionary Learning

Source: LatentAM: Real-Time, Large-Scale Latent Gaussian Attention Mapping via Online Dictionary Learning

Published: February 16, 2026

🏷️ Related Tags

#LatentGaussianMapping #OnlineDictionaryLearning #RoboticPerception

📄 English Summary

LatentAM is an online 3D Gaussian Splatting (3DGS) mapping framework that builds scalable latent feature maps from streaming RGB-D observations for open-vocabulary robotic perception. Instead of distilling high-dimensional Vision-Language Model (VLM) embeddings through a model-specific decoder, it uses online dictionary learning. The approach is model-agnostic and needs no pretraining, so it can be integrated with different VLMs at test time in a plug-and-play fashion. Specifically, each Gaussian primitive is associated with a compact query vector, which an attention mechanism over a learnable dictionary converts into an approximate VLM embedding. The dictionary is initialized efficiently from streaming observations, which supports real-time processing.
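
To make the query-to-embedding step concrete, here is a minimal NumPy sketch of attention over a dictionary. It is an illustration under stated assumptions, not the paper's implementation. The summary does not specify the attention form, so the split of the dictionary into `keys` and `values`, the scaled dot-product softmax, and every dimension (`N`, `d`, `K`, `D`) are hypothetical choices, and random arrays stand in for learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def decode_embeddings(queries, keys, values):
    """Map compact per-Gaussian queries to approximate high-dimensional embeddings.

    queries: (N, d) one compact vector per Gaussian primitive
    keys:    (K, d) dictionary keys (assumed layout, not from the source)
    values:  (K, D) dictionary atoms in the VLM embedding space (assumed layout)
    returns: (N, D) approximate embeddings
    """
    # Scaled dot-product scores between each query and each dictionary key.
    scores = queries @ keys.T / np.sqrt(queries.shape[1])
    # Each row of weights is a distribution over the K dictionary entries.
    weights = softmax(scores, axis=1)
    # Each output is a convex combination of the dictionary atoms.
    return weights @ values

# Hypothetical sizes: 5 Gaussians, 16-dim queries, 64 atoms, 512-dim embeddings.
rng = np.random.default_rng(0)
N, d, K, D = 5, 16, 64, 512
queries = rng.normal(size=(N, d))
keys = rng.normal(size=(K, d))
values = rng.normal(size=(K, D))

emb = decode_embeddings(queries, keys, values)
print(emb.shape)
```

In this sketch only the `d`-dimensional query is stored per Gaussian, while the `D`-dimensional atoms are shared across the map. Swapping in a VLM with a different embedding width would change only the `values` matrix, which is one way to read the model-agnostic claim.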
