Multi-Modal Monocular Endoscopic Depth and Pose Estimation with Edge-Guided Self-Supervision

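📄 Chinese Summary (translated)

Monocular depth and pose estimation play an important role in the development of colonoscopy-assisted navigation and can improve screening by reducing blind spots, lowering the risk of missed or recurrent lesions, and reducing the likelihood of incomplete examinations. The task remains difficult, however, because of texture-less surfaces, complex illumination patterns, deformation, and the lack of reliable in-vivo datasets. PRISM (Pose-Refinement with Intrinsic Shading and edge Maps), a self-supervised learning framework, is proposed; it uses anatomical and illumination priors to guide geometric learning. The method combines edge detection with luminance decoupling to improve the accuracy of depth and pose estimation.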

📄 English Summary

Multi-Modal Monocular Endoscopic Depth and Pose Estimation with Edge-Guided Self-Supervision

Monocular depth and pose estimation play a crucial role in advancing colonoscopy-assisted navigation: they improve screening by reducing blind spots, lowering the risk of missed or recurrent lesions, and making incomplete examinations less likely. The task remains challenging because of texture-less surfaces, complex illumination patterns, deformation, and the lack of reliable in-vivo datasets with ground truth. PRISM (Pose-Refinement with Intrinsic Shading and edge Maps) is a self-supervised learning framework that leverages anatomical and illumination priors to guide geometric learning. It incorporates edge detection and luminance decoupling to improve the accuracy of depth and pose estimation.
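To make the two priors named in the summary more concrete, here is a minimal sketch of what a luminance channel and an edge map look like when extracted from an RGB frame. It is not the PRISM pipeline: the summary does not say which edge detector or decomposition the framework uses, so the Rec. 601 luma weights, the Sobel filter, and the function names `luminance` and `sobel_edges` are all assumptions chosen as simple stand-ins.

```python
import numpy as np


def luminance(rgb):
    """Return the Rec. 601 luma of an H x W x 3 float image in [0, 1].

    Assumption: a plain weighted sum stands in for "luminance decoupling".
    """
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights


def sobel_edges(gray):
    """Return a gradient-magnitude edge map from 3x3 Sobel filters.

    Assumption: Sobel stands in for whatever edge detector the framework uses.
    The image is edge-padded so the output keeps the input's H x W shape.
    """
    p = np.pad(gray, 1, mode="edge")
    # Horizontal gradient: right column minus left column, centre row doubled.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical gradient: bottom row minus top row, centre column doubled.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)


if __name__ == "__main__":
    # Synthetic frame: dark left half, bright right half.
    frame = np.zeros((4, 6, 3))
    frame[:, 3:, :] = 1.0
    edges = sobel_edges(luminance(frame))
    print(edges.round(2))
```

On the synthetic frame, the edge map responds only in the two columns on either side of the brightness step. In a self-supervised setting, maps like these could serve as auxiliary inputs or loss weights, though how PRISM actually consumes them is not described here.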
