Rare Disease Detection from Capsule Endoscopic Videos Based on Vision Transformers

📄 English Summary

RARE disease detection from Capsule Endoscopic Videos based on Vision Transformers

This work addresses multi-label classification of capsule endoscopic videos (CEV) as part of the Gastro Competition. A Transformer-based deep learning network was fine-tuned for the task, using the Google Vision Transformer (ViT) as the base model with a batch size of 16 and an input resolution of 224 x 224. The model classifies 17 labels: mouth, esophagus, stomach, small intestine, colon, z-line, pylorus, ileocecal valve, active bleeding, angiectasia, blood, erosion, erythema, hematin, lymphangioectasis, polyp, and ulcer. On the test dataset of three videos, the overall mAP@0.5 is 0.0205 and the overall mAP@0.95 is 0.0196.
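
To make the multi-label setup concrete, here is a minimal sketch in plain Python of how 17 per-label outputs could be decoded for one frame. It is not the authors' code. The function names, the 0.5 decision threshold, and the 16-pixel patch size are assumptions for illustration; the summary only states the label set, the batch size, and the 224 x 224 input resolution.

```python
import math

# The 17 labels listed in the summary, in the order given there.
LABELS = [
    "mouth", "esophagus", "stomach", "small intestine", "colon",
    "z-line", "pylorus", "ileocecal valve", "active bleeding",
    "angiectasia", "blood", "erosion", "erythema", "hematin",
    "lymphangioectasis", "polyp", "ulcer",
]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_multilabel(logits, threshold=0.5):
    """Return every label whose sigmoid score reaches the threshold.

    Each label is scored independently, so a frame can carry several
    labels at once (for example an organ and a finding) or none.
    The threshold of 0.5 is an assumption, not a value from the source.
    """
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    return [
        label
        for label, logit in zip(LABELS, logits)
        if sigmoid(logit) >= threshold
    ]


def vit_token_count(image_size=224, patch_size=16):
    """Sequence length a ViT sees: one token per patch plus a class token.

    patch_size=16 is an assumption; the summary does not name the variant.
    """
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + 1


# Hypothetical logits for one frame: strong "stomach", weaker "polyp".
example_logits = [-5.0] * len(LABELS)
example_logits[LABELS.index("stomach")] = 3.0
example_logits[LABELS.index("polyp")] = 1.0
predicted = decode_multilabel(example_logits)
```

Scoring each label with its own sigmoid, rather than a single softmax over all labels, is what lets anatomical landmarks and pathological findings be reported together for the same frame.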
