VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI Paper • 2410.11623 • Published Oct 15, 2024 • 49
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information? Paper • 2412.02611 • Published Dec 3, 2024 • 24
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents Paper • 2410.10594 • Published Oct 14, 2024 • 27
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations Paper • 2305.14233 • Published May 23, 2023 • 6
Won't Get Fooled Again: Answering Questions with False Premises Paper • 2307.02394 • Published Jul 5, 2023
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework Paper • 2408.01262 • Published Aug 2, 2024 • 1