cooleel's Collections
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction
Paper • 2410.21169 • Published • 30
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture
Paper • 2409.02889 • Published • 54
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding
Paper • 2411.04952 • Published • 29
Contextual Document Embeddings
Paper • 2410.02525 • Published • 22
PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling
Paper • 2410.05970 • Published
READoc: A Unified Benchmark for Realistic Document Structured Extraction
Paper • 2409.05137 • Published
Xmodel-1.5: An 1B-scale Multilingual LLM
Paper • 2411.10083 • Published • 14
M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework
Paper • 2411.06176 • Published • 45
CC1984/mall_receipt_extraction_dataset
Viewer • Updated • 1.8k • 144 • 1
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation
Paper • 2412.10704 • Published • 15
DoPTA: Improving Document Layout Analysis using Patch-Text Alignment
Paper • 2412.12902 • Published
Predicting the Original Appearance of Damaged Historical Documents
Paper • 2412.11634 • Published • 4
SynFinTabs: A Dataset of Synthetic Financial Tables for Information and Table Extraction
Paper • 2412.04262 • Published • 5
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 52
A Token-level Text Image Foundation Model for Document Understanding
Paper • 2503.02304 • Published • 4
More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG
Paper • 2503.04388 • Published • 15
SAGE: A Framework of Precise Retrieval for RAG
Paper • 2503.01713 • Published • 5
IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval
Paper • 2503.04644 • Published • 20