ModernBERT or DeBERTaV3? Examining Architecture and Data Influence on Transformer Encoder Models Performance Paper • 2504.08716 • Published 22 days ago • 10
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 149
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 149
Multitask Prompted Training Enables Zero-Shot Task Generalization Paper • 2110.08207 • Published Oct 15, 2021 • 2
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 31
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 149
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 149
CamemBERT 2.0: A Smarter French Language Model Aged to Perfection Paper • 2411.08868 • Published Nov 13, 2024 • 13
Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling Paper • 2409.14683 • Published Sep 23, 2024 • 11
Harvesting Textual and Structured Data from the HAL Publication Repository Paper • 2407.20595 • Published Jul 30, 2024 • 22
Three Bricks to Consolidate Watermarks for Large Language Models Paper • 2308.00113 • Published Jul 26, 2023 • 14