Article • LeRobot goes to driving school: World’s largest open-source self-driving dataset • 6 days ago • 48
SYNTHETIC-1 • Collection • A collection of tasks & verifiers for reasoning datasets • 9 items • Updated 24 days ago • 49
Article • Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 • 27 days ago • 93
Article • From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub • Feb 12 • 49
Article • Mastering Long Contexts in LLMs with KVPress • By nvidia and 1 other • Jan 23 • 64
Article • Explore, Curate and Vector Search Any Hugging Face Dataset with Nomic Atlas • By MaxNomic and 4 others • Jan 23 • 30
high-quality Chinese training datasets • Collection • A suite of high-quality Chinese datasets used for pretraining, fine-tuning, or preference alignment, along with the models trained on them. • 13 items • Updated 5 days ago • 11
Article • Finding Moroccan Arabic (Darija) in Fineweb 2 • By omarkamali and 3 others • Dec 8, 2024 • 23