papers

SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights

Paper • 2410.09008 • Published Oct 11, 2024 • 17

Browse and submit large language model evaluations

onnx-community/Kokoro-82M-ONNX

Text-to-Speech • Updated Feb 7 • 20.4k • 130

HuggingFaceTB/SmolVLM2-256M-Video-Instruct

Image-Text-to-Text • Updated 13 days ago • 6.25k • 42

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Paper • 2502.14768 • Published 27 days ago • 46

homebrewltd/AlphaMaze-v0.2-1.5B

Text Generation • Updated 23 days ago • 2.48k • 91

qihoo360/TinyR1-32B-Preview

Text Generation • Updated 9 days ago • 6.78k • 321

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

Paper • 2502.16894 • Published 23 days ago • 27

microsoft/Phi-4-multimodal-instruct

Automatic Speech Recognition • Updated 6 days ago • 696k • 1.18k

microsoft/Magma-8B

Image-Text-to-Text • Updated 14 days ago • 13.9k • 335

Physics of Language Models: Part 1, Context-Free Grammar

Paper • 2305.13673 • Published May 23, 2023 • 7

LoRA: Low-Rank Adaptation of Large Language Models

Paper • 2106.09685 • Published Jun 17, 2021 • 37

Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems

Paper • 2408.16293 • Published Aug 29, 2024 • 26

Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

Paper • 2407.20311 • Published Jul 29, 2024 • 5

Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws

Paper • 2404.05405 • Published Apr 8, 2024 • 10

Reverse Training to Nurse the Reversal Curse

Paper • 2403.13799 • Published Mar 20, 2024 • 13

Physics of Language Models: Part 3.2, Knowledge Manipulation

Paper • 2309.14402 • Published Sep 25, 2023 • 7

Physics of Language Models: Part 3.1, Knowledge Storage and Extraction

Paper • 2309.14316 • Published Sep 25, 2023 • 8

yale-nlp/FOLIO

Viewer • Updated Dec 21, 2023 • 1.2k • 1.82k • 37

Qwen/QwQ-32B

Text Generation • Updated 8 days ago • 476k • 2.37k

DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks

Paper • 2502.17157 • Published 23 days ago • 51

secemp9/TraceBack-12b

Text Generation • Updated 5 days ago • 1.14k • 26

saytes/SoT_DistilBERT

Text Classification • Updated 8 days ago • 1.41k • 2

Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning

Paper • 2503.05641 • Published 12 days ago • 1

ds4sd/SmolDocling-256M-preview

Image-Text-to-Text • Updated about 16 hours ago • 7.67k • 458

Open FinLLM Leaderboard