leonardlin's Collections

architecture

updated May 24, 2024

  • RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

    Paper • 2404.07839 • Published Apr 11, 2024 • 48

  • Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

    Paper • 2404.05892 • Published Apr 8, 2024 • 39

  • Jamba: A Hybrid Transformer-Mamba Language Model

    Paper • 2403.19887 • Published Mar 28, 2024 • 111

  • Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

    Paper • 2402.19427 • Published Feb 29, 2024 • 57

  • Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory

    Paper • 2405.08707 • Published May 14, 2024 • 33