imjliao's Collections
Models

Updated Apr 12, 2024
  • JetMoE: Reaching Llama2 Performance with 0.1M Dollars

    Paper • 2404.07413 • Published Apr 11, 2024 • 39

  • Rho-1: Not All Tokens Are What You Need

    Paper • 2404.07965 • Published Apr 11, 2024 • 94

  • Jamba: A Hybrid Transformer-Mamba Language Model

    Paper • 2403.19887 • Published Mar 28, 2024 • 111

  • Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

    Paper • 2404.02258 • Published Apr 2, 2024 • 106

  • Learning to Route Among Specialized Experts for Zero-Shot Generalization

    Paper • 2402.05859 • Published Feb 8, 2024 • 5