Models: 330

Active filters: rlhf

sileod/deberta-v3-base-tasksource-nli

Zero-Shot Classification • Updated Aug 13, 2024 • 535k • 124

sileod/deberta-v3-large-tasksource-nli

Zero-Shot Classification • Updated Feb 17, 2024 • 4.93k • 36

sileod/mdeberta-v3-base-tasksource-nli

Zero-Shot Classification • Updated Oct 19, 2023 • 44 • 18

PKU-Alignment/beaver-7b-v1.0

Reinforcement Learning • Updated May 9, 2024 • 50 • 11

PKU-Alignment/beaver-dam-7b

Updated Jul 10, 2023 • 2.36k • 8

fnlp/moss-rlhf-policy-model-7B-en

Updated Jul 17, 2023 • 2

lightonai/alfred-40b-0723

Text Generation • Updated Aug 11, 2023 • 28 • 46

TheBloke/NeuralHermes-2.5-Mistral-7B-GGUF

Updated Nov 30, 2023 • 1.44k • 51

simonveitner/MathHermes-2.5-Mistral-7B

Text Generation • Updated Dec 2, 2023 • 28 • 1

joey00072/ToxicHermes-2.5-Mistral-7B

Text Generation • Updated Dec 16, 2023 • 38 • 21

argilla/distilabeled-OpenHermes-2.5-Mistral-7B

Text Generation • Updated Jan 17, 2024 • 15 • 32

argilla/CapybaraHermes-2.5-Mistral-7B

Updated Mar 4, 2024 • 20 • 69

tasksource/deberta-small-long-nli

Zero-Shot Classification • Updated Aug 28, 2024 • 36.5k • 42

TheBloke/CapybaraHermes-2.5-Mistral-7B-GGUF

Updated Jan 31, 2024 • 7.07k • 111

TheBloke/CapybaraHermes-2.5-Mistral-7B-GPTQ

Updated Jan 31, 2024 • 508 • 57

mradermacher/distilabeled-Hermes-2.5-Mistral-7B-GGUF

Updated Dec 16, 2024 • 44 • 1

mradermacher/beaver-7b-v3.0-GGUF

Reinforcement Learning • Updated Apr 1 • 383 • 1

stanfordnlp/SteamSHP-flan-t5-xl

Text2Text Generation • Updated Oct 10, 2023 • 36 • 43

stanfordnlp/SteamSHP-flan-t5-large

Text2Text Generation • Updated Oct 10, 2023 • 40 • 33

trl-lib/llama-7b-se-peft

Updated Apr 6, 2023 • 4

sileod/deberta-v3-large-tasksource-rlhf-reward-model

Text Classification • Updated Mar 28, 2023 • 65 • 11

trl-lib/llama-7b-se-rl-peft

Updated Apr 14, 2023 • 103

trl-lib/llama-7b-se-rm-peft

Updated Apr 6, 2023 • 8

toloka/gpt2-large-rl-prompt-writing

Text Generation • Updated Apr 21, 2023 • 13 • 3

AdamG012/chat-opt-1.3b-rlhf-actor-deepspeed

Text Generation • Updated Apr 25, 2023 • 23 • 5

AdamG012/chat-opt-1.3b-rlhf-critic-deepspeed

Text Generation • Updated Apr 25, 2023 • 11 • 3

AdamG012/chat-opt-1.3b-rlhf-actor-ema-deepspeed

Text Generation • Updated Apr 25, 2023 • 8 • 8

agi-css/socially-good-lm

Text Generation • Updated May 29, 2023 • 20 • 5

agi-css/hh-rlhf-sft

Text Generation • Updated Jun 1, 2023 • 9 • 3

agi-css/better-base

Text Generation • Updated Jun 1, 2023 • 6 • 5