Preference datasets

Updated Jan 8

  • trl-lib/hh-rlhf-helpful-base

    Updated Jan 8 • 46.2k rows • 330 downloads

  • trl-lib/lm-human-preferences-descriptiveness

    Updated Jan 8 • 6.26k rows • 87 downloads • 1 like

  • trl-lib/lm-human-preferences-sentiment

    Updated Jan 8 • 6.26k rows • 98 downloads

  • trl-lib/rlaif-v

    Updated Jan 8 • 83.1k rows • 238 downloads • 3 likes

  • trl-lib/tldr-preference

    Updated Jan 8 • 179k rows • 575 downloads • 2 likes

  • trl-lib/ultrafeedback_binarized

    Updated Sep 12, 2024 • 63.1k rows • 6.37k downloads • 16 likes