Flow-Judge-v0.1 out-of-domain evaluation datasets

updated Sep 13, 2024

This collection contains the out-of-domain datasets used to evaluate the generalization capabilities of Flow-Judge-v0.1.

  • flowaicom/Feedback-Bench

    Viewer • Updated Sep 14, 2024 • 1k • 19

  • flowaicom/HaluEval

    Viewer • Updated Sep 14, 2024 • 10k • 85

  • flowaicom/PubMedQA

    Viewer • Updated Sep 14, 2024 • 1k • 16

  • flowaicom/covid_qa

    Viewer • Updated Sep 14, 2024 • 1k • 23

  • flowaicom/RAGTruth_test

    Viewer • Updated Sep 14, 2024 • 2.7k • 40 • 1
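
The datasets listed above can be fetched programmatically. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that each repository loads with its default configuration; the collection page does not state split or column names, so none are assumed here. The helper names `DATASET_IDS`, `dataset_url`, and `load_collection` are hypothetical and used only for illustration.

```python
from typing import Dict

# Repository IDs copied from the collection listing above.
DATASET_IDS = [
    "flowaicom/Feedback-Bench",
    "flowaicom/HaluEval",
    "flowaicom/PubMedQA",
    "flowaicom/covid_qa",
    "flowaicom/RAGTruth_test",
]


def dataset_url(repo_id: str) -> str:
    """Build the Hub page URL for a dataset repository."""
    return f"https://huggingface.co/datasets/{repo_id}"


def load_collection() -> Dict[str, object]:
    """Download every dataset in the collection (needs network access).

    Assumption: each repository loads with its default configuration;
    split and column names are not stated on the collection page.
    """
    # Imported lazily so the URL helper works without the library installed.
    from datasets import load_dataset

    return {repo_id: load_dataset(repo_id) for repo_id in DATASET_IDS}


if __name__ == "__main__":
    # Print the Hub page for each dataset without downloading anything.
    for repo_id in DATASET_IDS:
        print(dataset_url(repo_id))
```

Calling `load_collection()` returns one entry per repository; inspecting each entry is the safest way to discover the available splits and columns before wiring the data into an evaluation run.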