Leaderboard benchmark?

by djuna - opened Feb 8

djuna

Feb 8

I'm curious how it compares to the original Mistral Small.

PSM24

Feb 10

@ehartford I evaluated this model on MATH500:
Dolphin3-R1: 87%
Mistral-Small3: 70%

Cognitive Computations org Feb 10

I'm not engaging with the Hugging Face leaderboard.
They can eval it, or not. Not my concern.

ehartford changed discussion status to closed Feb 10
