Leaderboard benchmark?

#5
by djuna - opened

I'm curious how it compares to the original Mistral Small.

@ehartford I evaluated the MATH500 score for this:
Dolphin3-R1: 87%
Mistral-Small3: 70%

Cognitive Computations org

I'm not engaging with the Hugging Face leaderboard.
They can eval it, or not. Not my concern.

ehartford changed discussion status to closed