Excellent for its size!

#1 · opened by MurphyGM

There's a distinct lack of sparse MoE models in the current generation of "lite" LLMs. OLMoE is on par with 7B models while being many times faster, just what the doctor ordered.

I would really like a 3B/21B variant! If there were one and the quality extrapolated, I'd probably use it over any other local model.

MurphyGM changed discussion title from "Good for its size" to "Excellent for its size!"

Hey @MurphyGM, glad to hear you're excited about OLMoE. A 3B or 21B variant sounds interesting; I'll pass it along to the team.

amanrangapur changed discussion status to closed

Thanks! I believe there is great potential in small MoE models, and they fill a niche not covered by any current model (other than DeepSeek-V2-Lite), so an OLMoE variant with twice or triple the parameters would be very welcome.
