Excellent for its size!

#1 · opened by MurphyGM

There's a distinct lack of sparse MoE models in the current generation of "lite" LLMs. OLMoE is on par with 7B models while being many times faster, just what the doctor ordered.

I would really like a 3B/21B variant! If there were one and the quality extrapolated, I'd probably use it over any other local model.

MurphyGM changed discussion title from "Good for its size" to "Excellent for its size!"

Hey @MurphyGM, glad to hear you're excited about OLMoE. A 3B or 21B variant sounds interesting; I'll pass it along to the team.

amanrangapur changed discussion status to closed

Thanks! I believe there is great potential in small MoE models, and they fill a niche not covered by any current model (other than DeepSeek-V2-Lite), so an OLMoE variant with twice or triple the parameters would be very welcome.
