Excellent for its size!
There's a distinct lack of sparse MoE models in the current generation of "lite" LLMs. OLMoE is on par with 7B models while being many times faster, just what the doctor ordered.
I would really like a 3B/21B variant! If there were one and the quality extrapolated, I'd probably use it over any other local model.
Hey @MurphyGM , glad to hear you’re excited about OLMoE. A 3B or 21B variant sounds interesting, I’ll pass along to the team.
Thanks. I believe there is great potential in small MoE models, and they fill a niche not covered by anything currently available (other than DeepSeek-V2-Lite), so an OLMoE variant with two or three times the parameters would be very welcome.