Qwen3-8B-NEO-Imatrix-Max-GGUF
NEO Imatrix quants of the new "Qwen 3 - 8B" model, with the "output tensor" at MAX (BF16) to improve reasoning and output generation.
NEO Imatrix dataset was generated in house.
The Imatrix effect grows stronger the lower the quant you use; IQ4_XS / IQ4_NL offer the best balance between quality and Imatrix effect.
These quants will also be the strongest for creative use cases.
For stronger reasoning, use higher quants.
The Q8_0 quant is output-tensor-maxed only, as the Imatrix has no effect at this quant level.
F16 is full precision.
Context Length: 32K, plus 8K output generation (can be extended to 128K).
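A minimal llama.cpp invocation matching the limits above; the GGUF filename is an assumption here, so substitute whichever quant you downloaded.

```shell
# -c sets the context window (32K); -n caps generated tokens (8K output).
# Filename below is hypothetical -- use the quant file you actually downloaded.
./llama-cli -m Qwen3-8B-NEO-Imatrix-Max-IQ4_NL.gguf -c 32768 -n 8192 \
  -p "Explain imatrix quantization in one paragraph."
```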
NOTE - Jinja Template / Template to Use with this Model:
If you are having issues with the Jinja "auto template", use the CHATML template.
OR (option for LMStudio users):
Update the Jinja template: go to the site below, open the template, copy the "Jinja template", and paste it in.
[ https://lmstudio.ai/neil/qwen3-thinking ]
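For reference, ChatML wraps each turn in `<|im_start|>ROLE ... <|im_end|>` markers. A minimal sketch of building such a prompt by hand (the helper name `to_chatml` is ours, not part of any library):

```python
def to_chatml(messages):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt string."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    # A trailing assistant header cues the model to begin its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

Frontends such as LMStudio apply this formatting for you when the CHATML template is selected; the sketch only shows what the model sees.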
Other Notes:
Reasoning is ON by default in this model, and the model will auto-generate "think" block(s).
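If you want to separate the auto-generated reasoning from the visible answer, a minimal sketch (assuming the reasoning arrives inside `<think>...</think>` tags, as Qwen3 emits it; the helper name `split_think` is ours):

```python
import re

def split_think(text):
    """Split <think>...</think> reasoning blocks out of a model response."""
    thoughts = re.findall(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    visible = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return thoughts, visible

raw = "<think>The user wants a greeting.</think>Hello there!"
thoughts, reply = split_think(raw)
print(reply)  # Hello there!
```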
For benchmarks, usage info, and settings, please see the original model card here:
[ https://huggingface.co/Qwen/Qwen3-8B ]
[ Model card, and examples to follow. ]