Qwen3-8B-NEO-Imatrix-Max-GGUF
NEO Imatrix quants of the new "Qwen 3 - 8B" model, with the "output tensor" at MAX (BF16) to improve reasoning and output generation.
NEO Imatrix dataset was generated in house.
The Imatrix effect grows stronger the lower the quant you use; IQ4_XS / IQ4_NL offer the best balance between quality and Imatrix effect.
These quants will also be the strongest for creative use cases.
For stronger reasoning, use higher quants.
The Q8_0 quant is output-tensor-maxed only, as the Imatrix has no effect at this quant level.
F16 is full precision.
Context Length: 32K, plus 8K output generation (can be extended to 128K).
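A minimal llama.cpp invocation matching the limits above; the GGUF filename is an assumption here, so substitute whichever quant you downloaded.

```shell
# -c sets the context window (32K); -n caps generated tokens (8K output).
# Filename below is hypothetical -- use the quant file you actually downloaded.
./llama-cli -m Qwen3-8B-NEO-Imatrix-Max-IQ4_NL.gguf -c 32768 -n 8192 \
  -p "Explain imatrix quantization in one paragraph."
```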
NOTE - Jinja Template / Template to Use with this Model:
If you are having issues with the Jinja "auto template", use the CHATML template.
OR (option for LMStudio users):
Update the Jinja template: go to the site below, open the template, copy the "Jinja template", and paste it in.
[ https://lmstudio.ai/neil/qwen3-thinking ]
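For reference, ChatML wraps each turn in `<|im_start|>ROLE ... <|im_end|>` markers. A minimal sketch of building such a prompt by hand (the helper name `to_chatml` is ours, not part of any library):

```python
def to_chatml(messages):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt string."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    # A trailing assistant header cues the model to begin its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

Frontends such as LMStudio apply this formatting for you when the CHATML template is selected; the sketch only shows what the model sees.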
Other Notes:
Reasoning is ON by default in this model, and the model will auto-generate "think" block(s).
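If you want to separate the auto-generated reasoning from the visible answer, a minimal sketch (assuming the reasoning arrives inside `<think>...</think>` tags, as Qwen3 emits it; the helper name `split_think` is ours):

```python
import re

def split_think(text):
    """Split <think>...</think> reasoning blocks out of a model response."""
    thoughts = re.findall(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    visible = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return thoughts, visible

raw = "<think>The user wants a greeting.</think>Hello there!"
thoughts, reply = split_think(raw)
print(reply)  # Hello there!
```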
For benchmarks, usage info, and settings, please see the original model card here:
[ https://huggingface.co/Qwen/Qwen3-8B ]
[ Model card, and examples to follow. ]