Update README.md
README.md
CHANGED
license: apache-2.0
---

(quants uploading, examples to be added)

<h2>Gemma-3-1b-it-MAX-NEO-Imatrix-GGUF</h2>

Google's newest Gemma-3 model with NEO Imatrix and maxed-out quants.

Recommended quants for creative use: IQ3s / IQ4_XS / Q4s give the best results.

Recommended quants for general usage: Q5s / Q6 / Q8.

Q8 is a "maxed" quant only, as the imatrix has no effect on this quant.

Note that IQ1 performance is low, whereas IQ2s are passable.
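
Since the examples section is still to be added, here is a minimal usage sketch with llama-cpp-python. The GGUF filename, context size, and sampler settings below are illustrative assumptions, not the repo's final filenames or recommended settings.

```python
# Minimal llama-cpp-python sketch; the GGUF filename is a placeholder and
# should be replaced with whichever quant you download from this repo.
from llama_cpp import Llama

llm = Llama(
    model_path="Gemma-3-1b-it-MAX-NEO-Imatrix-IQ4_XS.gguf",  # hypothetical filename
    n_ctx=4096,        # context window to allocate
    n_gpu_layers=-1,   # offload all layers to GPU if available; use 0 for CPU-only
)

out = llm(
    "Write a short scene set in a lighthouse during a storm.",
    max_tokens=256,
    temperature=0.8,   # a higher temperature to match the creative-quant recommendation
)
print(out["choices"][0]["text"])
```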

"MAXED"

This means the embed and output tensors are set to BF16 (full precision) for all quants.

This enhances quality, depth and general performance at the cost of a slightly larger quant.
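
For readers curious how a "maxed" quant differs mechanically, the sketch below shows one way to pin the embed and output tensors at higher precision while quantizing with llama.cpp's llama-quantize tool. The flag names reflect recent llama.cpp builds, and the file names and quant type are placeholders; this illustrates the idea and is not the exact recipe used for these files.

```python
# Sketch only: producing a "maxed" quant, where the token-embedding and output
# tensors are kept at BF16 while the rest of the model is quantized.
# Flag availability and accepted type names vary by llama.cpp build.
import subprocess

subprocess.run(
    [
        "llama-quantize",
        "--token-embedding-type", "bf16",   # keep the embed tensor at BF16
        "--output-tensor-type", "bf16",     # keep the output tensor at BF16
        "gemma-3-1b-it-f16.gguf",           # hypothetical full-precision source
        "gemma-3-1b-it-Q4_K_M-max.gguf",    # hypothetical output filename
        "Q4_K_M",
    ],
    check=True,
)
```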

"NEO IMATRIX"

A strong imatrix dataset built in-house by David_AU.
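
As background on how an imatrix is produced and applied in llama.cpp, here is a rough two-step sketch. The calibration file name stands in for the NEO dataset (which is in-house and not included here), and all paths are placeholders.

```python
# Sketch only: generating importance-matrix statistics and feeding them into
# quantization with llama.cpp. "neo_imatrix_dataset.txt" is a placeholder for
# David_AU's in-house calibration text.
import subprocess

# 1) Collect importance-matrix statistics from a calibration text.
subprocess.run(
    ["llama-imatrix",
     "-m", "gemma-3-1b-it-f16.gguf",       # hypothetical full-precision source
     "-f", "neo_imatrix_dataset.txt",      # placeholder calibration text
     "-o", "neo_imatrix.dat"],
    check=True,
)

# 2) Apply the statistics during quantization so low-bit quants retain quality.
subprocess.run(
    ["llama-quantize",
     "--imatrix", "neo_imatrix.dat",
     "gemma-3-1b-it-f16.gguf",
     "gemma-3-1b-it-IQ4_XS-imat.gguf",     # hypothetical output filename
     "IQ4_XS"],
    check=True,
)
```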

- more to follow -