LeanQuant committed (verified)
Commit ff6c02c · 1 Parent(s): 68dce0d

Update README.md

Files changed (1): README.md (+8 −2)
README.md CHANGED
@@ -1,7 +1,13 @@
  ---
- base_model:
- - google/gemma-3-4b-it
+ base_model: google/gemma-3-4b-pt
+ base_model_relation: quantized
+ tags:
+ - dfloat11
+ - df11
+ - lossless compression
+ - 70% size, 100% accuracy
  ---
+
  ## DFloat11 Compressed Model: `google/gemma-3-4b-it`

  This is a **losslessly compressed** version of [`google/gemma-3-4b-it`](https://huggingface.co/google/gemma-3-4b-it) using our custom **DFloat11** format. The outputs of this compressed model are **bit-for-bit identical** to the original BFloat16 model, while reducing GPU memory consumption by approximately **30%**.
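The two headline numbers are consistent: storing each BFloat16 weight in roughly 11 bits instead of 16 gives 11/16 ≈ 69% of the original size, i.e. about a 30% memory reduction at "100% accuracy". The sketch below is illustrative only — it uses `zlib` as a stand-in for DFloat11's actual entropy coder (which this model card does not describe) — but it demonstrates the key property being claimed: a lossless codec restores the weight bytes bit-for-bit, so model outputs cannot change.

```python
import struct
import zlib

# Illustrative sketch, NOT the DFloat11 format: any lossless codec
# (here zlib) shows the property decompress(compress(x)) == x exactly.

def compress_weights(raw: bytes) -> bytes:
    """Losslessly compress a buffer of raw BF16 weight bytes."""
    return zlib.compress(raw, level=9)

def decompress_weights(blob: bytes) -> bytes:
    """Recover the original BF16 bytes, bit-for-bit."""
    return zlib.decompress(blob)

# Synthetic "BF16 tensor": real weight tensors concentrate on a few
# exponent values, which is the redundancy a frequency-based
# (Huffman-style) code exploits.
weights = struct.pack(
    "<8H",
    0x3F80, 0x3F00, 0xBF80, 0x3E80,
    0x3F80, 0x3F00, 0x3F80, 0x3E80,
) * 1024

blob = decompressed = None
blob = compress_weights(weights)
restored = decompress_weights(blob)

assert restored == weights  # bit-for-bit identical after the round trip
print(f"compressed to {len(blob) / len(weights):.2%} of original size")
```

On this highly repetitive synthetic data zlib compresses far below DFloat11's ~70%; the point is only the exact round trip, not the ratio.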