sandman4/Qwen3-32B-GPTQ-4bit


sandman4 committed 13 days ago

Commit c7a8f32 · verified · 1 parent: 8975081

Update README.md

Files changed (1)

README.md +0 -10

README.md CHANGED

@@ -31,16 +31,6 @@ model.quantize(calibration_dataset, batch_size=2)
 model.save(quant_path)
 ```
-## Running with VLLM
-```bash
-python -m vllm.entrypoints.openai.api_server \
-    --model /path/to/quantized_model \
-    --quantization gptq \
-    --dtype half \
-    --max-model-len 8192
-```
 ## License
 See LICENSE.txt
