# Qwen3-32B Quantized Model

4-bit quantized version of Qwen/Qwen3-32B, produced with the `gptqmodel` library.

## Quantization
```python
from datasets import load_dataset
from gptqmodel import GPTQModel, QuantizeConfig
import sys

model_id = sys.argv[1]
quant_path = "quantized_model"

# Load calibration data (1024 samples from C4)
calibration_dataset = load_dataset(
    "allenai/c4",
    data_files="en/c4-train.00001-of-01024.json.gz",
    split="train",
).select(range(1024))["text"]

# Configure and run 4-bit quantization with a group size of 128
quant_config = QuantizeConfig(bits=4, group_size=128)
model = GPTQModel.load(model_id, quant_config)
model.quantize(calibration_dataset, batch_size=2)
model.save(quant_path)
```
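For intuition on the `bits=4, group_size=128` settings: group-wise quantization means every 128 consecutive weights share one scale factor and zero point, and each weight is stored as a 4-bit integer (0–15). The sketch below illustrates the idea with simple asymmetric round-to-nearest quantization in NumPy; it is a simplified illustration of the storage format, not the error-compensating GPTQ algorithm that `gptqmodel` actually runs.

```python
import numpy as np

def quantize_group(w, bits=4):
    """Asymmetric round-to-nearest quantization of one weight group."""
    qmax = 2**bits - 1                      # 15 levels for 4-bit
    scale = (w.max() - w.min()) / qmax      # one scale per group
    zero = np.round(-w.min() / scale)       # integer zero point
    q = np.clip(np.round(w / scale) + zero, 0, qmax)
    return q.astype(np.uint8), scale, zero

def dequantize_group(q, scale, zero):
    """Recover approximate float weights from the 4-bit codes."""
    return (q.astype(np.float32) - zero) * scale

rng = np.random.default_rng(0)
weights = rng.standard_normal(128).astype(np.float32)  # one group of 128
q, scale, zero = quantize_group(weights)
restored = dequantize_group(q, scale, zero)
err = np.abs(weights - restored).max()
print(f"max abs reconstruction error: {err:.4f}")  # bounded by scale / 2
```

At 4 bits with per-group scales, the stored weights shrink to roughly an eighth of their FP32 size (plus a small overhead of one scale and zero point per 128 weights), which is where the memory savings of this model come from.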
## License

Apache-2.0. See `LICENSE.txt`.
## Base Model

This model (sandman4/Qwen3-32B-GPTQ-4bit) is quantized from Qwen/Qwen3-32B.