---
license: apache-2.0
base_model:
- Qwen/Qwen3-32B
---

# Qwen3-32B Quantized Model

4-bit quantized version of [Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B) using gptqmodel.

## Quantization

```python
import sys

from datasets import load_dataset
from gptqmodel import GPTQModel, QuantizeConfig

model_id = sys.argv[1]
quant_path = "quantized_model"

# Load calibration data (1024 samples from C4)
calibration_dataset = load_dataset(
    "allenai/c4",
    data_files="en/c4-train.00001-of-01024.json.gz",
    split="train",
).select(range(1024))["text"]

# Configure and run quantization
quant_config = QuantizeConfig(bits=4, group_size=128)
model = GPTQModel.load(model_id, quant_config)
model.quantize(calibration_dataset, batch_size=2)
model.save(quant_path)
```

## License

Apache 2.0. See LICENSE.txt.
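## Memory Footprint

A rough back-of-envelope sketch of the size reduction from 4-bit, group-size-128 quantization relative to fp16 weights. The ~32e9 parameter count, fp16 scales, and 4-bit zero points are assumptions for illustration, not exact figures for this checkpoint:

```python
# Approximate on-disk/in-memory weight size for 4-bit GPTQ with
# group_size=128, versus an fp16 baseline. All constants below are
# illustrative assumptions, not measured values.

PARAMS = 32e9        # approximate parameter count of a 32B model
FP16_BITS = 16       # bits per weight in the fp16 baseline
QUANT_BITS = 4       # bits per quantized weight
GROUP_SIZE = 128     # weights sharing one scale/zero-point pair
SCALE_BITS = 16      # assumed fp16 scale stored per group
ZERO_BITS = 4        # assumed packed 4-bit zero point per group

# Each group adds one scale and one zero point of overhead.
bits_per_weight = QUANT_BITS + (SCALE_BITS + ZERO_BITS) / GROUP_SIZE

fp16_gb = PARAMS * FP16_BITS / 8 / 1e9
quant_gb = PARAMS * bits_per_weight / 8 / 1e9

print(f"fp16 weights:       ~{fp16_gb:.0f} GB")
print(f"4-bit g128 weights: ~{quant_gb:.0f} GB")
```

Smaller group sizes improve accuracy but raise the per-group scale/zero-point overhead, so the effective bits per weight stay slightly above 4.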