# Shuttle 3.5 Q8_0 GGUF Quant
This repo contains a GGUF quantized version of ShuttleAI's Shuttle 3.5 model, a high-performance instruction-tuned variant of Qwen 3 32B. This quant was built for efficient local inference without sacrificing quality.
## Base Model
- Original: shuttleai/shuttle-3.5
- Parent architecture: Qwen 3 32B
- Quantized by: Lex-au
- Quantization format: GGUF Q8_0
## Model Size
| Format | Size |
|---|---|
| Original (safetensors, F16) | 65.52 GB |
| Q8_0 (GGUF) | 34.8 GB |
- Size reduction: ~47% (30.72 GB saved)
- The Q8_0 file is ~53% of the original F16 size
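The Q8_0 figure can be sanity-checked from the format's storage layout: each Q8_0 block packs 32 int8 weights plus one fp16 scale, i.e. 34 bytes per 32 weights (8.5 bits per weight). A rough sketch, assuming a ~32.8B parameter count for the Qwen 3 32B base:

```python
# Rough on-disk size estimate for a Q8_0 GGUF file.
# Q8_0 layout: blocks of 32 weights, each block = 32 x int8 + 1 x fp16 scale
# = 34 bytes per 32 weights (8.5 bits per weight).
BYTES_PER_BLOCK = 34
WEIGHTS_PER_BLOCK = 32

def q8_0_size_gb(n_params: float) -> float:
    """Estimated size in decimal GB, ignoring tokenizer/metadata overhead."""
    return n_params * BYTES_PER_BLOCK / WEIGHTS_PER_BLOCK / 1e9

# Assumed parameter count (~32.8B); prints an estimate close to the
# 34.8 GB listed in the table above.
print(f"{q8_0_size_gb(32.8e9):.1f} GB")
```

The small gap between the estimate and the actual file size comes from non-weight tensors and GGUF metadata, which this sketch ignores.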
## Quality
- Q8_0 is near-lossless, preserving almost all performance of the full-precision model.
- Ideal for high-quality inference on capable consumer hardware.
## Usage
Compatible with all major GGUF-supporting runtimes, including:

- llama.cpp
- KoboldCPP
- text-generation-webui
- llamafile
- LM Studio
### Example with llama.cpp

```bash
./main -m shuttle-3.5.Q8_0.gguf --ctx-size 4096 --threads 16 --prompt "Describe the effects of quantum decoherence in plain English."
```

Note: recent llama.cpp builds name this binary `llama-cli` instead of `main`.
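Beyond the CLI, the same file can be loaded from Python via the llama-cpp-python bindings (`pip install llama-cpp-python`). A minimal sketch, assuming the GGUF file sits in the working directory; the path, context size, and generation parameters are illustrative:

```python
# Minimal local-inference sketch using llama-cpp-python (assumed installed).
# The model path is an assumption; adjust to wherever you saved the file.
from pathlib import Path

MODEL_PATH = Path("shuttle-3.5.Q8_0.gguf")

def run_completion() -> None:
    from llama_cpp import Llama  # deferred: heavy import, loads native libs

    llm = Llama(model_path=str(MODEL_PATH), n_ctx=4096, n_threads=16)
    out = llm(
        "Describe the effects of quantum decoherence in plain English.",
        max_tokens=256,
    )
    print(out["choices"][0]["text"])

if __name__ == "__main__":
    if MODEL_PATH.exists():
        run_completion()
    else:
        print(f"Model file not found: {MODEL_PATH}")
```

The existence check keeps the script from crashing when the (large) model file has not been downloaded yet.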