Shuttle 3.5 β€” Q8_0 GGUF Quant

This repo contains a GGUF quantized version of ShuttleAI's Shuttle 3.5 model, a high-performance instruction-tuned variant of Qwen 3 32B. This quant was built for efficient local inference without sacrificing quality.

πŸ”— Base Model

  • Original: shuttleai/shuttle-3.5
  • Parent architecture: Qwen 3 32B
  • Quantized by: Lex-au
  • Quantization format: GGUF Q8_0

πŸ“¦ Model Size

| Format                      | Size     |
|-----------------------------|----------|
| Original (safetensors, F16) | 65.52 GB |
| Q8_0 (GGUF)                 | 34.8 GB  |

Size Reduction: ~47% (30.72 GB saved)
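These figures follow from the Q8_0 block layout: each block of 32 weights is stored as 32 int8 values plus one 16-bit scale, i.e. 8.5 bits per weight versus 16 for F16. A quick back-of-envelope check (actual files also carry metadata and a few tensors kept at higher precision, so real sizes differ slightly):

```python
# Back-of-envelope size check for the table above.
# Q8_0: (32*8 + 16) / 32 = 8.5 bits per weight.
PARAMS = 32.8e9  # parameter count reported for this model

f16_bytes = PARAMS * 2  # 16 bits per weight
q8_0_bytes = PARAMS * 8.5 / 8  # 8.5 bits per weight

print(f"F16:  {f16_bytes / 1e9:.2f} GB")  # ~65.60 GB
print(f"Q8_0: {q8_0_bytes / 1e9:.2f} GB")  # ~34.85 GB
print(f"Reduction: {1 - q8_0_bytes / f16_bytes:.1%}")  # ~46.9%
```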

πŸ§ͺ Quality

  • Q8_0 is near-lossless, preserving almost all performance of the full-precision model.
  • Ideal for high-quality inference on capable consumer hardware.

πŸš€ Usage

Compatible with all major GGUF-supporting runtimes, including:

  • llama.cpp
  • KoboldCPP
  • text-generation-webui
  • llamafile
  • LM Studio

Example with llama.cpp:

```shell
./main -m shuttle-3.5.Q8_0.gguf --ctx-size 4096 --threads 16 --prompt "Describe the effects of quantum decoherence in plain English."
```
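Before raising `--ctx-size`, it helps to estimate the extra KV-cache memory on top of the 34.8 GB of weights. A rough sketch, assuming the commonly reported Qwen3-32B configuration (64 layers, 8 grouped-query KV heads, head dimension 128 — verify these against the base model's config) and an f16 cache:

```python
# Rough KV-cache memory estimate for choosing --ctx-size.
# Assumed Qwen3-32B config values (verify against the model card):
N_LAYERS = 64    # transformer layers
N_KV_HEADS = 8   # GQA key/value heads
HEAD_DIM = 128   # dimension per head
KV_BYTES = 2     # f16 cache, 2 bytes per value

def kv_cache_gib(ctx_len: int) -> float:
    """GiB of KV cache for one sequence of ctx_len tokens (K and V)."""
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * KV_BYTES
    return ctx_len * per_token / 2**30

print(kv_cache_gib(4096))  # 1.0 (GiB) at the example's context size
```

Under these assumptions, the 4096-token context in the example above costs about 1 GiB of cache; quadruple the context and the cache quadruples too.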
πŸ“Š Model Details

  • Parameters: 32.8B
  • Architecture: qwen3
  • Precision: 8-bit (Q8_0)