unslothai (Unsloth Backup Account)

posted an update 3 days ago

💜 Qwen3 128K Context Length: We've released Dynamic 2.0 GGUFs + 4-bit safetensors!
Fixed: the uploads now work on any inference engine, and the chat template issues are resolved.
Qwen3 GGUFs:
30B-A3B: unsloth/Qwen3-30B-A3B-GGUF
235B-A22B: unsloth/Qwen3-235B-A22B-GGUF
32B: unsloth/Qwen3-32B-GGUF

Read our guide on running Qwen3 here: https://docs.unsloth.ai/basics/qwen3-how-to-run-and-finetune

128K Context Length:
30B-A3B: unsloth/Qwen3-30B-A3B-128K-GGUF
235B-A22B: unsloth/Qwen3-235B-A22B-128K-GGUF
32B: unsloth/Qwen3-32B-128K-GGUF

All Qwen3 uploads: unsloth/qwen3-680edabfb790c8c34a242f95
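
A minimal sketch of fetching a single quant from one of the repos listed above, assuming the `huggingface_hub` package is installed. `snapshot_download` and its `allow_patterns` argument are part of that library; the quant tag `Q4_K_M` and the local directory name are illustrative assumptions, so check the repo's file list for the tags it actually ships.

```python
def quant_pattern(quant_tag: str) -> str:
    """Build a glob that matches every file whose name contains the quant tag."""
    return f"*{quant_tag}*"


def download_quant(repo_id: str, quant_tag: str, local_dir: str) -> str:
    """Download only the files of one quant type and return the local folder path."""
    # Imported lazily so quant_pattern() works even without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=[quant_pattern(quant_tag)],
        local_dir=local_dir,
    )


# Example call (needs network access; "Q4_K_M" is an assumed tag, not confirmed by the post):
# download_quant("unsloth/Qwen3-30B-A3B-GGUF", "Q4_K_M", "models/qwen3-30b-a3b")
```

Filtering by pattern avoids pulling every quant in the repo, which matters for the larger models.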

danielhanchen

posted an update 8 days ago

🦥 Introducing Unsloth Dynamic v2.0 GGUFs!
Our v2.0 quants set new benchmarks on 5-shot MMLU and KL Divergence, meaning you can now run & fine-tune quantized LLMs while preserving as much accuracy as possible.

Llama 4: unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
DeepSeek-R1: unsloth/DeepSeek-R1-GGUF-UD
Gemma 3: unsloth/gemma-3-27b-it-GGUF

We made selective layer quantization much smarter. Instead of modifying only a subset of layers, we now dynamically quantize all layers, so each layer can get its own bit width. Our dynamic method can now be applied to all LLM architectures, not just MoEs.
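
To illustrate why a per-layer scheme ends up with a fractional headline bit width, here is a minimal sketch of the arithmetic: the effective bits per weight is the parameter-weighted mean of the per-layer bit widths. The layer names, sizes and bit widths below are made up for illustration; they are not Unsloth's actual assignments, and this does not show how the method picks them.

```python
def effective_bits(layers: dict[str, tuple[int, float]]) -> float:
    """Parameter-weighted mean bit width over {name: (num_params, bits)}."""
    total_params = sum(n for n, _ in layers.values())
    total_bits = sum(n * b for n, b in layers.values())
    return total_bits / total_params


# Hypothetical assignment: small sensitive tensors keep more bits,
# the bulk of the weights gets the fewest.
example_layers = {
    "embeddings": (100_000_000, 6.0),
    "attention": (300_000_000, 4.0),
    "mlp_experts": (1_600_000_000, 2.0),
}

print(f"{effective_bits(example_layers):.2f} bits per weight")
```

Because the largest tensors dominate the average, keeping a few small layers at higher precision costs little in total size.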

Blog with Details: https://docs.unsloth.ai/basics/dynamic-v2.0

All our future GGUF uploads will use Dynamic 2.0 and our hand-curated 300K–1.5M token calibration dataset to improve conversational chat performance.

For accurate benchmarking, we built an evaluation framework to match the reported 5-shot MMLU scores of Llama 4 and Gemma 3. This allowed apples-to-apples comparisons between full-precision vs. Dynamic v2.0, QAT and standard iMatrix quants.

Dynamic v2.0 aims to minimize the performance gap between full-precision models and their quantized counterparts.
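
KL divergence in this context measures how far the quantized model's next-token distribution drifts from the full-precision one. Below is a minimal sketch in plain Python, assuming you already have the two logit vectors for one token position. The function names are illustrative, and this is not Unsloth's evaluation framework.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(ref_logits, quant_logits):
    """KL(P_ref || P_quant) in nats for a single token position."""
    p = softmax(ref_logits)
    q = softmax(quant_logits)
    # Terms with p_i == 0 contribute nothing to the sum.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

A value of zero means the two distributions are identical at that position; in practice the metric would be averaged over many tokens.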

danielhanchen

posted an update 26 days ago

You can now run Llama 4 on your own local device! 🦙
Run our Dynamic 1.78-bit and 2.71-bit Llama 4 GGUFs:
unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF

You can run them on llama.cpp and other inference engines. See our guide here: https://docs.unsloth.ai/basics/tutorial-how-to-run-and-fine-tune-llama-4

danielhanchen

posted an update about 1 month ago

You can now run DeepSeek-V3-0324 on your own local device!
Run our Dynamic 2.42-bit and 2.71-bit DeepSeek GGUFs: unsloth/DeepSeek-V3-0324-GGUF

You can run them on llama.cpp and other inference engines. See our guide here: https://docs.unsloth.ai/basics/tutorial-how-to-run-deepseek-v3-0324-locally

danielhanchen

updated a model 2 months ago

unslothai/Phi-4-mini-instruct

Updated Feb 28 • 2

danielhanchen

published a model 2 months ago

unslothai/Phi-4-mini-instruct

Updated Feb 28 • 2

danielhanchen

updated 2 models 2 months ago

unslothai/Llama-3.2-1B-Instruct

Updated Feb 25 • 2

unslothai/Llama-3.2-1B-Instruct-unsloth-bnb-4bit

Updated Feb 25 • 6

danielhanchen

published 2 models 2 months ago

unslothai/Llama-3.2-1B-Instruct-unsloth-bnb-4bit

Updated Feb 25 • 6

unslothai/Llama-3.2-1B-Instruct

Updated Feb 25 • 2

danielhanchen

posted an update 3 months ago

I uploaded DeepSeek R1 GGUFs!

unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF
unsloth/DeepSeek-R1-Distill-Llama-70B-GGUF
2-bit for MoE: unsloth/DeepSeek-R1-GGUF
unsloth/DeepSeek-R1-Zero-GGUF

More at unsloth/deepseek-r1-all-versions-678e1c48f5d2fce87892ace5

danielhanchen

posted an update 4 months ago

We fixed many bugs in Phi-4 & uploaded fixed GGUF + 4-bit versions! ✨

Our fixed versions are even higher on the Open LLM Leaderboard than Microsoft's!

GGUFs: unsloth/phi-4-GGUF
Dynamic 4-bit: unsloth/phi-4-unsloth-bnb-4bit

You can also now finetune Phi-4 for free on Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Phi_4-Conversational.ipynb

Read our blogpost for more details on bug fixes etc: https://unsloth.ai/blog/phi4

danielhanchen

posted an update 4 months ago

DeepSeek V3, including GGUF + bf16 versions, is now uploaded!

Includes 2, 3, 4, 5, 6 and 8-bit quantized versions.

GGUFs: unsloth/DeepSeek-V3-GGUF
bf16: unsloth/DeepSeek-V3-bf16

Min. hardware requirements to run: 48GB RAM + 250GB of disk space for 2-bit.
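
As a rough back-of-the-envelope check on figures like these, a lower bound on file size is the parameter count times bits per weight, divided by 8. The sketch assumes DeepSeek V3's roughly 671B total parameters, a number taken from DeepSeek's model card rather than from this post. Real GGUFs come out larger because some tensors are kept at higher precision and quant formats store scales alongside the weights.

```python
def min_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Lower bound on weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: total parameter count from the model card, not stated in the post.
DEEPSEEK_V3_PARAMS = 671e9

for bits in (2, 3, 4, 5, 6, 8):
    print(f"{bits}-bit: at least {min_size_gb(DEEPSEEK_V3_PARAMS, bits):.0f} GB")
```

The 2-bit lower bound lands well under the 250GB of disk space quoted above, which is consistent with the extra overhead in the real files.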

See how to run them with examples and the full collection: unsloth/deepseek-v3-all-versions-677cf5cfd7df8b7815fc723c

danielhanchen

posted an update 5 months ago

I uploaded GGUFs, 4-bit bitsandbytes and full 16-bit precision weights for Llama 3.3 70B Instruct. They are all here: unsloth/llama-33-all-versions-67535d7d994794b9d7cf5e9f

You can also finetune Llama 3.3 70B in under 48GB of VRAM with Unsloth!
GGUFs: unsloth/Llama-3.3-70B-Instruct-GGUF
BnB 4bit: unsloth/Llama-3.3-70B-Instruct-bnb-4bit
16bit: unsloth/Llama-3.3-70B-Instruct

danielhanchen

posted an update 5 months ago

Vision finetuning is in 🦥Unsloth! You can now finetune Llama 3.2, Qwen2 VL, Pixtral and all Llava variants up to 2x faster and with up to 70% less VRAM usage! Colab to finetune Llama 3.2: https://colab.research.google.com/drive/1j0N4XTY1zXXy7mPAhOC1_gMYZ2F2EBlk?usp=sharing

danielhanchen

updated 5 models 10 months ago

unslothai/9

unslothai/8

unslothai/7

unslothai/6

unslothai/5

Team members 1
