Model creator: Qwen
Original model: Qwen2.5-7B-Instruct-1M
Prompt format:

```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
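For reference, here is a minimal Python sketch that assembles this ChatML-style prompt by hand; the `build_prompt` helper and its default system prompt are illustrative, not part of this card or any official API.

```python
# Minimal sketch: assemble the ChatML-style prompt shown above.
# build_prompt and its default system prompt are illustrative assumptions.
def build_prompt(prompt: str, system_prompt: str = "You are a helpful assistant.") -> str:
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

print(build_prompt("Give me a one-line summary of GGUF."))
```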
Filename | Quant type | File Size | Split | Description |
---|---|---|---|---|
Qwen2.5-7B-Instruct-1M-F32.gguf | f32 | 30.5 GB | false | Full F32 weights. |
Qwen2.5-7B-Instruct-1M-F16.gguf | f16 | 15.24 GB | false | Full F16 weights. |
Qwen2.5-7B-Instruct-1M-Q8_0.gguf | Q8_0 | 8.10 GB | false | Extremely high quality, generally unneeded but max available quant. |
Qwen2.5-7B-Instruct-1M-Q6_K.gguf | Q6_K | 6.25 GB | false | Very high quality, near perfect, recommended. |
Qwen2.5-7B-Instruct-1M-Q5_K_M.gguf | Q5_K_M | 5.44 GB | false | High quality, recommended. |
Qwen2.5-7B-Instruct-1M-Q5_K_S.gguf | Q5_K_S | 5.32 GB | false | High quality, recommended. |
Qwen2.5-7B-Instruct-1M-Q4_1.gguf | Q4_1 | 4.87 GB | false | Legacy format, similar performance to Q4_K_S but with improved tokens/watt on Apple silicon. |
Qwen2.5-7B-Instruct-1M-Q4_K_M.gguf | Q4_K_M | 4.68 GB | false | Good quality, default size for most use cases, recommended. |
Qwen2.5-7B-Instruct-1M-Q4_K_S.gguf | Q4_K_S | 4.46 GB | false | Slightly lower quality with more space savings, recommended. |
Qwen2.5-7B-Instruct-1M-Q4_0.gguf | Q4_0 | 4.43 GB | false | Legacy format, offers online repacking for ARM and AVX CPU inference. |
Qwen2.5-7B-Instruct-1M-Q3_K_L.gguf | Q3_K_L | 4.09 GB | false | Lower quality but usable, good for low RAM availability. |
Qwen2.5-7B-Instruct-1M-Q3_K_M.gguf | Q3_K_M | 3.81 GB | false | Low quality. |
Qwen2.5-7B-Instruct-1M-Q3_K_S.gguf | Q3_K_S | 3.49 GB | false | Low quality, not recommended. |
Qwen2.5-7B-Instruct-1M-Q2_K.gguf | Q2_K | 3.02 GB | false | Very low quality but surprisingly usable. |
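As a rule of thumb, pick the largest quant whose file size leaves some headroom below your available RAM or VRAM for the KV cache and runtime buffers. A minimal sketch using the sizes from the table above; the helper name and the 1.5 GB headroom figure are illustrative assumptions, not guidance from the card.

```python
# Sketch: pick the largest quant from the table above that fits in memory.
# The 1.5 GB headroom figure and the helper name are illustrative assumptions.
QUANTS = [  # (quant type, file size in GB), sorted largest-first, from the table
    ("Q8_0", 8.10), ("Q6_K", 6.25), ("Q5_K_M", 5.44), ("Q5_K_S", 5.32),
    ("Q4_1", 4.87), ("Q4_K_M", 4.68), ("Q4_K_S", 4.46), ("Q4_0", 4.43),
    ("Q3_K_L", 4.09), ("Q3_K_M", 3.81), ("Q3_K_S", 3.49), ("Q2_K", 3.02),
]

def pick_quant(available_gb: float, headroom_gb: float = 1.5) -> str | None:
    """Return the largest quant that fits with headroom left over."""
    for name, size_gb in QUANTS:
        if size_gb + headroom_gb <= available_gb:
            return name
    return None

print(pick_quant(8.0))  # -> "Q6_K" on a machine with 8 GB free
```

The headroom you actually need grows with context length, since a longer context means a larger KV cache.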
Notes on the original model:

- Supports a context length of up to 1M tokens.
- Significantly improved performance on long-context tasks while maintaining capability on short tasks.
- Accuracy degradation may occur for sequences exceeding 262,144 tokens until improved support is added.

For more information, see the Qwen team's blog post.
First, make sure you have huggingface-cli installed:

```
pip install -U "huggingface_hub[cli]"
```

Then, you can target the specific file you want:

```
huggingface-cli download BabaK07/Qwen2.5-7b-Instruct-1M-Q4_K_M-gguf --include "Qwen2.5-7b-Instruct-1M-Q4_K_M.gguf" --local-dir ./
```
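Alternatively, a minimal Python sketch that does the same download via huggingface_hub and then loads the file with llama-cpp-python; using llama-cpp-python is an assumption on my part, as the card itself only documents the CLI download.

```python
# Sketch: download one quant with huggingface_hub, then run it with
# llama-cpp-python (an assumption; not part of this card).
# pip install -U "huggingface_hub[cli]" llama-cpp-python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="BabaK07/Qwen2.5-7b-Instruct-1M-Q4_K_M-gguf",
    filename="Qwen2.5-7b-Instruct-1M-Q4_K_M.gguf",
    local_dir="./",
)

llm = Llama(model_path=model_path, n_ctx=8192)  # raise n_ctx for long-context work
out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what GGUF quantization is."},
    ],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```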
Special thanks to Georgi Gerganov and the whole team working on llama.cpp for making all of this possible.