Qwen2.5 7B Instruct 1M by Qwen

Model creator: Qwen
Original model: Qwen2.5-7B-Instruct-1M

Prompt format

<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
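
If your runtime does not apply the chat template automatically from the GGUF metadata, you can build the prompt yourself. A minimal sketch in Python (the function name and example strings are illustrative, not part of the model card):

def build_prompt(system_prompt: str, prompt: str) -> str:
    # ChatML-style template used by Qwen2.5 Instruct models.
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(build_prompt("You are a helpful assistant.", "Say hello."))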

Download a file (not the whole branch) from below:

Filename Quant type File Size Split Description
Qwen2.5-7B-Instruct-1M-F32.gguf f32 30.5 GB false Full F32 weights.
Qwen2.5-7B-Instruct-1M-F16.gguf f16 15.24 GB false Full F16 weights.
Qwen2.5-7B-Instruct-1M-Q8_0.gguf Q8_0 8.10 GB false Extremely high quality, generally unneeded but max available quant.
Qwen2.5-7B-Instruct-1M-Q6_K.gguf Q6_K 6.25 GB false Very high quality, near perfect, recommended.
Qwen2.5-7B-Instruct-1M-Q5_K_M.gguf Q5_K_M 5.44 GB false High quality, recommended.
Qwen2.5-7B-Instruct-1M-Q5_K_S.gguf Q5_K_S 5.32 GB false High quality, recommended.
Qwen2.5-7B-Instruct-1M-Q4_1.gguf Q4_1 4.87 GB false Legacy format, similar performance to Q4_K_S but with improved tokens/watt on Apple silicon.
Qwen2.5-7B-Instruct-1M-Q4_K_M.gguf Q4_K_M 4.68 GB false Good quality, default size for most use cases, recommended.
Qwen2.5-7B-Instruct-1M-Q4_K_S.gguf Q4_K_S 4.46 GB false Slightly lower quality with more space savings, recommended.
Qwen2.5-7B-Instruct-1M-Q4_0.gguf Q4_0 4.43 GB false Legacy format, offers online repacking for ARM and AVX CPU inference.
Qwen2.5-7B-Instruct-1M-Q3_K_L.gguf Q3_K_L 4.09 GB false Lower quality but usable, good for low RAM availability.
Qwen2.5-7B-Instruct-1M-Q3_K_M.gguf Q3_K_M 3.81 GB false Low quality.
Qwen2.5-7B-Instruct-1M-Q3_K_S.gguf Q3_K_S 3.49 GB false Low quality, not recommended.
Qwen2.5-7B-Instruct-1M-Q2_K.gguf Q2_K 3.02 GB false Very low quality but surprisingly usable.
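
As a rough rule of thumb (an assumption on our part, not an official guideline): pick the largest quant whose file size leaves 1-2 GB of headroom below your GPU VRAM (or system RAM for CPU inference) for the KV cache and runtime overhead. A small Python sketch of that heuristic, using the sizes from the table above:

# File sizes in GB, taken from the table above.
QUANTS = {"Q8_0": 8.10, "Q6_K": 6.25, "Q5_K_M": 5.44, "Q5_K_S": 5.32,
          "Q4_K_M": 4.68, "Q4_K_S": 4.46, "Q3_K_L": 4.09, "Q2_K": 3.02}

def pick_quant(available_gb: float, headroom_gb: float = 1.5) -> str:
    # Keep only the quants that fit with headroom, then take the largest.
    fits = {q: s for q, s in QUANTS.items() if s + headroom_gb <= available_gb}
    return max(fits, key=fits.get) if fits else "none (consider a smaller model)"

print(pick_quant(8.0))  # e.g. an 8 GB card -> Q6_K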

Technical Details

Supports a context length of up to 1M tokens.

Offers significantly improved performance on long-context tasks while maintaining its capability on short tasks.

Accuracy degradation may occur for sequences exceeding 262,144 tokens until improved support is added.
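
If you want to stay within the well-supported range, cap the context window at load time. A sketch using the llama-cpp-python bindings (one possible runtime among several; the file name and context size are examples):

from llama_cpp import Llama  # pip install llama-cpp-python

# Load the Q4_K_M quant with a 32k-token context window; raise n_ctx as
# needed, keeping in mind that KV-cache memory grows with context length.
llm = Llama(model_path="Qwen2.5-7B-Instruct-1M-Q4_K_M.gguf", n_ctx=32768)

out = llm.create_chat_completion(messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(out["choices"][0]["message"]["content"])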

For more information, see Qwen's blog post on Qwen2.5-1M.

Downloading using huggingface-cli


First, make sure you have huggingface-cli installed:

pip install -U "huggingface_hub[cli]"

Then, you can target the specific file you want:

huggingface-cli download BabaK07/Qwen2.5-7B-Instruct-1M-GGUF --include "Qwen2.5-7B-Instruct-1M-Q4_K_M.gguf" --local-dir ./
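
Alternatively, the same download can be scripted with the huggingface_hub Python API (the target directory is an example):

from huggingface_hub import hf_hub_download  # pip install -U huggingface_hub

# Fetch a single quant file rather than cloning the whole repository.
path = hf_hub_download(
    repo_id="BabaK07/Qwen2.5-7B-Instruct-1M-GGUF",
    filename="Qwen2.5-7B-Instruct-1M-Q4_K_M.gguf",
    local_dir=".",
)
print(path)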

Special thanks

๐Ÿ™ Special thanks to Georgi Gerganov and the whole team working on llama.cpp for making all of this possible.

Model details

Architecture: qwen2
Parameters: 7.62B
Format: GGUF
Base model: Qwen/Qwen2.5-7B