metadata

license: apache-2.0
base_model:
  - Qwen/Qwen2.5-0.5B-Instruct
datasets:
  - agentlans/common-crawl-sample
  - bigcode/the-stack-smol-xl
  - open-thoughts/OpenThoughts-Unverified-173k
  - cognitivecomputations/dolphin-r1
tags:
  - draft
  - speculative-decoding
language:
  - zho
  - eng
  - fra
  - spa
  - por
  - deu
  - ita
  - rus
  - jpn
  - kor
  - vie
  - tha
  - ara

A 0.5B-parameter draft model for speculative decoding with deepseek-ai/DeepSeek-V3-0324.

See jukofyork/DeepSeek-V3-0324-DRAFT-0.5B-v1.0 for the non-GGUF version and a detailed explanation of how the model was created.
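
As a rough usage sketch, assuming the GGUF files are served with llama.cpp's `llama-server`, the helper below assembles a command line that pairs a target model with one of the draft files listed on this page. The target filename and the `build_server_command` helper are hypothetical, and the `--draft-max` / `--draft-min` values are placeholders rather than tuned recommendations; check `llama-server --help` on your build, since option names have changed between llama.cpp versions.

```python
import shlex


def build_server_command(target_gguf, draft_gguf, draft_max=16, draft_min=1,
                         server_bin="llama-server"):
    """Build an argv list that starts llama-server with a draft model.

    Assumes a llama.cpp build that accepts --model-draft, --draft-max and
    --draft-min; all paths passed in are the caller's responsibility.
    """
    return [
        server_bin,
        "--model", str(target_gguf),        # large target model (verifies tokens)
        "--model-draft", str(draft_gguf),   # small draft model (proposes tokens)
        "--draft-max", str(draft_max),      # upper bound on tokens drafted per step
        "--draft-min", str(draft_min),      # lower bound on tokens drafted per step
    ]


if __name__ == "__main__":
    cmd = build_server_command(
        "DeepSeek-V3-0324-Q4_K_M.gguf",            # hypothetical target filename
        "DeepSeek-V3-0324-DRAFT-0.5B-Q4_0.gguf",   # draft file from the table on this page
    )
    # Print a copy-pastable shell command instead of launching the server.
    print(shlex.join(cmd))
```

The list can be passed to `subprocess.run` as-is; returning an argv list rather than a single string avoids quoting problems with paths that contain spaces.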

Without `imatrix`

| Link | Type |
| --- | --- |
| DeepSeek-V3-0324-DRAFT-0.5B-BF16.gguf | BF16 |
| DeepSeek-V3-0324-DRAFT-0.5B-F16.gguf | F16 |
| DeepSeek-V3-0324-DRAFT-0.5B-Q8_0.gguf | Q8_0 |
| DeepSeek-V3-0324-DRAFT-0.5B-Q6_K.gguf | Q6_K |
| DeepSeek-V3-0324-DRAFT-0.5B-Q5_K_M.gguf | Q5_K_M |
| DeepSeek-V3-0324-DRAFT-0.5B-Q5_K_S.gguf | Q5_K_S |
| DeepSeek-V3-0324-DRAFT-0.5B-Q4_K_M.gguf | Q4_K_M |
| DeepSeek-V3-0324-DRAFT-0.5B-Q4_K_S.gguf | Q4_K_S |
| DeepSeek-V3-0324-DRAFT-0.5B-IQ4_NL.gguf | IQ4_NL |
| DeepSeek-V3-0324-DRAFT-0.5B-IQ4_XS.gguf | IQ4_XS |
| DeepSeek-V3-0324-DRAFT-0.5B-Q5_1.gguf | Q5_1 |
| DeepSeek-V3-0324-DRAFT-0.5B-Q5_0.gguf | Q5_0 |
| DeepSeek-V3-0324-DRAFT-0.5B-Q4_1.gguf | Q4_1 |
| DeepSeek-V3-0324-DRAFT-0.5B-Q4_0.gguf | Q4_0 |

With `imatrix`

| Link | Type |
| --- | --- |
| DeepSeek-V3-0324-DRAFT-0.5B-iQ6_K.gguf | Q6_K |
| DeepSeek-V3-0324-DRAFT-0.5B-iQ5_K_M.gguf | Q5_K_M |
| DeepSeek-V3-0324-DRAFT-0.5B-iQ5_K_S.gguf | Q5_K_S |
| DeepSeek-V3-0324-DRAFT-0.5B-iQ4_K_M.gguf | Q4_K_M |
| DeepSeek-V3-0324-DRAFT-0.5B-iQ4_K_S.gguf | Q4_K_S |
| DeepSeek-V3-0324-DRAFT-0.5B-iIQ4_NL.gguf | IQ4_NL |
| DeepSeek-V3-0324-DRAFT-0.5B-iIQ4_XS.gguf | IQ4_XS |
| DeepSeek-V3-0324-DRAFT-0.5B-iQ5_1.gguf | Q5_1 |
| DeepSeek-V3-0324-DRAFT-0.5B-iQ5_0.gguf | Q5_0 |
| DeepSeek-V3-0324-DRAFT-0.5B-iQ4_1.gguf | Q4_1 |
| DeepSeek-V3-0324-DRAFT-0.5B-iQ4_0.gguf | Q4_0 |

See DeepSeek-R1-DRAFT-0.5B-v1.0-GGUF for detailed PPL statistics and recommendations on which quant to use.

I have included the imatrix file used to generate the Q4_0 through Q6_K quants, along with the 1 MB sample of the fine-tuning data used to create it.
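
For anyone who wants to regenerate a quant from the included imatrix file, here is a minimal sketch that builds a `llama-quantize` command line. It assumes llama.cpp's `llama-quantize` tool and its `--imatrix` option; the `imatrix.dat` filename and the `build_quantize_command` helper are hypothetical, so substitute the actual imatrix filename from this repository. This is not necessarily the exact procedure used to produce the files above.

```python
import shlex


def build_quantize_command(src_gguf, dst_gguf, quant_type, imatrix_path=None,
                           quantize_bin="llama-quantize"):
    """Build an argv list for llama-quantize, optionally using an imatrix.

    Argument order follows the tool's usage line:
    llama-quantize [--imatrix FILE] input.gguf output.gguf TYPE
    """
    cmd = [quantize_bin]
    if imatrix_path is not None:
        # Importance matrix guiding which weights keep more precision.
        cmd += ["--imatrix", str(imatrix_path)]
    cmd += [str(src_gguf), str(dst_gguf), quant_type]
    return cmd


if __name__ == "__main__":
    cmd = build_quantize_command(
        "DeepSeek-V3-0324-DRAFT-0.5B-BF16.gguf",      # unquantized source from the first table
        "DeepSeek-V3-0324-DRAFT-0.5B-iQ4_K_M.gguf",   # output name following the "i" prefix convention
        "Q4_K_M",
        imatrix_path="imatrix.dat",                   # hypothetical filename
    )
    print(shlex.join(cmd))
```

Leaving `imatrix_path` as `None` produces the plain command for the quants in the first table.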
