# Melvin56/UIGEN-T2-7B-GGUF

Original model: Tesslate/UIGEN-T2-7B

Llama.cpp build: 5219 (7d3af70b)

All of these quants were created with an importance matrix (imatrix) computed from this Dataset.
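For reference, the imatrix workflow above can be sketched with llama.cpp's own tools. This is a minimal sketch, not the author's exact commands: the file names (`UIGEN-T2-7B-F16.gguf`, `calibration.txt`, `imatrix.dat`) and the chosen quant type are assumptions.

```shell
# 1. Compute an importance matrix from a calibration text file
#    (file names here are placeholders, not the author's actual files).
./llama-imatrix -m UIGEN-T2-7B-F16.gguf -f calibration.txt -o imatrix.dat

# 2. Quantize using that imatrix; Q4_K_M is just one example target type.
./llama-quantize --imatrix imatrix.dat \
    UIGEN-T2-7B-F16.gguf UIGEN-T2-7B-Q4_K_M.gguf Q4_K_M
```

The imatrix records which weights matter most on the calibration data, which lets the quantizer spend its limited precision where it has the largest effect on output quality.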


|          | CPU (AVX2) | CPU (ARM NEON) | Metal | cuBLAS | rocBLAS | SYCL | CLBlast | Vulkan | Kompute |
|----------|------------|----------------|-------|--------|---------|------|---------|--------|---------|
| K-quants | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ 🐢⁵ | ✅ 🐢⁵ | ❌ |
| I-quants | ✅ 🐢⁴ | ✅ 🐢⁴ | ✅ 🐢⁴ | ✅ | ✅ | Partial¹ | ❌ | ❌ | ❌ |

- ✅: feature works
- 🚫: feature does not work
- ❓: unknown, please contribute if you can test it yourself
- 🐢: feature is slow
- ¹: IQ3_S and IQ1_S, see #5886
- ²: Only with `-ngl 0`
- ³: Inference is 50% slower
- ⁴: Slower than K-quants of comparable size
- ⁵: Slower than cuBLAS/rocBLAS on similar cards
- ⁶: Only q8_0 and iq4_nl
GGUF details:

- Model size: 7.62B params
- Architecture: qwen2

Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, and 16-bit.


Model tree for Melvin56/UIGEN-T2-7B-GGUF:

- Base model: Qwen/Qwen2.5-7B
- Quantized: this model