# Model Card for shavera/starcoder2-15b-w4-autoawq-gemm

This is an int4 AWQ-quantized checkpoint of bigcode/starcoder2-15b. It requires about 10 GB of VRAM.
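To load the checkpoint directly instead of serving it, here is a minimal sketch using transformers, assuming the autoawq package is installed alongside it and a GPU with roughly 10 GB free:

```python
# Minimal sketch of loading the AWQ checkpoint directly.
# Assumes `pip install transformers autoawq` and a CUDA GPU with ~10 GB free.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shavera/starcoder2-15b-w4-autoawq-gemm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```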

## Running this Model

Run via Docker with text-generation-inference:

```shell
docker run --gpus all --shm-size 64g -p 8080:80 -v ~/.cache/huggingface:/data \
    ghcr.io/huggingface/text-generation-inference:3.1.0 \
    --model-id shavera/starcoder2-15b-w4-autoawq-gemm
```
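Once the container is up, the server listens on port 8080 and exposes TGI's standard `/generate` REST endpoint. A minimal client sketch (requires `pip install requests`):

```python
# Minimal sketch of a client call against the container started above.
# Uses TGI's /generate REST endpoint on localhost:8080.
import requests

resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "def quicksort(arr):",
        "parameters": {"max_new_tokens": 128, "temperature": 0.2},
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["generated_text"])
```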