hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4

Text Generation

text-generation-inference

4-bit precision

`probability tensor contains either inf, nan or element < 0` when using multiple GPUs

#7 opened 5 months ago by

Is this one fit for vLLM deployment?

#4 opened 9 months ago by

8-bit version

#3 opened 10 months ago by