Any chance your team is working on a 4-bit Llama-3.2-90B-Vision-Instruct-quantized.w4a16 version?

#1
by mrhendrey - opened

Love the work that you do. Hoping you are going to put out some of the 4-bit quantized versions in the near future. Thank you!

Red Hat AI org

Appreciate it! Yes, we are working on enabling quantization flows with calibration for VLMs.
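
For anyone who wants to experiment before an official release, here is a rough sketch of what a W4A16 flow could look like with llm-compressor (the library these quantized repos are generally produced with). The vision-tower module patterns, calibration dataset, and sample count below are illustrative assumptions, not the official recipe:

```python
# Minimal sketch (not the official Red Hat recipe): one-shot W4A16
# quantization with GPTQ calibration via llm-compressor.
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.transformers import oneshot  # newer versions: from llmcompressor import oneshot

MODEL_ID = "meta-llama/Llama-3.2-90B-Vision-Instruct"

# Quantize only Linear layers to 4-bit weights / 16-bit activations.
# Keep lm_head and the vision modules (patterns assumed) at full precision.
recipe = GPTQModifier(
    scheme="W4A16",
    targets="Linear",
    ignore=["lm_head", "re:vision_model.*", "re:multi_modal_projector.*"],
)

# Text-only calibration here for simplicity; a production VLM flow would
# calibrate on image-text pairs instead.
oneshot(
    model=MODEL_ID,
    dataset="open_platypus",  # assumed calibration set
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
    output_dir="Llama-3.2-90B-Vision-Instruct-quantized.w4a16",
)
```

Note that just loading the 90B model for calibration needs serious GPU memory, so this is more practical for the 11B variant unless you have a multi-GPU node.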

Have there been any updates on this? I noticed that you have an 11B model with w4a16, but not a 90B model.
