RuntimeError: b_zeros dim 1 = 2560 is not size_n / pack_factor = 320

#1
by antoniogr7 - opened

if self._has_torchbind_op_overload and _must_dispatch_in_python(args, kwargs):
1122 return _call_overload_packet_from_python(self, args, kwargs)
-> 1123 return self._op(*args, **(kwargs or {}))
1124
1125 # TODO: use this to make a __dir__

RuntimeError: b_zeros dim 1 = 2560 is not size_n / pack_factor = 320

I wasn't able to make it work with vLLM.

Any suggestions?

Mobius Labs GmbH org

Yeah, the Marlin backend breaks with this model. You can use GemLite instead:

pip install git+https://github.com/mobiusml/hqq/
pip install git+https://github.com/mobiusml/gemlite/

from hqq.utils.vllm import set_vllm_hqq_backend, VLLM_HQQ_BACKEND
set_vllm_hqq_backend(backend=VLLM_HQQ_BACKEND.GEMLITE)
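As a side note on the error message itself: the numbers are consistent with 4-bit values packed into 32-bit words, which gives a pack factor of 8 and 2560 / 8 = 320 expected columns, while the zeros tensor that was passed still has 2560. That reading is an assumption based only on the figures in the message, and the helper names below are hypothetical and not part of vLLM or HQQ. A rough sketch of the check:

```python
def expected_packed_cols(size_n: int, num_bits: int, word_bits: int = 32) -> int:
    """Columns expected once `num_bits`-wide values are packed into `word_bits`-wide words.

    Hypothetical helper for illustration; not a vLLM or HQQ function.
    """
    pack_factor = word_bits // num_bits  # e.g. 32 // 4 = 8 values per word
    if size_n % pack_factor != 0:
        raise ValueError(f"size_n={size_n} is not divisible by pack_factor={pack_factor}")
    return size_n // pack_factor


def zeros_shape_matches(zeros_cols: int, size_n: int, num_bits: int) -> bool:
    """True if the zeros tensor's dim 1 equals size_n / pack_factor."""
    return zeros_cols == expected_packed_cols(size_n, num_bits)


# Figures taken from the error message; 4-bit packing is an assumption.
size_n = 2560
print(expected_packed_cols(size_n, num_bits=4))       # expected packed width
print(zeros_shape_matches(2560, size_n, num_bits=4))  # the shape that was rejected
```

If that is what is going on, the zeros are laid out in a way the Marlin kernel does not accept, which fits with switching backends making the error go away.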

Thanks for the very quick reply, it works now

antoniogr7 changed discussion status to closed