RuntimeError: b_zeros dim 1 = 2560 is not size_n / pack_factor = 320
#1 by antoniogr7 - opened
if self._has_torchbind_op_overload and _must_dispatch_in_python(args, kwargs):
1122 return _call_overload_packet_from_python(self, args, kwargs)
-> 1123 return self._op(*args, **(kwargs or {}))
1124
1125 # TODO: use this to make a dir
RuntimeError: b_zeros dim 1 = 2560 is not size_n / pack_factor = 320
I wasn't able to make it work with vLLM.
Any suggestions?
Yeah, Marlin breaks with this model. You can use GemLite instead:
pip install git+https://github.com/mobiusml/hqq/;
pip install git+https://github.com/mobiusml/gemlite/;
from hqq.utils.vllm import set_vllm_hqq_backend, VLLM_HQQ_BACKEND
set_vllm_hqq_backend(backend=VLLM_HQQ_BACKEND.GEMLITE)
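For context, the error message points to a shape check on the packed zero-points tensor. The snippet below is only a rough sketch of that arithmetic, not vLLM's actual code. It assumes 4-bit weights packed into 32-bit words (so pack_factor = 8) and an output dimension size_n = 2560, which is consistent with the numbers in the error but is not confirmed anywhere in this thread. The function name check_zeros_shape is hypothetical.

```python
def check_zeros_shape(zeros_dim1: int, size_n: int, num_bits: int = 4) -> None:
    """Mimic the shape check reported in the error (illustrative only)."""
    # Assumption: quantized values are packed into 32-bit words,
    # so 4-bit quantization gives pack_factor = 32 // 4 = 8.
    pack_factor = 32 // num_bits
    expected = size_n // pack_factor
    if zeros_dim1 != expected:
        raise RuntimeError(
            f"b_zeros dim 1 = {zeros_dim1} is not size_n / pack_factor = {expected}"
        )


# Packed zero-points (2560 / 8 = 320 columns) pass the check.
check_zeros_shape(zeros_dim1=320, size_n=2560)

# Zero-points with one column per output channel trigger the reported error.
try:
    check_zeros_shape(zeros_dim1=2560, size_n=2560)
except RuntimeError as err:
    message = str(err)
    print(message)
```

Under these assumptions, the zero-points arrive with one column per output channel while the kernel expects them packed, which would explain the mismatch. Switching the backend to GemLite avoids that code path.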
Thanks for the very quick reply, it works now.
antoniogr7 changed discussion status to closed