Remove vLLM FP8 Limitation
#2 opened by simon-mo
This has been fixed as of the latest v0.8.5 release.
ERROR 04-29 09:46:24 [core.py:396] ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")
I got this when running it on an A100. Does it not use the Marlin kernels by default?
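For context on the error above, here is a minimal sketch of a hardware check. It assumes that native FP8 E4M3 (the `fp8e4nv` type in the message) needs CUDA compute capability 8.9 or newer, which would explain why an A100 (compute capability 8.0) is rejected. The helper name `supports_native_fp8_e4m3` and the `KNOWN_GPUS` table are hypothetical, for illustration only, and are not part of vLLM or Triton.

```python
# Hypothetical helper, not a vLLM or Triton API.
# Assumption: native FP8 E4M3 ("fp8e4nv") requires compute capability >= 8.9
# (Ada and Hopper class GPUs); older parts such as the A100 (8.0) lack it.

def supports_native_fp8_e4m3(major: int, minor: int) -> bool:
    """Return True if the (major, minor) compute capability is at least 8.9."""
    return (major, minor) >= (8, 9)


# Compute capabilities for a few common GPUs (illustrative table).
KNOWN_GPUS = {
    "A100": (8, 0),
    "L4": (8, 9),
    "H100": (9, 0),
}


def describe(name: str) -> str:
    """Build a one-line summary for a GPU listed in KNOWN_GPUS."""
    major, minor = KNOWN_GPUS[name]
    status = "native FP8 E4M3" if supports_native_fp8_e4m3(major, minor) else "no native FP8 E4M3"
    return f"{name} (sm_{major}{minor}): {status}"


if __name__ == "__main__":
    for gpu in KNOWN_GPUS:
        print(describe(gpu))
```

On a machine with PyTorch installed, the tuple returned by `torch.cuda.get_device_capability()` can be passed to the helper in place of the table lookup.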
jklj077 changed pull request status to merged