Remove vLLM FP8 Limitation

#2
by simon-mo - opened
Qwen org

This has been fixed as of the latest v0.8.5 release 🙇

ERROR 04-29 09:46:24 [core.py:396] ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")

I got this when running it on an A100. Does it not use the Marlin kernels by default?
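For context on the error above, here is a minimal sketch of the kind of capability check that produces it. It assumes the native `fp8e4nv` (E4M3) type requires CUDA compute capability 8.9 or newer (Ada/Hopper), while the A100 reports 8.0. The helper name `supports_native_fp8_e4m3` and the constant are hypothetical and are not part of vLLM or Triton.

```python
# Hypothetical helper for illustration; not a vLLM or Triton API.
# Assumption: native fp8e4nv (E4M3) needs compute capability >= 8.9.
FP8_E4M3_MIN_CAPABILITY = (8, 9)


def supports_native_fp8_e4m3(capability):
    """Return True if a (major, minor) compute capability is at least 8.9."""
    # Tuples compare element by element, so (9, 0) >= (8, 9) is True.
    return tuple(capability) >= FP8_E4M3_MIN_CAPABILITY


if __name__ == "__main__":
    # Known compute capabilities: A100 is 8.0, L4 is 8.9, H100 is 9.0.
    for name, cap in [("A100", (8, 0)), ("L4", (8, 9)), ("H100", (9, 0))]:
        print(name, cap, supports_native_fp8_e4m3(cap))
```

On a machine with PyTorch and a CUDA device, the tuple returned by `torch.cuda.get_device_capability()` could be passed in directly. On GPUs below this threshold, FP8 checkpoints would need a weight-only fallback path such as the Marlin kernels mentioned above, rather than native FP8 compute.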

jklj077 changed pull request status to merged