Is this a QAT model?
#2 · opened by Downtown-Case
Is this model (and the other Qwen FP8 models in this series) natively trained in FP8, or enhanced with quantization-aware training?
In other words, would we get the same results by locally converting the bfloat16 model to FP8? Or is this release more optimized somehow?
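For context on the distinction being asked about: a plain local conversion (post-training quantization) just rounds each bfloat16 weight to the nearest FP8-representable value, whereas QAT would simulate that rounding during training so the weights adapt to it. A minimal pure-Python sketch of the rounding step for the common e4m3 FP8 format (a hypothetical illustration, not Qwen's actual conversion pipeline):

```python
import math

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest FP8 e4m3-representable value.

    e4m3: 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits.
    Values above the max normal (448) are saturated here; real
    implementations may instead map them to NaN.
    """
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    a = min(abs(x), 448.0)          # saturate at e4m3 max normal
    exp = max(math.floor(math.log2(a)), -6)  # -6 = subnormal exponent
    step = 2.0 ** (exp - 3)         # spacing given 3 mantissa bits
    q = round(a / step) * step      # round to nearest representable value
    return sign * min(q, 448.0)

# Post-training "cast to FP8" is just applying this to every weight:
weights = [1.1, 0.3, -0.07, 1000.0]
print([quantize_e4m3(w) for w in weights])
```

The rounding error this introduces is exactly what QAT (or native FP8 training) lets the optimizer compensate for, which is why a QAT checkpoint can outperform the same model cast to FP8 after the fact.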