【Evaluation】Best practice for evaluating Qwen3!!
#2, opened by wangxingjun778
For more details, please refer to: https://evalscope.readthedocs.io/en/latest/best_practice/qwen3.html
Powered by: EvalScope https://github.com/modelscope/evalscope
- Speed Benchmark
- Benchmark collection (for evaluating abilities such as code, understanding, instruction following, math, ...)
NOTE: The results are based on samples of the original benchmarks, taken with the eval arg --limit
- Thinking efficiency of Qwen3
- Run Gradio visualization: evalscope app
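To make the NOTE about --limit concrete, here is a minimal sketch of what scoring on a sample of a benchmark means. It is not EvalScope code: `evaluate_subset`, `toy_benchmark`, and `toy_model` are hypothetical names made up for illustration, and it assumes the limit simply keeps the first N samples (EvalScope's actual selection may differ, so check its docs).

```python
from typing import Callable, Optional, Sequence, Tuple

Sample = Tuple[str, str]  # (prompt, reference answer)


def evaluate_subset(
    samples: Sequence[Sample],
    predict: Callable[[str], str],
    limit: Optional[int] = None,
) -> Tuple[float, int]:
    """Return (accuracy, number of samples scored).

    Hypothetical helper: scores only the first `limit` samples,
    or every sample when `limit` is None.
    """
    subset = samples if limit is None else samples[:limit]
    if not subset:
        return 0.0, 0
    correct = sum(1 for prompt, ref in subset if predict(prompt) == ref)
    return correct / len(subset), len(subset)


# Toy benchmark and a toy "model" that gets only the first two items right.
toy_benchmark = [("1+1", "2"), ("2+2", "4"), ("3+3", "6"), ("4+4", "8")]
toy_answers = {"1+1": "2", "2+2": "4", "3+3": "7", "4+4": "9"}


def toy_model(prompt: str) -> str:
    return toy_answers[prompt]


full_acc, full_n = evaluate_subset(toy_benchmark, toy_model)
sub_acc, sub_n = evaluate_subset(toy_benchmark, toy_model, limit=2)
print(f"full benchmark: {full_acc:.2f} on {full_n} samples")
print(f"limit=2:        {sub_acc:.2f} on {sub_n} samples")
```

In this toy case the limited run scores higher than the full run, so numbers obtained with a limit are best read as estimates and not directly compared with full-benchmark results.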
Get started and have fun! :)