Upload optimized ONNX model w/ GQA
#26, opened by Xenova (HF Staff)
No description provided.
Xenova changed pull request title from "Upload optimized model w/ GQA" to "Upload optimized ONNX model w/ GQA"
New demo! https://huggingface.co/spaces/HuggingFaceTB/SmolLM2-1.7B-Instruct-WebGPU
Much faster now...
Xenova changed pull request status to merged