Prepare a version of the SmolLM2 models with MLA (Multi-head Latent Attention)
#9 opened 21 days ago by verion1
Release of SFT-tuned model
#8 opened 2 months ago by yakazimir
Error while trying to run using transformers.js
4
#5 opened 4 months ago by hitchhiker3010

Why are model_q4.onnx and model_q4f16.onnx not 4 times smaller than model.onnx?
#3 opened 4 months ago by Franck-Dernoncourt