Are there plans to release a dynamically quantized version of the distilled model?

#48
by CoiaPrant - opened

Even 131 GB is out of reach for a regular graphics card.
Is there any plan to release a dynamically quantized version of the distilled model?

I'm curious how much VRAM the 70B distilled model would need after dynamic quantization.
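For a rough lower bound, here is a back-of-the-envelope sketch. It assumes exactly 70 billion parameters and one uniform bit-width for every weight, which is a simplification: a dynamic quant mixes bit-widths across layers, and the KV cache and activations need extra memory on top. The function name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit-width.
    KV cache, activations, and framework overhead are not included.
    """
    total_bytes = num_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 1e9


if __name__ == "__main__":
    params_70b = 70e9  # assumed parameter count for a "70B" model
    for bits in (16, 8, 4, 2):
        gb = weight_memory_gb(params_70b, bits)
        print(f"{bits:>2} bits/weight: ~{gb:.1f} GB for weights only")
```

Real usage will be higher than these figures, so treat them as a floor rather than a requirement.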
