
# 🚀 demo-GGUF

*Optimized quantized models for efficient inference*

## 📋 Overview

A collection of optimized GGUF quantized models derived from demo, offering a range of performance–quality tradeoffs.

## 💎 Model Variants

| Variant | Use Case | Download |
|---------|----------|----------|
| `demo_model_int8` | Mobile and embedded deployments where memory and compute are limited; int8 offers a good balance between accuracy and footprint. | 📥 |
| `demo_model_int16` | Applications that need higher accuracy and can afford more compute; int16 improves quality without significant memory overhead. | 📥 |
| `demo_model_fp16` | Desktop and server environments where precision is crucial; fp16 provides the best accuracy of the three. | 📥 |
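To make the tradeoffs above concrete, here is a minimal sketch that estimates weight-storage size per quantization level. The parameter count (7B) and the per-weight byte costs are illustrative assumptions, not figures from this repo; real GGUF files also carry metadata and per-block scale factors, so actual file sizes will differ. Note that int16 and fp16 cost the same 2 bytes per weight; they differ in numeric representation, not size.

```python
# Rough per-weight storage cost for each variant (illustrative assumption;
# GGUF adds metadata and per-block scales on top of this).
BYTES_PER_WEIGHT = {"int8": 1, "int16": 2, "fp16": 2}

def approx_model_size_gib(n_params: int, variant: str) -> float:
    """Approximate weight storage in GiB for a given quantization variant."""
    return n_params * BYTES_PER_WEIGHT[variant] / 1024**3

# Hypothetical 7B-parameter base model.
for variant in ("int8", "int16", "fp16"):
    print(f"{variant}: ~{approx_model_size_gib(7_000_000_000, variant):.1f} GiB")
```

This is why the int8 variant targets memory-constrained devices: halving bytes per weight roughly halves the weight footprint relative to int16/fp16.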

## 🤝 Contributors

Developed with ❤️ by BlossomAI


Star ⭐️ this repo if you find it valuable!