# 🚀 demo-GGUF

### Optimized quantized models for efficient inference
## 📋 Overview

A collection of optimized GGUF quantized models derived from [demo](https://huggingface.co/BlossomsAI/BloomVN-0.5B-ppo), providing a range of performance-quality tradeoffs.
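## ⚡ Usage

A minimal sketch of downloading and running one of the variants listed below, assuming the `huggingface_hub` and `llama-cpp-python` packages; the repo ID and filenames match the download links in the table, while the context size and prompt are illustrative placeholders.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch one variant from the Hub; swap the filename for
# demo_model_int16.gguf or demo_model_fp16.gguf as needed.
model_path = hf_hub_download(
    repo_id="Vuanhngo11/demo-gguf",
    filename="demo_model_int8.gguf",
)

# Load the GGUF file; n_ctx (context window size) is an assumed setting.
llm = Llama(model_path=model_path, n_ctx=2048)

# Run a short completion and print the generated text.
output = llm("Xin chào!", max_tokens=64)
print(output["choices"][0]["text"])
```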
## 💎 Model Variants

| Variant | Use Case | Download |
|---------|----------|----------|
| demo_model_int8 | Mobile and embedded applications with limited memory and compute; a good balance between accuracy and performance. | [📥](https://huggingface.co/Vuanhngo11/demo-gguf/resolve/main/demo_model_int8.gguf) |
| demo_model_int16 | Applications that need higher accuracy and can afford more compute; improved quality without significant memory overhead. | [📥](https://huggingface.co/Vuanhngo11/demo-gguf/resolve/main/demo_model_int16.gguf) |
| demo_model_fp16 | High-performance workloads where precision is crucial; the best accuracy, suited to desktop and server environments. | [📥](https://huggingface.co/Vuanhngo11/demo-gguf/resolve/main/demo_model_fp16.gguf) |

## 🤝 Contributors

Developed with ❤️ by [BlossomAI](https://huggingface.co/BlossomsAI)

---
Star ⭐️ this repo if you find it valuable!