Vuanhngo11 committed on
Commit 1336e93 · verified · Parent(s): f2e95d9

Upload folder using huggingface_hub

Files changed (1): README.md (+35, -0)
<div align="center">
  <img src="https://github.com/bloomifycafe/blossomsAI/blob/main/assets/logo.png?raw=true" alt="Logo"/>
</div>
<br/>
<div align="center">

# 🚀 demo-GGUF

### Optimized quantized models for efficient inference

</div>

## 📋 Overview

A collection of optimized GGUF quantized models derived from [demo](https://huggingface.co/BlossomsAI/BloomVN-0.5B-ppo), providing a range of performance-quality tradeoffs.
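Once downloaded, a GGUF file can be served by any GGUF-compatible runtime. A minimal sketch, assuming `llama-cpp-python` is installed (`pip install llama-cpp-python`); the helper name `run_local_inference` and its parameters are illustrative, not part of this repo:

```python
def run_local_inference(model_path: str, prompt: str, max_tokens: int = 64) -> str:
    """Load a local GGUF file and return a text completion.

    Requires llama-cpp-python; imported lazily so the sketch stays
    importable even where that package is not installed.
    """
    from llama_cpp import Llama

    # n_ctx is the context window; adjust to taste for a 0.5B-class model.
    llm = Llama(model_path=model_path, n_ctx=2048, verbose=False)
    out = llm(prompt, max_tokens=max_tokens)
    return out["choices"][0]["text"]

# Example call, once a variant file is on disk:
# print(run_local_inference("demo_model_int8.gguf", "Xin chào!"))
```

The same file also works with the upstream `llama.cpp` CLI and other GGUF consumers.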

## 💎 Model Variants

| Variant | Use Case | Download |
|---------|----------|----------|
| demo_model_int8 | For mobile and embedded applications where memory and compute are limited; a good balance between accuracy and performance. | [📥](https://huggingface.co/Vuanhngo11/demo-gguf/resolve/main/demo_model_int8.gguf) |
| demo_model_int16 | For applications that need higher accuracy and can afford more compute; improved quality without significant memory overhead. | [📥](https://huggingface.co/Vuanhngo11/demo-gguf/resolve/main/demo_model_int16.gguf) |
| demo_model_fp16 | For high-performance computing where precision is crucial; the best accuracy, suited to desktop and server environments. | [📥](https://huggingface.co/Vuanhngo11/demo-gguf/resolve/main/demo_model_fp16.gguf) |
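The download links above follow Hugging Face's standard `resolve/main` URL scheme, so they can be reconstructed programmatically. A small sketch, assuming Python and, for the optional fetch helper, `huggingface_hub`; the function names are illustrative:

```python
REPO_ID = "Vuanhngo11/demo-gguf"
VARIANTS = ("int8", "int16", "fp16")

def gguf_url(variant: str) -> str:
    """Direct-download URL for one of the variants listed in the table."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant}")
    return f"https://huggingface.co/{REPO_ID}/resolve/main/demo_model_{variant}.gguf"

def fetch(variant: str) -> str:
    """Download the chosen variant into the local HF cache and return its path."""
    from huggingface_hub import hf_hub_download  # lazy: needs huggingface_hub
    return hf_hub_download(repo_id=REPO_ID, filename=f"demo_model_{variant}.gguf")
```

`fetch` reuses the Hugging Face cache, so repeated calls do not re-download the file.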

## 🤝 Contributors

Developed with ❤️ by [BlossomsAI](https://huggingface.co/BlossomsAI)

---

<div align="center">
  <sub>Star ⭐️ this repo if you find it valuable!</sub>
</div>