
πŸš€ bloomvn-0.5b-ppo-GGUF

Optimized quantized models for efficient inference

πŸ“‹ Overview

A collection of optimized GGUF quantized models derived from bloomvn-0.5b-ppo, providing various performance-quality tradeoffs.
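These files target GGUF-compatible runtimes such as llama.cpp. Below is a minimal sketch of local inference with llama-cpp-python; the repo id and GGUF filename are assumptions, so check this repository's file list for the exact names.

```python
# Minimal inference sketch (pip install llama-cpp-python huggingface_hub).
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="BlossomAI/bloomvn-0.5b-ppo-GGUF",  # hypothetical repo id
    filename="bloomvn-0.5b-ppo.q3_k_m.gguf",    # hypothetical filename
)

llm = Llama(model_path=model_path, n_ctx=2048)  # load the quantized model
out = llm("Write one sentence about quantization.", max_tokens=64)
print(out["choices"][0]["text"])
```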

πŸ’Ž Model Variants

| Variant | Use Case | Download |
| --- | --- | --- |
| `base` | FP16 model for applications where size is not a concern and the highest accuracy is required, e.g. text generation, translation, and summarization. | 📥 |
| `q2_k` | 2-bit quantization for extremely constrained environments such as embedded systems or low-end mobile devices. Highly compressed, yet accurate enough for simple language tasks. | 📥 |
| `q3_k_m` | 3-bit quantization that balances size and accuracy; a good fit for mid-range mobile devices or storage-limited systems running language understanding or text classification. | 📥 |
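To make the size tradeoff concrete: on-disk size scales with bits per weight, so for a 494M-parameter model a rough lower bound is params × bits / 8. The sketch below uses nominal bit widths; real k-quant files run somewhat larger because block scales and a few tensors are kept at higher precision.

```python
# Back-of-the-envelope GGUF size estimates for a 494M-parameter model.
# These are lower bounds, not exact file sizes.
PARAMS = 494_000_000

def estimate_size_gb(bits_per_weight: float) -> float:
    """Approximate on-disk size in gigabytes: params * bits / 8."""
    return PARAMS * bits_per_weight / 8 / 1e9

for variant, bits in [("base (FP16)", 16), ("q2_k", 2), ("q3_k_m", 3)]:
    print(f"{variant:12s} ~{estimate_size_gb(bits):.2f} GB")
```

FP16 comes out near 1 GB, which is why the 2- and 3-bit variants matter on constrained devices.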

## 🤝 Contributors

Developed with ❀️ by BlossomAI


Star ⭐️ this repo if you find it valuable!

## 📊 Model Details

- Format: GGUF
- Model size: 494M params
- Architecture: qwen2
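To sanity-check a downloaded file, the fixed GGUF header can be read with the standard library alone. This sketch assumes a GGUF v2+ layout (little-endian: 4-byte magic, then uint32 version, uint64 tensor count, uint64 metadata key/value count); the filename is hypothetical.

```python
# Read just the fixed GGUF header to verify a download.
import struct

with open("bloomvn-0.5b-ppo.q3_k_m.gguf", "rb") as f:  # hypothetical filename
    magic = f.read(4)
    assert magic == b"GGUF", f"not a GGUF file: {magic!r}"
    version, tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))

print(f"GGUF v{version}: {tensor_count} tensors, {kv_count} metadata entries")
```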
