
πŸš€ bloomvn-0.5b-ppo-GGUF

Optimized quantized models for efficient inference

πŸ“‹ Overview

A collection of optimized GGUF quantized models derived from bloomvn-0.5b-ppo, providing various performance-quality tradeoffs.
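These files target GGUF-compatible runtimes such as llama.cpp. Below is a minimal sketch of local inference with llama-cpp-python; the repo id and GGUF filename are assumptions, so check this repository's file list for the exact names.

```python
# Minimal inference sketch (pip install llama-cpp-python huggingface_hub).
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="BlossomAI/bloomvn-0.5b-ppo-GGUF",  # hypothetical repo id
    filename="bloomvn-0.5b-ppo.q3_k_m.gguf",    # hypothetical filename
)

llm = Llama(model_path=model_path, n_ctx=2048)  # load the quantized model
out = llm("Write one sentence about quantization.", max_tokens=64)
print(out["choices"][0]["text"])
```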

πŸ’Ž Model Variants

| Variant | Use Case | Download |
| --- | --- | --- |
| `base` | FP16 model for applications where size is not a concern and the highest accuracy is required, e.g. text generation, translation, and summarization. | 📥 |
| `q2_k` | 2-bit quantization for extremely constrained environments such as embedded systems or low-end mobile devices. Highly compressed, yet accurate enough for simple language tasks. | 📥 |
| `q3_k_m` | 3-bit quantization that balances size and accuracy; a good fit for mid-range mobile devices or storage-limited systems running language understanding or text classification. | 📥 |
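To make the size tradeoff concrete: on-disk size scales with bits per weight, so for a 494M-parameter model a rough lower bound is params × bits / 8. The sketch below uses nominal bit widths; real k-quant files run somewhat larger because block scales and a few tensors are kept at higher precision.

```python
# Back-of-the-envelope GGUF size estimates for a 494M-parameter model.
# These are lower bounds, not exact file sizes.
PARAMS = 494_000_000

def estimate_size_gb(bits_per_weight: float) -> float:
    """Approximate on-disk size in gigabytes: params * bits / 8."""
    return PARAMS * bits_per_weight / 8 / 1e9

for variant, bits in [("base (FP16)", 16), ("q2_k", 2), ("q3_k_m", 3)]:
    print(f"{variant:12s} ~{estimate_size_gb(bits):.2f} GB")
```

FP16 comes out near 1 GB, which is why the 2- and 3-bit variants matter on constrained devices.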

## 🤝 Contributors

Developed with ❀️ by BlossomAI


Star ⭐️ this repo if you find it valuable!

## 📊 Model Details

- Format: GGUF
- Model size: 494M params
- Architecture: qwen2
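To sanity-check a downloaded file, the fixed GGUF header can be read with the standard library alone. This sketch assumes a GGUF v2+ layout (little-endian: 4-byte magic, then uint32 version, uint64 tensor count, uint64 metadata key/value count); the filename is hypothetical.

```python
# Read just the fixed GGUF header to verify a download.
import struct

with open("bloomvn-0.5b-ppo.q3_k_m.gguf", "rb") as f:  # hypothetical filename
    magic = f.read(4)
    assert magic == b"GGUF", f"not a GGUF file: {magic!r}"
    version, tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))

print(f"GGUF v{version}: {tensor_count} tensors, {kv_count} metadata entries")
```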
