---
license: apache-2.0 # Or the license of the original model
language: en
library_name: llama.cpp # Indicate it's primarily for the llama.cpp ecosystem
tags:
- gguf
- quantized
- merge
- mergekit
- ties
- 12b
- text-generation
- etherealaurora
- mn-mag-mell
- nemomix
- chat
- roleplay
pipeline_tag: text-generation
base_model: TomoDG/EtherealAurora-MN-Nemo-12B
model_type: llama
---

# GGUF Quantized Models for TomoDG/EtherealAurora-MN-Nemo-12B

This repository contains GGUF format model files for [TomoDG/EtherealAurora-MN-Nemo-12B](https://huggingface.co/TomoDG/EtherealAurora-MN-Nemo-12B).

These files were quantized using [llama.cpp](https://github.com/ggerganov/llama.cpp).

## Original Model Card

For details on the merge process, methodology, and intended use, please refer to the original model card: [**TomoDG/EtherealAurora-MN-Nemo-12B**](https://huggingface.co/TomoDG/EtherealAurora-MN-Nemo-12B)

## Available Quantizations

| File Name | Quantization Type | Size (Approx.) | Recommended RAM | Use Case |
| :--- | :--- | :--- | :--- | :--- |
| `EtherealAurora-MN-Nemo-12B-Q4_K_S.gguf` | Q4_K_S | ~6.95 GB | 9 GB+ | Smallest 4-bit K-quant, lowest RAM usage |
| `EtherealAurora-MN-Nemo-12B-Q4_K_M.gguf` | Q4_K_M | ~7.30 GB | 10 GB+ | Good balance of quality and performance, medium RAM |
| `EtherealAurora-MN-Nemo-12B-Q5_K_M.gguf` | Q5_K_M | ~8.52 GB | 12 GB+ | Higher quality, higher RAM usage |
| `EtherealAurora-MN-Nemo-12B-Q6_K.gguf` | Q6_K | ~9.82 GB | 13 GB+ | Very high quality, close to FP16 |
| `EtherealAurora-MN-Nemo-12B-Q8_0.gguf` | Q8_0 | ~12.7 GB | 16 GB+ | Highest-quality GGUF quant, largest size |

**General Recommendations:**

* **`_K_M` quants (like Q4_K_M, Q5_K_M):** Generally recommended for a good balance of quality and resource usage.
* **`Q6_K`:** Offers higher quality, closer to FP16, if you have sufficient RAM.
* **`Q8_0`:** Highest-quality GGUF quantization, but requires the most resources.
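
## Example Usage

A minimal sketch of downloading and running one of these quants with the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) bindings. This is not part of the original release: the `repo_id` below is an assumption (substitute this repository's actual id), and the prompt, context size, and GPU settings are illustrative only.

```python
# Minimal sketch: download one quant from this repo and run a short chat
# completion via llama-cpp-python (pip install huggingface_hub llama-cpp-python).
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch the recommended Q4_K_M quant; adjust the filename for other quants
# from the table above. repo_id is assumed -- replace with this repo's id.
model_path = hf_hub_download(
    repo_id="TomoDG/EtherealAurora-MN-Nemo-12B-GGUF",
    filename="EtherealAurora-MN-Nemo-12B-Q4_K_M.gguf",
)

# n_gpu_layers=-1 offloads all layers to the GPU if llama.cpp was built with
# GPU support; set it to 0 for CPU-only inference.
llm = Llama(model_path=model_path, n_ctx=4096, n_gpu_layers=-1)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Introduce yourself in two sentences."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```

The same files also work with any other llama.cpp-based frontend; point it at the downloaded `.gguf` path and choose a quant that fits the RAM guidance in the table above.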