TomoDG committed
Commit e06c5a5 · verified · 1 Parent(s): d94ab86

Delete README.me

Files changed (1)
  1. README.me +0 -49
README.me DELETED
@@ -1,49 +0,0 @@
- ---
- license: apache-2.0 # Or the license of the original model
- language: en
- library_name: llama.cpp # Indicates it is primarily for the llama.cpp ecosystem
- tags:
- - gguf
- - quantized
- - merge
- - mergekit
- - ties
- - 12b
- - text-generation
- - etherealaurora
- - mn-mag-mell
- - nemomix
- - chat
- - roleplay
- pipeline_tag: text-generation
- base_model: TomoDG/EtherealAurora-MN-Nemo-12B
- model_type: llama
- ---
-
- # GGUF Quantized Models for TomoDG/EtherealAurora-MN-Nemo-12B
-
- This repository contains GGUF format model files for [TomoDG/EtherealAurora-MN-Nemo-12B](https://huggingface.co/TomoDG/EtherealAurora-MN-Nemo-12B).
-
- These files were quantized using [llama.cpp](https://github.com/ggerganov/llama.cpp).
-
- ## Original Model Card
-
- For details on the merge process, methodology, and intended use, please refer to the original model card:
- [**TomoDG/EtherealAurora-MN-Nemo-12B**](https://huggingface.co/TomoDG/EtherealAurora-MN-Nemo-12B)
-
- ## Available Quantizations
-
- | File Name                                | Quantization Type | Size (approx.) | Recommended RAM | Use Case                                             |
- | :--------------------------------------- | :---------------- | :------------- | :-------------- | :--------------------------------------------------- |
- | `EtherealAurora-MN-Nemo-12B-Q4_K_S.gguf` | Q4_K_S            | ~6.95 GB       | 9 GB+           | Smallest 4-bit K-quant, lower RAM usage               |
- | `EtherealAurora-MN-Nemo-12B-Q4_K_M.gguf` | Q4_K_M            | ~7.30 GB       | 10 GB+          | Good balance of quality and performance, medium RAM   |
- | `EtherealAurora-MN-Nemo-12B-Q5_K_M.gguf` | Q5_K_M            | ~8.52 GB       | 12 GB+          | Higher quality, higher RAM usage                      |
- | `EtherealAurora-MN-Nemo-12B-Q6_K.gguf`   | Q6_K              | ~9.82 GB       | 13 GB+          | Very high quality, close to FP16                      |
- | `EtherealAurora-MN-Nemo-12B-Q8_0.gguf`   | Q8_0              | ~12.7 GB       | 16 GB+          | Highest quality GGUF quant, large size                |
-
- **General Recommendations:**
- * **`_K_M` quants (like Q4_K_M, Q5_K_M):** Generally recommended for a good balance of quality and resource usage.
- * **`Q6_K`:** Offers higher quality, close to FP16, if you have sufficient RAM.
- * **`Q8_0`:** Highest quality GGUF quantization, but requires the most resources.
-
-
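
The deleted README notes that these files were quantized with llama.cpp. For reference, below is a minimal sketch of the typical llama.cpp quantization workflow; the script and binary names (`convert_hf_to_gguf.py`, `llama-quantize`) match a recent llama.cpp checkout and are illustrative assumptions, not a record of the exact commands used here.

```python
# Hypothetical sketch: convert the source HF checkpoint to GGUF, then
# quantize it to each type listed in the README's table.
# Assumes a recent llama.cpp checkout built in the current directory.
import subprocess

hf_model_dir = "EtherealAurora-MN-Nemo-12B"       # local copy of the source model
f16_gguf = "EtherealAurora-MN-Nemo-12B-F16.gguf"  # full-precision GGUF

# Step 1: convert the Hugging Face checkpoint to a full-precision GGUF.
subprocess.run(
    ["python", "convert_hf_to_gguf.py", hf_model_dir,
     "--outtype", "f16", "--outfile", f16_gguf],
    check=True,
)

# Step 2: quantize the F16 GGUF down to each target type.
for qtype in ["Q4_K_S", "Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0"]:
    subprocess.run(
        ["./llama-quantize", f16_gguf,
         f"EtherealAurora-MN-Nemo-12B-{qtype}.gguf", qtype],
        check=True,
    )
```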
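Since the README lists the quants without showing how to load one, here is a hedged usage sketch with `llama-cpp-python` and `huggingface_hub`; the `repo_id` is an assumption about where these GGUF files are hosted, and Q4_K_M is chosen per the README's own balance recommendation.

```python
# Minimal sketch: download one of the quants and run a prompt with
# llama-cpp-python (pip install llama-cpp-python huggingface_hub).
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="TomoDG/EtherealAurora-MN-Nemo-12B-GGUF",   # assumed repo id
    filename="EtherealAurora-MN-Nemo-12B-Q4_K_M.gguf",  # balanced choice per the table
)

llm = Llama(model_path=model_path, n_ctx=4096, n_gpu_layers=-1)
result = llm("Hello! Introduce yourself briefly.", max_tokens=64)
print(result["choices"][0]["text"])
```

Any other row of the table can be substituted for the `filename`, trading file size and RAM use for output quality as described in the recommendations above.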