TomoDG committed (verified) · Commit a044639 · 1 parent: de3d10e

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,8 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ EtherealAurora-MN-Nemo-12B-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ EtherealAurora-MN-Nemo-12B-Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ EtherealAurora-MN-Nemo-12B-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ EtherealAurora-MN-Nemo-12B-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+ EtherealAurora-MN-Nemo-12B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
EtherealAurora-MN-Nemo-12B-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bb8aafb8835cbcfe75d39825e1600c6078fd65da7cdac9bae2d9d03c3c6a3b81
+ size 7477203872
EtherealAurora-MN-Nemo-12B-Q4_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6ea9661dee4490ee5209881d9eb061eaab77d4b0921e105ed7a6e0691d8baf1c
+ size 7120196512
EtherealAurora-MN-Nemo-12B-Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b4e93653ab9ff3e3747ddc64ad642064a2d1467fbcb667dfada4ac009aaa1cd4
+ size 8727630752
EtherealAurora-MN-Nemo-12B-Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:53b9410a628c809314efb17c0aaf2ec3f37762a9e157d01f1f85c3022bff3dfb
+ size 10056209312
EtherealAurora-MN-Nemo-12B-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1e4718741ffdc82ae054797a947f4355325a80bd8903f49fd7f2f42307ca312e
+ size 13022368672
README.me ADDED
@@ -0,0 +1,49 @@
+ ---
+ license: apache-2.0  # Or the license of the original model
+ language: en
+ library_name: llama.cpp  # Primarily for the llama.cpp ecosystem
+ tags:
+ - gguf
+ - quantized
+ - merge
+ - mergekit
+ - ties
+ - 12b
+ - text-generation
+ - etherealaurora
+ - mn-mag-mell
+ - nemomix
+ - chat
+ - roleplay
+ pipeline_tag: text-generation
+ base_model: TomoDG/EtherealAurora-MN-Nemo-12B
+ model_type: llama  # Or the appropriate architecture type
+ ---
+
+ # GGUF Quantized Models for TomoDG/EtherealAurora-MN-Nemo-12B
+
+ This repository contains GGUF-format model files for [TomoDG/EtherealAurora-MN-Nemo-12B](https://huggingface.co/TomoDG/EtherealAurora-MN-Nemo-12B).
+
+ These files were quantized using [llama.cpp](https://github.com/ggerganov/llama.cpp).
+
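+ To fetch a single quant without cloning the whole repository, one option is `huggingface_hub`. A minimal download sketch; the `repo_id` below is an assumption, so point it at wherever these files are hosted:
+
+ ```python
+ # Minimal download sketch (pip install huggingface_hub).
+ # The repo_id is an assumption; adjust it to this repository's actual id.
+ from huggingface_hub import hf_hub_download
+
+ model_path = hf_hub_download(
+     repo_id="TomoDG/EtherealAurora-MN-Nemo-12B-GGUF",  # assumed id of this GGUF repo
+     filename="EtherealAurora-MN-Nemo-12B-Q4_K_M.gguf",
+ )
+ print(model_path)  # local cache path of the downloaded GGUF file
+ ```
+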
+ ## Original Model Card
+
+ For details on the merge process, methodology, and intended use, please refer to the original model card:
+ [**TomoDG/EtherealAurora-MN-Nemo-12B**](https://huggingface.co/TomoDG/EtherealAurora-MN-Nemo-12B)
+
+ ## Available Quantizations
+
+ | File Name | Quantization Type | Size (approx.) | Recommended RAM | Use Case |
+ | :------------------------------------------ | :---------------- | :------------- | :-------------- | :------------------------------------------- |
+ | `EtherealAurora-MN-Nemo-12B-Q4_K_S.gguf` | Q4_K_S | ~7.12 GB | 9 GB+ | Smallest 4-bit K-quant, lowest RAM usage |
+ | `EtherealAurora-MN-Nemo-12B-Q4_K_M.gguf` | Q4_K_M | ~7.48 GB | 10 GB+ | Good quality/performance balance, medium RAM |
+ | `EtherealAurora-MN-Nemo-12B-Q5_K_M.gguf` | Q5_K_M | ~8.73 GB | 12 GB+ | Higher quality, higher RAM usage |
+ | `EtherealAurora-MN-Nemo-12B-Q6_K.gguf` | Q6_K | ~10.1 GB | 13 GB+ | Very high quality, close to FP16 |
+ | `EtherealAurora-MN-Nemo-12B-Q8_0.gguf` | Q8_0 | ~13.0 GB | 16 GB+ | Highest-quality GGUF quant, largest size |
+
+ **General Recommendations:**
+ * **`_K_M` quants (like Q4_K_M, Q5_K_M):** Generally recommended for a good balance of quality and resource usage.
+ * **`Q6_K`:** Offers quality close to FP16 if you have sufficient RAM.
+ * **`Q8_0`:** The highest-quality GGUF quantization, but requires the most resources.
+
+ A minimal loading example is shown below.
+
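+ ## Example Usage
+
+ A minimal inference sketch using the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) bindings; any quant from the table above works, and `n_ctx`, `n_gpu_layers`, and `max_tokens` here are placeholder values to adjust for your hardware:
+
+ ```python
+ # Minimal inference sketch (pip install llama-cpp-python).
+ # Settings below are illustrative placeholders, not tuned values.
+ from llama_cpp import Llama
+
+ llm = Llama(
+     model_path="EtherealAurora-MN-Nemo-12B-Q4_K_M.gguf",  # any quant from the table above
+     n_ctx=4096,       # placeholder context window
+     n_gpu_layers=-1,  # offload all layers if built with GPU support; use 0 for CPU-only
+ )
+
+ out = llm.create_chat_completion(
+     messages=[{"role": "user", "content": "Introduce yourself in one sentence."}],
+     max_tokens=128,
+ )
+ print(out["choices"][0]["message"]["content"])
+ ```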