DavidAU committed · Commit 6c63425 · verified · 1 Parent(s): c9b1216

Update README.md

  1. README.md +3 -3
README.md CHANGED
@@ -40,9 +40,9 @@ Use Jinja Template or CHATML template.

  IMPORTANT NOTES:

- - Due to the unique nature (MOE, Size, Activated experts) of this model GGUF quants can be run on the CPU, GPU or with GPU part "off-load", right up to full precision.
- - This model is difficult to Imatrix : You need a much larger imatrix file / multi-language / multi-content to imatrix it.
- - GPU speeds will be BLISTERING 4x-8x or higher than CPU only AND relative to other "30B" models (equal roughly to 7.5B "normal" model speeds).
+ - Due to the unique nature (MOE, Size, Activated experts, size of experts) of this model GGUF quants can be run on the CPU, GPU or with GPU part "off-load", right up to full precision.
+ - This model is difficult to Imatrix : You need a much larger imatrix file / multi-language / multi-content (ie code/text) to imatrix it.
+ - GPU speeds will be BLISTERING 4x-8x or higher than CPU only speeds AND this model will be BLISTERING too, relative to other "30B" models (Token per second speed equal roughly to 7.5B "normal" model speeds).

  Please refer the org model card for details, benchmarks, how to use, settings, system roles etc etc :

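The speed note in the added lines can be read as simple arithmetic. The following is a minimal sketch, assuming token generation speed scales roughly inversely with the number of parameters used per token. That scaling is a rule of thumb and an assumption here, not something the README states. The 30B and 7.5B figures come from the note itself, and the function name `relative_speed` is hypothetical.

```python
def relative_speed(total_params_b: float, effective_params_b: float) -> float:
    """Rough speed ratio of a model that behaves like `effective_params_b`
    billion parameters per token versus a dense model of `total_params_b`
    billion parameters.

    Assumption: tokens per second scale inversely with parameters used per
    token. Real throughput also depends on memory bandwidth, quantization,
    and how many layers are off-loaded to the GPU.
    """
    if total_params_b <= 0 or effective_params_b <= 0:
        raise ValueError("parameter counts must be positive")
    return total_params_b / effective_params_b


# Figures from the README note: a "30B" model running at roughly the
# speed of a 7.5B "normal" (dense) model.
print(relative_speed(30.0, 7.5))  # 4.0
```

Under that assumption, the note amounts to an estimate of about 4x the speed of a dense "30B" model on the same hardware.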