YaTharThShaRma999/orpheus_awq

4-bit precision

YaTharThShaRma999 committed 11 days ago

Commit 70e2547 (verified) · 1 parent: f84b6bd

Update README.md

Files changed (1)

README.md +1 -1

@@ -20,7 +20,7 @@ tp = 1 ## change if you have multiple gpus
 cache_max_entry_count = 0.2 ## how much vram is reserved for context
 engine_config = TurbomindEngineConfig(model_format='awq', dtype='float16', cache_max_entry_count=cache_max_entry_count, tp=tp, quant_policy=8)
-pipe = pipeline("heydryft/Orpheus-3b-FT-AWQ", backend_config=engine_config)
+pipe = pipeline("YaTharThShaRma999/orpheus_awq", backend_config=engine_config)
 tokeniser = AutoTokenizer.from_pretrained("unsloth/orpheus-3b-0.1-ft-unsloth-bnb-4bit")
 snac_model = SNAC.from_pretrained("hubertsiuzdak/snac_24khz").to('cuda:0')
 ```