Update README.md
README.md
Tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.
To get the expected features and performance of the chat versions, a specific format must be followed, including the `[INST]` and `<<SYS>>` tags, the `BOS` and `EOS` tokens, and the whitespace and line breaks in between (we recommend calling `strip()` on inputs to avoid double spaces). See our reference code on GitHub for details: `chat_completion`.
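
For illustration, here is a minimal sketch of the single-turn prompt layout described above. The authoritative template is the `chat_completion` reference code; `build_prompt` is a hypothetical helper, and the `BOS`/`EOS` tokens are assumed to be added by the tokenizer or loader.

```
# Minimal sketch of the Llama 2 chat layout; the authoritative version
# is the chat_completion reference code.
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def build_prompt(system_prompt: str, user_message: str) -> str:
    # strip() the inputs to avoid double spaces, as recommended above.
    # BOS/EOS tokens are assumed to be added by the tokenizer/loader.
    return f"{B_INST} {B_SYS}{system_prompt.strip()}{E_SYS}{user_message.strip()} {E_INST}"
```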
First, install ctransformers:

```
pip install "ctransformers>=0.2.24"
```
Use the following to get started:

```
from ctransformers import AutoModelForCausalLM

# Set gpu_layers to the number of layers to offload to the GPU.
# Set it to 0 if no GPU acceleration is available on your system.
llm = AutoModelForCausalLM.from_pretrained(
    "Ranjanunicode/unicode-llama-2-chat-Hf-q4-2",
    model_file="unicode-llama-2-chat-Hf-q4-2.gguf",
    model_type="llama",
    gpu_layers=40,
)

print(llm("AI is going to"))
```
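
As a follow-up sketch, the chat layout from above can be combined with the loaded model. This assumes the GGUF file expects the standard Llama 2 chat template; `build_prompt` is the hypothetical helper sketched earlier, and `stream=True` streams tokens as they are generated.

```
# Hypothetical: wrap a user message in the chat template before generating.
prompt = build_prompt(
    "You are a helpful assistant.",            # example system prompt
    "Summarize what GGUF quantization does.",  # example user message
)

# Stream tokens as they are generated.
for text in llm(prompt, stream=True):
    print(text, end="", flush=True)
```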
### Out-of-Scope Use