Commit 290ecb9 (1 parent: 5f0b6c1): Update README.md
README.md CHANGED
@@ -38,6 +38,27 @@ Bram Vanroy. (2023). Llama v2 13b: Finetuned on Dutch Conversational Data. Huggi
}
```

+## Usage
+
+```python
+from transformers import pipeline
+
+
+# If you want to add a system message, add a dictionary with role "system". However, this will likely have little
+# effect since the model was only finetuned using a single system message.
+messages = [{"role": "user", "content": "Welke talen worden er in België gesproken?"}]
+pipe = pipeline("text-generation", model="BramVanroy/Llama-2-13b-chat-dutch", device_map="auto")
+
+# Just apply the chat template but leave the tokenization for the pipeline to do
+prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False)
+
+# Only return the newly generated tokens, not prompt+new_tokens (return_full_text=False)
+generated = pipe(prompt, do_sample=True, max_new_tokens=128, return_full_text=False)
+
+generated[0]["generated_text"]
+# ' De officiële talen van België zijn Nederlands, Frans en Duits. Daarnaast worden er nog een aantal andere talen gesproken, waaronder Engels, Spaans, Italiaans, Portugees, Turks, Arabisch en veel meer. '
+```
+
## Model description

I could not get the original Llama 2 13B to produce much Dutch, even though the description paper indicates that it was trained on a (small) portion of Dutch data. I therefore
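The comment in the added example says a system message can be supplied as a dictionary with role "system", but the snippet itself only shows a user turn (a Dutch prompt asking which languages are spoken in Belgium). Below is a minimal sketch of what that could look like, reusing the same pipeline and chat-template pattern; the system prompt text is a hypothetical illustration, not part of this commit (it roughly says "You are a helpful assistant who answers in Dutch").

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="BramVanroy/Llama-2-13b-chat-dutch", device_map="auto")

# Hypothetical system prompt (roughly: "You are a helpful assistant who answers in Dutch.").
# As the README comment notes, its effect is likely small because the model was finetuned
# with only a single system message.
messages = [
    {"role": "system", "content": "Je bent een behulpzame assistent die in het Nederlands antwoordt."},
    {"role": "user", "content": "Welke talen worden er in België gesproken?"},
]

# Same pattern as the added example: render the chat template as text, let the pipeline
# tokenize it, and only return the newly generated tokens (return_full_text=False).
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False)
generated = pipe(prompt, do_sample=True, max_new_tokens=128, return_full_text=False)
print(generated[0]["generated_text"])
```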