ayoubkirouane committed (verified)
Commit 0e45b9b · Parent(s): 8bcde2a

Update README.md

Files changed (1):
  1. README.md (+13 −1)
README.md CHANGED
@@ -30,7 +30,7 @@ The model's potential is promising, offering faster and more potent language models.
 
 Introducing Mamba-Chat 🐍, the pioneering chat language model that diverges from the transformer architecture by adopting a state-space model framework.
 
-Built upon the research by Albert Gu and Tri Dao, specifically their work titled "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" (paper), this model leverages their implementation.
+Built upon the research by Albert Gu and Tri Dao, specifically their work titled **"Mamba: Linear-Time Sequence Modeling with Selective State Spaces"** (paper), this model leverages their implementation.
 
 Mamba-Chat, based on **Mamba-2.8B**, underwent fine-tuning on 16,000 samples from the **HuggingFaceH4/ultrachat_200k** dataset.
 
@@ -60,6 +60,18 @@ tokenizer.pad_token = tokenizer.eos_token
 tokenizer.chat_template = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta").chat_template
 
 model = MambaLMHeadModel.from_pretrained("ayoubkirouane/Mamba-Chat-2.8B", device="cuda", dtype=torch.float16)
+messages = []
+user_message = """
+Write your message here ..
+"""
+
+messages.append(dict(role="user", content=user_message))
+input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to("cuda")
+out = model.generate(input_ids=input_ids, max_length=2000, temperature=0.9, top_p=0.7, eos_token_id=tokenizer.eos_token_id)
+decoded = tokenizer.batch_decode(out)
+messages.append(dict(role="assistant", content=decoded[0].split("<|assistant|>\n")[-1]))
+print("Model:", decoded[0].split("<|assistant|>\n")[-1])
+
 ```
 
 
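
For readers who want to run what this commit adds, here is the snippet in context as one self-contained script. This is a sketch, not part of the commit: the imports, the tokenizer checkpoint (`EleutherAI/gpt-neox-20b`, Mamba's base tokenizer, is a guess; the README configures its tokenizer earlier in the file, outside this hunk), and the example prompt are filled-in assumptions, and it presumes the `mamba-ssm` package and a CUDA device.

```python
# Reconstruction of the full usage flow around the lines this commit adds.
# Assumptions not shown in the diff hunk: the imports and the tokenizer
# checkpoint -- "EleutherAI/gpt-neox-20b" is a guess; the README sets its
# tokenizer up earlier in the file. Requires mamba-ssm and a CUDA device.
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
tokenizer.pad_token = tokenizer.eos_token
# Borrow Zephyr's chat template; it renders turns as
# "<|user|>\n...</s>\n<|assistant|>\n...", which the split() below relies on.
tokenizer.chat_template = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta").chat_template

model = MambaLMHeadModel.from_pretrained(
    "ayoubkirouane/Mamba-Chat-2.8B", device="cuda", dtype=torch.float16
)

messages = [dict(role="user", content="Explain state-space models in one paragraph.")]
# add_generation_prompt=True appends the "<|assistant|>\n" header so the
# model continues as the assistant rather than as the user.
input_ids = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to("cuda")

out = model.generate(
    input_ids=input_ids,
    max_length=2000,
    temperature=0.9,
    top_p=0.7,
    eos_token_id=tokenizer.eos_token_id,
)

# The decoded text contains the whole conversation; everything after the
# last "<|assistant|>\n" marker is the new reply.
reply = tokenizer.batch_decode(out)[0].split("<|assistant|>\n")[-1]
messages.append(dict(role="assistant", content=reply))
print("Model:", reply)
```

The sampling settings (`temperature=0.9`, `top_p=0.7`) come straight from the added lines; note that `max_length=2000` caps the prompt and reply together, so a long conversation history would eventually need truncation.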