Update README.md
The model's potential is promising, offering faster and more potent language models.
Introducing Mamba-Chat 🐍, the first chat language model built on a state-space model architecture rather than a transformer.
Built upon the research of Albert Gu and Tri Dao, specifically their paper **"Mamba: Linear-Time Sequence Modeling with Selective State Spaces"** (paper), this model leverages their implementation.
Mamba-Chat is based on **Mamba-2.8B** and was fine-tuned on 16,000 samples from the **HuggingFaceH4/ultrachat_200k** dataset.
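
For context, a comparable 16k-sample slice of that dataset can be pulled with the `datasets` library. This is only a sketch of the data source, not the actual training script; the `train_sft` split name and the `messages` column are assumptions about how ultrachat_200k is laid out.

```python
# Sketch only: grab a 16k-sample subset like the one used for fine-tuning.
# The "train_sft" split and "messages" column are assumptions about the
# HuggingFaceH4/ultrachat_200k layout, not part of this repo.
from datasets import load_dataset

dataset = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")
subset = dataset.shuffle(seed=42).select(range(16_000))
print(subset[0]["messages"][0])  # each sample holds a list of chat turns
```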
To chat with the model, load the tokenizer and model, borrow Zephyr's chat template for formatting the conversation, and generate:

```python
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# Assumption: the tokenizer ships with the model repo.
tokenizer = AutoTokenizer.from_pretrained("ayoubkirouane/Mamba-Chat-2.8B")
tokenizer.pad_token = tokenizer.eos_token
# Borrow Zephyr's chat template to format conversation turns.
tokenizer.chat_template = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta").chat_template

# Load the model on the GPU in half precision.
model = MambaLMHeadModel.from_pretrained("ayoubkirouane/Mamba-Chat-2.8B", device="cuda", dtype=torch.float16)

messages = []
user_message = """
Write your message here ..
"""

messages.append(dict(role="user", content=user_message))
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to("cuda")

# Sample a reply; keep only the text after the last assistant marker.
out = model.generate(input_ids=input_ids, max_length=2000, temperature=0.9, top_p=0.7, eos_token_id=tokenizer.eos_token_id)
decoded = tokenizer.batch_decode(out)
messages.append(dict(role="assistant", content=decoded[0].split("<|assistant|>\n")[-1]))
print("Model:", decoded[0].split("<|assistant|>\n")[-1])
```
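
The `split("<|assistant|>\n")` works because Zephyr's chat template wraps each turn in `<|user|>`/`<|assistant|>` markers. If in doubt, you can inspect the exact prompt string before tokenizing; a quick check, reusing the objects above:

```python
# Render the templated prompt as plain text instead of token ids.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # expect <|user|> ... <|assistant|> markers around each turn
```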
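
Since each reply is appended back to `messages`, the snippet extends naturally to multi-turn chat. A minimal sketch of such a loop, assuming `model`, `tokenizer`, and `messages` are set up as above:

```python
# Hypothetical REPL-style wrapper around the snippet above; not part of the repo.
while True:
    user_message = input("You: ")
    if user_message.strip().lower() in {"quit", "exit"}:
        break
    messages.append(dict(role="user", content=user_message))
    input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to("cuda")
    out = model.generate(input_ids=input_ids, max_length=2000, temperature=0.9, top_p=0.7, eos_token_id=tokenizer.eos_token_id)
    reply = tokenizer.batch_decode(out)[0].split("<|assistant|>\n")[-1]
    messages.append(dict(role="assistant", content=reply))
    print("Model:", reply)
```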