thesven
/

Chatty-McChatterson-3-mini-128k

Text Generation

text-generation-inference

Model card Files Files and versions Community

Chatty-McChatterson-3-mini-128k / README.md

thesven's picture

Update README.md

8487d56 verified 12 months ago

|

history blame contribute delete

2.51 kB

	---
	license: mit
	datasets:
	- Replete-AI/code_bagel
	---
	# Chatty-McChatterson-3-mini-128k

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/6324ce4d5d0cf5c62c6e3c5a/zKJXnm52nly4viTzs0Ysa.png)

	## Model Details

	Model Name: Chatty-McChatterson-3-mini-128k
	Base Model: [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct)
	Fine-tuning Method: Supervised Fine-Tuning (SFT)
	Dataset: [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k)
	Training Data: 12884 conversations selected for being 512 input tokens or less
	Training Duration: 4 hours
	Hardware: Nvidia RTX A4500
	Epochs: 3

	## Training Procedure

	This model was fine-tuned to provide better instructions on code.

	The training was conducted using PEFT and SFTTrainer on select conversations from the Ultra Chat 200k dataset.
	Training was completed in 3 epochs (19326 steps) over a span of 4 hours on an Nvidia A4500 GPU.

	The dataset comprised of a filterd list of rows from the [Ultra Chat 200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) dataset, where the prompt template was 512 tokens or less.

	## Intended Use

	This model is designed to improve the overall chat experience and response quality.

	## Getting Started

	## Instruct Template
	```bash
	<\|system\|>
	{system_message} <\|end\|>
	<\|user\|>
	{Prompt) <\|end\|>
	<\|assistant\|>
	```

	### Transfromers

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

	model_name_or_path = "thesven/Chatty-McChatterson-3-mini-128k"

	# BitsAndBytesConfig for loading the model in 4-bit precision
	bnb_config = BitsAndBytesConfig(
	load_in_4bit=True,
	bnb_4bit_quant_type="nf4",
	bnb_4bit_compute_dtype="float16",
	)

	tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
	model = AutoModelForCausalLM.from_pretrained(
	model_name_or_path,
	device_map="auto",
	trust_remote_code=False,
	revision="main",
	quantization_config=bnb_config
	)
	model.pad_token = model.config.eos_token_id

	prompt_template = '''
	<\|user\|>
	What is the name of the big tower in Toronto?.<\|end\|>
	<\|assistant\|>
	'''

	input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
	output = model.generate(inputs=input_ids, temperature=0.1, do_sample=True, top_p=0.95, top_k=40, max_new_tokens=256)

	generated_text = tokenizer.decode(output[0, len(input_ids[0]):], skip_special_tokens=True)
	print(generated_text)
	```