---
language:
- pt
metrics:
- accuracy
- f1
- pearsonr
base_model:
- Qwen/Qwen2.5-1.5B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation-inference
license: apache-2.0
---
### Amadeus-Verbo-FI-Qwen2.5-1.5B-PT-BR-Instruct
#### Introduction
Amadeus-Verbo-FI-Qwen2.5-1.5B-PT-BR-Instruct is a Brazilian Portuguese large language model (PT-BR LLM) developed from the base model Qwen2.5-1.5B-Instruct by fine-tuning for 2 epochs on a dataset of 600k instructions.
Read our article [here](https://www.).
#### Details
- **Architecture:** a Transformer-based model with RoPE, SwiGLU, RMSNorm, and Attention QKV bias pre-trained via Causal Language Modeling
- **Parameters:** 1.54B parameters
- **Number of Parameters (Non-Embedding):** 1.31B
- **Number of Layers:** 28
- **Number of Attention Heads (GQA):** 12 for Q and 2 for KV
- **Context length:** 32,768 tokens
- **Number of steps:** 78,838
- **Language:** Brazilian Portuguese
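The GQA figures above translate directly into key/value-cache savings: only the 2 KV heads are cached, not all 12 query heads. The following is a rough sizing sketch, not official numbers; the head dimension of 128 is an assumption (hidden size 1536 divided by 12 query heads) that should be verified against the model's `config.json`.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, dtype_bytes=2):
    # 2x accounts for storing both keys and values at every position;
    # dtype_bytes=2 assumes fp16/bf16 activations.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * dtype_bytes

# head_dim=128 is an assumption (1536 hidden size / 12 query heads).
mha = kv_cache_bytes(28, 12, 128, 32_768)  # if every query head kept its own K/V
gqa = kv_cache_bytes(28, 2, 128, 32_768)   # actual layout: 2 shared KV heads
print(f"MHA: {mha / 2**20:.0f} MiB, GQA: {gqa / 2**20:.0f} MiB ({mha // gqa}x smaller)")
```

At the full 32,768-token context this puts the per-sequence KV cache under 1 GiB in half precision, a 6x reduction over a fully multi-head layout.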
#### Usage
You can use Amadeus-Verbo-FI-Qwen2.5-1.5B-PT-BR-Instruct with the Hugging Face Transformers library; we advise using the latest version.
With `transformers<4.37.0`, you will encounter the following error:
`KeyError: 'qwen2'`
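If you want to guard against this at startup, compare the installed version numerically rather than as a string (lexicographic comparison gets it wrong, e.g. `"4.9.0"` sorts after `"4.37.0"`). A minimal stdlib-only sketch; the cutoff 4.37.0 is the Transformers release that added the `qwen2` architecture:

```python
def supports_qwen2(version_string: str) -> bool:
    """Return True if this Transformers release includes the qwen2 architecture."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break  # stop at pre-release suffixes such as "dev0"
        parts.append(int(piece))
    # Qwen2 model support landed in transformers v4.37.0.
    return tuple(parts) >= (4, 37, 0)

# e.g. check supports_qwen2(transformers.__version__) before loading the model
print(supports_qwen2("4.36.2"))  # False
print(supports_qwen2("4.37.0"))  # True
```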
Below are simple examples of how to load the model and generate text.
#### Quickstart
The following code snippets use `pipeline`, `AutoTokenizer`, `AutoModelForCausalLM`, and `apply_chat_template` to show how to load the tokenizer and the model, and how to generate content.
Using the pipeline:
```python
from transformers import pipeline

# The pipeline handles tokenization, chat templating, and decoding.
pipe = pipeline("text-generation", model="amadeusai/AV-FI-Qwen2.5-1.5B-PT-BR-Instruct")
messages = [
    {"role": "user", "content": "Faça uma planilha nutricional para uma alimentação fitness e mediterrânea com todos os dias da semana"},
]
print(pipe(messages))
```
Or, loading the model and tokenizer explicitly:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "amadeusai/AV-FI-Qwen2.5-1.5B-PT-BR-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Faça uma planilha nutricional para uma alimentação fitness e mediterrânea com todos os dias da semana."
messages = [
    {"role": "system", "content": "Você é um assistente útil."},
    {"role": "user", "content": prompt}
]
# Render the conversation with the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated text is decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
Or, using an explicit `GenerationConfig` with `TextGenerationPipeline`:
```python
import torch
from transformers import GenerationConfig, TextGenerationPipeline, AutoTokenizer, AutoModelForCausalLM

# Specify the model and tokenizer
model_id = "amadeusai/AV-FI-Qwen2.5-1.5B-PT-BR-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Specify the generation parameters as you like
generation_config = GenerationConfig(
    do_sample=True,
    max_new_tokens=512,
    renormalize_logits=True,
    repetition_penalty=1.2,
    temperature=0.1,
    top_k=50,
    top_p=1.0,
    use_cache=True,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
generator = TextGenerationPipeline(model=model, task="text-generation", tokenizer=tokenizer, device=device)

# Generate text
prompt = "Faça uma planilha nutricional para uma alimentação fitness e mediterrânea com todos os dias da semana"
completion = generator(prompt, generation_config=generation_config)
print(completion[0]["generated_text"])
```
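The sampling knobs in the `GenerationConfig` above can be read as one pipeline: divide the logits by `temperature`, keep the `top_k` best candidates, renormalize, then sample. A toy stdlib-only illustration of that order of operations (a sketch for intuition, not Transformers' actual implementation):

```python
import math
import random

def sample_next_token(logits, temperature=0.1, top_k=50):
    # Temperature scaling first: a low temperature sharpens the distribution.
    indexed = sorted(enumerate(l / temperature for l in logits),
                     key=lambda pair: pair[1], reverse=True)
    kept = indexed[:top_k]  # top-k filtering
    # Renormalize over the survivors (softmax, shifted for stability).
    peak = kept[0][1]
    weights = [math.exp(v - peak) for _, v in kept]
    # Sample proportionally to the renormalized weights.
    r = random.random() * sum(weights)
    for (token_id, _), w in zip(kept, weights):
        r -= w
        if r <= 0:
            return token_id
    return kept[-1][0]

# With temperature=0.1 the highest logit dominates overwhelmingly.
print(sample_next_token([0.0, 5.0, 1.0], temperature=0.1, top_k=2))
```

With `temperature=0.1` as configured above, sampling is nearly greedy; `top_p=1.0` disables nucleus filtering, so only `top_k` prunes the candidate set.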
#### Citation
If you find our work helpful, feel free to cite it.
```
@misc{amadeusai2024amadeusverbo,
  title  = {Amadeus Verbo: A Brazilian Portuguese large language model},
  author = {Amadeus AI},
  url    = {https://amadeus-ai.com},
  month  = {November},
  year   = {2024}
}
```