---
library_name: transformers
tags: []
---

# Model Card for RoQLlama

## Model Details

### Model Description

RoQLlama is a lightweight, Romanian-adapted LLM with 7 billion parameters. It is quantized to 4 bits and was trained with the quantized LoRA (QLoRA) technique.

- **Language:** Romanian
- **License:** Llama2 Community License Agreement
- **Finetuned from model:** Meta's Llama2 7B

### Model Sources
- **Paper:** https://arxiv.org/abs/2410.04269

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
MODEL_NAME = "andreidima/Llama-2-7b-Romanian-qlora"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)  # tokenizers run on CPU; no device_map needed
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

input_text = """Eu răspund la întrebări pe baza contextului.
Context: În anul 1600, Mihai Viteazul a realizat prima unire a Țărilor Române: Țara Românească, Transilvania și Moldova. Această unire a fost un moment important în istoria României.
Întrebare: În ce an a realizat Mihai Viteazul prima unire a Țărilor Române?
Răspuns: """
# Tokenize the prompt and move the tensors to the same device as the model
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=100, 
    eos_token_id=[13] # 13 is the token ID for a newline character at the end of a non-empty line
)
print(tokenizer.decode(outputs[0]))
```

Note: Adding a space at the end of the prompt has been observed to significantly improve the model's output quality.
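
One way to avoid losing that trailing space is to build the prompt with a small helper. The sketch below is illustrative only and is not part of the released code: `build_prompt` is a hypothetical name, and the template simply mirrors the question-answering example above.

```python
def build_prompt(context: str, question: str) -> str:
    """Assemble a question-answering prompt in the format of the example above.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    return (
        "Eu răspund la întrebări pe baza contextului.\n"
        f"Context: {context.strip()}\n"
        f"Întrebare: {question.strip()}\n"
        "Răspuns: "  # keep the trailing space after the colon
    )


prompt = build_prompt(
    "În anul 1600, Mihai Viteazul a realizat prima unire a Țărilor Române.",
    "În ce an a realizat Mihai Viteazul prima unire a Țărilor Române?",
)
```

The resulting `prompt` string can be passed to the tokenizer in place of `input_text`.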

## Training Details and Evaluation

Please refer to the paper for details on the model's training and evaluation.


## Citation

**BibTeX:**

```
@inproceedings{dima2024roqllama,
      title={RoQLlama: A Lightweight Romanian Adapted Language Model}, 
      author={George-Andrei Dima and Andrei-Marius Avram and Cristian-George Crăciun and Dumitru-Clementin Cercel},
      booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},
      year={2024},
      url={https://arxiv.org/abs/2410.04269}, 
}
```

**APA:**

Dima, G. A., Avram, A. M., Crăciun, C. G., & Cercel, D. C. (2024). RoQLlama: A lightweight Romanian adapted language model. In _Findings of the Association for Computational Linguistics: EMNLP 2024_.