---
license: cc-by-nc-sa-4.0
language:
- multilingual
- fa
- en
library_name: transformers
tags:
- text-generation-inference
inference: false
metrics:
- bleu
- comet
- accuracy
- perplexity
- spearmanr
pipeline_tag: text-generation
co2_eq_emissions:
  emissions: 232380
  source: "PersianMind: A Cross-Lingual Persian-English Large Language Model. https://arxiv.org/abs/2401.06466"
  training_type: "fine-tuning"
  hardware_used: "4 RTX3090 24GB GPUs"
  geographical_location: "Tehran, Iran"
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/PersianMind-v1.0-GGUF

This is a quantized version of [universitytehran/PersianMind-v1.0](https://huggingface.co/universitytehran/PersianMind-v1.0), created with llama.cpp.
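The GGUF files can be run directly with llama.cpp's CLI. The sketch below is illustrative only: the quantization filename is an assumption, so check this repository's file list for the variants actually provided.

```shell
# Fetch one quantization variant from this repo (the filename here is hypothetical).
huggingface-cli download QuantFactory/PersianMind-v1.0-GGUF \
  PersianMind-v1.0.Q4_K_M.gguf --local-dir .

# Generate with llama.cpp (llama-cli in recent builds; older builds ship ./main).
llama-cli -m PersianMind-v1.0.Q4_K_M.gguf \
  -p "You: Explain artificial intelligence.\nPersianMind: " -n 256
```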

# Original Model Card

<p align="center">
  <img src="PersianMind.jpg" alt="PersianMind logo" width=200/>
</p>

# <span style="font-variant:small-caps;">PersianMind</span>

<span style="font-variant:small-caps;">PersianMind</span> is a cross-lingual Persian-English large language model.
The model achieves state-of-the-art results on the Persian subset of the [<span style="font-variant:small-caps;">Belebele</span>](https://github.com/facebookresearch/belebele) benchmark
and on the [ParsiNLU multiple-choice QA](https://github.com/persiannlp/parsinlu) task.
It also attains performance comparable to GPT-3.5-turbo on a Persian reading comprehension task.

## Model Description

- **Developed by:** [Pedram Rostami](mailto:[email protected]), [Ali Salemi](mailto:[email protected]), and [Mohammad Javad Dousti](mailto:[email protected])
- **Model type:** Language model
- **Languages:** English and Persian
- **License:** [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) (non-commercial use only)

## How to Get Started with the Model

Use the code below to get started with the model.
Note that you need to install the <code><b>sentencepiece</b></code> and <code><b>accelerate</b></code> libraries, along with <code><b>PyTorch</b></code> and <code><b>🤗Transformers</b></code>, to run this code.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(
    "universitytehran/PersianMind-v1.0",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    device_map={"": device},
)
tokenizer = AutoTokenizer.from_pretrained(
    "universitytehran/PersianMind-v1.0",
)

TEMPLATE = "{context}\nYou: {prompt}\nPersianMind: "
CONTEXT = "This is a conversation with PersianMind. It is an artificial intelligence model designed by a team of " \
          "NLP experts at the University of Tehran to help you with various tasks such as answering questions, " \
          "providing recommendations, and helping with decision making. You can ask it anything you want and " \
          "it will do its best to give you accurate and relevant information."
PROMPT = "در مورد هوش مصنوعی توضیح بده."  # "Explain artificial intelligence."

model_input = TEMPLATE.format(context=CONTEXT, prompt=PROMPT)
input_tokens = tokenizer(model_input, return_tensors="pt")
input_tokens = input_tokens.to(device)
generate_ids = model.generate(**input_tokens, max_new_tokens=512, do_sample=False, repetition_penalty=1.1)
model_output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]

print(model_output[len(model_input):])
```
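For multi-turn conversations, one option is to fold earlier exchanges back into the context before reapplying the template. This is a minimal sketch; the turn-joining scheme and the `build_prompt` helper are assumptions for illustration, not something the model card documents.

```python
TEMPLATE = "{context}\nYou: {prompt}\nPersianMind: "

def build_prompt(context, history, prompt):
    # history: list of (user, assistant) turns appended to the base context.
    # Hypothetical helper -- the card only shows single-turn prompting.
    for user_msg, assistant_msg in history:
        context += f"\nYou: {user_msg}\nPersianMind: {assistant_msg}"
    return TEMPLATE.format(context=context, prompt=prompt)

print(build_prompt("Base context.", [("Hi", "Hello!")], "What is AI?"))
```

The resulting string can be fed to `tokenizer` and `model.generate` exactly as in the snippet above.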

### How to Quantize the Model

Quantized models can be run on resource-constrained devices.
To quantize the model, first install the <code><b>bitsandbytes</b></code> library.
To load the model in 8-bit (`INT8`) precision, use the code below.

```python
model = AutoModelForCausalLM.from_pretrained(
    "universitytehran/PersianMind-v1.0",
    device_map="auto",
    low_cpu_mem_usage=True,
    load_in_8bit=True,
)
```

Alternatively, you can quantize the model in 4-bit (`NormalFloat4`) with the following code.

```python
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)
model = AutoModelForCausalLM.from_pretrained(
    "universitytehran/PersianMind-v1.0",
    quantization_config=quantization_config,
    device_map="auto",
)
```

### Evaluating Quantized Models

| Model | <span style="font-variant:small-caps;">Belebele</span> (Persian) | Fa→En Translation<br>(<span style="font-variant:small-caps;">Comet</span>) | En→Fa Translation<br>(<span style="font-variant:small-caps;">Comet</span>) | Model Size | Tokens/sec |
| :----------------------------------------------------------------: | :--------------------------------------------------------------: | :------------------------------------------------------------------------: | :------------------------------------------------------------------------: | :--------: | :--------: |
| <span style="font-variant:small-caps;">PersianMind</span> (`BF16`) | 73.9 | 83.61 | 79.44 | 13.7G | 25.35 |
| <span style="font-variant:small-caps;">PersianMind</span> (`INT8`) | 73.7 | 82.32 | 78.61 | 7.2G | 11.36 |
| <span style="font-variant:small-caps;">PersianMind</span> (`NF4`) | 70.2 | 82.07 | 80.36 | 3.9G | 24.36 |

We evaluated the quantized models against the original model on a range of tasks.
Specifically, we evaluated all models on the reading comprehension multiple-choice
question-answering benchmark of [<span style="font-variant:small-caps;">Belebele</span>](https://github.com/facebookresearch/belebele) (Persian subset) and reported the accuracy of each model.
Additionally, we evaluated our models on Persian-to-English and English-to-Persian translation tasks.
For this, we used the Persian-English subset of the [<span style="font-variant:small-caps;">Flores</span>-200](https://github.com/facebookresearch/flores/tree/main/flores200) dataset and
reported our results using the <span style="font-variant:small-caps;">Comet</span> metric.
Furthermore, we measured the average number of tokens per second generated by each model while running the translation tasks.
To gauge resource efficiency, we measured each model's memory usage with the `get_memory_footprint()` function.
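
As a rough illustration of the trade-offs in the table above, the snippet below recomputes the relative size savings and accuracy drops from the reported figures. The numbers are copied from the table; the script is plain arithmetic, not part of the evaluation pipeline.

```python
# Figures reported in the table above: size in GB, throughput in tokens/sec,
# and accuracy on the Persian subset of Belebele.
results = {
    "BF16": {"size": 13.7, "tps": 25.35, "belebele": 73.9},
    "INT8": {"size": 7.2,  "tps": 11.36, "belebele": 73.7},
    "NF4":  {"size": 3.9,  "tps": 24.36, "belebele": 70.2},
}

base = results["BF16"]
for name, r in results.items():
    shrink = 1 - r["size"] / base["size"]        # fraction of memory saved vs. BF16
    acc_drop = base["belebele"] - r["belebele"]  # Belebele accuracy lost vs. BF16
    print(f"{name}: {shrink:.0%} smaller, {r['tps']:.2f} tok/s, "
          f"-{acc_drop:.1f} Belebele points")
```

For instance, `NF4` cuts the footprint by roughly 72% while losing 3.7 Belebele points and keeping near-`BF16` throughput.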

## License
<span style="font-variant:small-caps;">PersianMind</span> is subject to Meta's [LLaMa2 Community License](https://raw.githubusercontent.com/facebookresearch/llama/main/LICENSE).
It is further licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/), which allows non-commercial use of the model.
Commercial use of this model requires a written agreement from the copyright holders, who are listed as the developers on this page.
If you suspect any violations, please reach out to us.


## Citation

If you find this model helpful, please cite the following paper.

**BibTeX:**
```bibtex
@misc{persianmind,
      title={{PersianMind: A Cross-Lingual Persian-English Large Language Model}},
      author={Rostami, Pedram and Salemi, Ali and Dousti, Mohammad Javad},
      year={2024},
      eprint={2401.06466},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```