leaderboard-pt-pr-bot committed on
Commit 517a5ac · verified · 1 Parent(s): dba8e2a

Adding the Open Portuguese LLM Leaderboard Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard) to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +167 -1
README.md CHANGED
@@ -9,6 +9,153 @@ datasets:
  - adalbertojunior/openHermes_portuguese
  - cnmoro/smoltalk-555k-ptbr
  - cnmoro/RagMixPTBR-Legal-Alpaca-2M
  ---
  
  Qwen2.5-0.5B finetuned for proficiency in Portuguese language and increased intelligence.
@@ -223,4 +370,23 @@ response
  * **Max Length:** 512
  * **Max Context Length** 480
  * **Max Generation Tokens:** 32
- * **Effective Batch Size:** 1.0
  - adalbertojunior/openHermes_portuguese
  - cnmoro/smoltalk-555k-ptbr
  - cnmoro/RagMixPTBR-Legal-Alpaca-2M
+ model-index:
+ - name: Qwen2.5-0.5B-Portuguese-v1
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: ENEM Challenge (No Images)
+       type: eduagarcia/enem_challenge
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 37.86
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=cnmoro/Qwen2.5-0.5B-Portuguese-v1
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BLUEX (No Images)
+       type: eduagarcia-temp/BLUEX_without_images
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 34.63
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=cnmoro/Qwen2.5-0.5B-Portuguese-v1
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: OAB Exams
+       type: eduagarcia/oab_exams
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 33.12
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=cnmoro/Qwen2.5-0.5B-Portuguese-v1
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 RTE
+       type: assin2
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 86.3
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=cnmoro/Qwen2.5-0.5B-Portuguese-v1
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 STS
+       type: eduagarcia/portuguese_benchmark
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: pearson
+       value: 54.3
+       name: pearson
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=cnmoro/Qwen2.5-0.5B-Portuguese-v1
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: FaQuAD NLI
+       type: ruanchaves/faquad-nli
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 65.33
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=cnmoro/Qwen2.5-0.5B-Portuguese-v1
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HateBR Binary
+       type: ruanchaves/hatebr
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 44.06
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=cnmoro/Qwen2.5-0.5B-Portuguese-v1
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: PT Hate Speech Binary
+       type: hate_speech_portuguese
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 55.1
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=cnmoro/Qwen2.5-0.5B-Portuguese-v1
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: tweetSentBR
+       type: eduagarcia/tweetsentbr_fewshot
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 45.96
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=cnmoro/Qwen2.5-0.5B-Portuguese-v1
+       name: Open Portuguese LLM Leaderboard
  ---
  
  Qwen2.5-0.5B finetuned for proficiency in Portuguese language and increased intelligence.
 
  * **Max Length:** 512
  * **Max Context Length** 480
  * **Max Generation Tokens:** 32
+ * **Effective Batch Size:** 1.0
+
+
+ # Open Portuguese LLM Leaderboard Evaluation Results
+
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/cnmoro/Qwen2.5-0.5B-Portuguese-v1) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+
+ |          Metric          |  Value  |
+ |--------------------------|---------|
+ |Average                   |**50.74**|
+ |ENEM Challenge (No Images)|    37.86|
+ |BLUEX (No Images)         |    34.63|
+ |OAB Exams                 |    33.12|
+ |Assin2 RTE                |    86.30|
+ |Assin2 STS                |    54.30|
+ |FaQuAD NLI                |    65.33|
+ |HateBR Binary             |    44.06|
+ |PT Hate Speech Binary     |    55.10|
+ |tweetSentBR               |    45.96|
+
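
As a quick sanity check on the results table, the reported Average can be reproduced from the nine per-task scores. The sketch below assumes the Average is a plain unweighted arithmetic mean; the PR does not say how the leaderboard computes it, so treat that as an assumption. The scores are copied from the table, and the names `scores` and `average` are illustrative only.

```python
# Per-task scores copied from the results table in this PR.
scores = {
    "ENEM Challenge (No Images)": 37.86,
    "BLUEX (No Images)": 34.63,
    "OAB Exams": 33.12,
    "Assin2 RTE": 86.30,
    "Assin2 STS": 54.30,
    "FaQuAD NLI": 65.33,
    "HateBR Binary": 44.06,
    "PT Hate Speech Binary": 55.10,
    "tweetSentBR": 45.96,
}

# Assumption: the Average row is the unweighted mean of all nine tasks,
# rounded to two decimal places.
average = round(sum(scores.values()) / len(scores), 2)

print(f"{average:.2f}")  # 50.74
```

Under that assumption the result matches the **50.74** shown in the Average row.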