LLaMA-zhtw
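This project experiments with Chinese continued pretraining (CP) on Llama 3, training on a total of 800M tokens.
Because the quality of the available Chinese pretraining corpora still has room for improvement, the model after CP does not surpass the original Llama 3. Several Chinese Llama 3 models trained by the open-source community show a similar pattern in our comparison.
On the English side, LLaMA 3 zhtw includes FineWeb in its training mix, which gives it a higher MMLU score than the other Chinese CP models and keeps it on par with the original Llama 3.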
Models | Size | ↑ TMMLU+ (ACC, TC knowledge, 5-shot) | CMMLU (ACC, CN knowledge, 5-shot) | MMLU (ACC, EN knowledge, 5-shot) |
---|---|---|---|---|
Yi-6B | 6B | 49.63 | 75.53 | 65.35 |
Qwen-7B | 7B | 42.84 | 73.1 | 61.00 |
Meta-Llama-3-8B | 8B | 41.97 | 50.8 | 65.17 |
p208p2002/llama-3-zhtw-8B | 8B | 41.84 | 50.6 | 65.31 |
Breeze-7B-Base-v0_1 | 7B | 40.35 | 44.05 | 61.63 |
hfl/llama-3-chinese-8b | 8B | 39.64 | 50.9 | 61.1 |
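
The comparison against the original Llama 3 can be made concrete by subtracting the base model's scores from each CP model's scores. The snippet below is a minimal sketch of that arithmetic using three rows copied from the table; the dictionary layout and the helper name `delta_vs_base` are illustrative assumptions, not part of the project's evaluation code.

```python
# Scores copied from the benchmark table (5-shot accuracy, in percent).
scores = {
    "Meta-Llama-3-8B": {"TMMLU+": 41.97, "CMMLU": 50.8, "MMLU": 65.17},
    "p208p2002/llama-3-zhtw-8B": {"TMMLU+": 41.84, "CMMLU": 50.6, "MMLU": 65.31},
    "hfl/llama-3-chinese-8b": {"TMMLU+": 39.64, "CMMLU": 50.9, "MMLU": 61.1},
}

BASE = "Meta-Llama-3-8B"


def delta_vs_base(model):
    """Return (model score - base score) per benchmark, rounded to 2 decimals.

    Hypothetical helper for illustration only.
    """
    return {
        bench: round(scores[model][bench] - scores[BASE][bench], 2)
        for bench in scores[BASE]
    }


if __name__ == "__main__":
    for name in scores:
        if name != BASE:
            print(name, delta_vs_base(name))
```

Under these numbers, llama-3-zhtw-8B stays within about 0.2 points of the base model on all three benchmarks, while hfl/llama-3-chinese-8b loses roughly 4 points on MMLU.
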

Training data mixture:

Dataset | Lang | Weight |
---|---|---|
FineWeb | en | 0.35 |
Wudao | zh-cn | 0.1 |
C4Tw | zh-tw | 0.1 |
WikiZhTw | zh-tw | 0.15 |
NdltdT10 | zh-tw | 0.1 |
GitHubMarkDown | code | 0.1 |
GitHubPython | code | 0.1 |
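
To illustrate how a weight table like this might be used, the sketch below converts the weights into per-dataset token budgets and draws data sources in proportion to the weights. It assumes each weight is a share of the 800M-token budget, which the card does not state explicitly; `token_budget` and `sample_sources` are hypothetical names, and the actual training pipeline may mix data differently.

```python
import random

TOTAL_TOKENS = 800_000_000  # total CP budget stated in the text

# Mixture weights copied from the table above.
weights = {
    "FineWeb": 0.35,
    "Wudao": 0.1,
    "C4Tw": 0.1,
    "WikiZhTw": 0.15,
    "NdltdT10": 0.1,
    "GitHubMarkDown": 0.1,
    "GitHubPython": 0.1,
}

# Sanity check: the weights should sum to 1.
assert abs(sum(weights.values()) - 1.0) < 1e-9

# Assumption: each weight is a fraction of the total token budget.
token_budget = {name: round(TOTAL_TOKENS * w) for name, w in weights.items()}


def sample_sources(n, seed=0):
    """Draw n dataset names with probability proportional to their weight.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[k] for k in names], k=n)


if __name__ == "__main__":
    for name, tokens in token_budget.items():
        print(f"{name}: {tokens / 1e6:.0f}M tokens")
    print(sample_sources(5))
```

Under this assumption, FineWeb would account for 280M tokens, the three zh-tw sources together for another 280M, and the two code sources for 160M.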