---
license: llama2
language:
- lt
datasets:
- uonlp/CulturaX
---

# Model Card for Lt-Llama2-13b

Lt-Llama2 is a family of pretrained and fine-tuned generative text models for Lithuanian. This is the repository for the **foundational 13B model**. Links to other models can be found at the bottom of this page.

## Model Details

### Model Description

This release by Neurotechnology marks the first open-source initiative dedicated to developing large language models (LLMs) specialized in Lithuanian. The company has created and publicly released a collection of Lithuanian LLMs, available both as foundational models and as instruction-tuned variants.

- **Developed by:** Neurotechnology
- **Language(s):** Lithuanian
- **License:** Llama 2 Community License Agreement
- **Continually pretrained from model:** [Llama-2-13b](https://huggingface.co/meta-llama/Llama-2-13b-hf)

### Model Sources

- **Paper:** https://arxiv.org/abs/2408.12963

## Intended Use

### Intended Use Cases

Lt-Llama2 is designed for research purposes in Lithuanian. The base models can be adapted for a variety of natural language tasks, while the instruction-tuned models are geared towards assistant-like conversational interactions.

### Prohibited Use

Using the model in ways that breach the license, violate any applicable laws or regulations, or involve languages other than Lithuanian.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("neurotechnology/Lt-Llama-2-13b-hf")
model = AutoModelForCausalLM.from_pretrained("neurotechnology/Lt-Llama-2-13b-hf")

# Lithuanian prompt: "Once upon a time there lived a grandfather and a grandmother"
input_text = "Kartą gyveno senelis ir senelė "
input_ids = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(**input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
```

## Benchmarks

| Model             |   Average   |     ARC     |   MMLU    | Winogrande  |  HellaSwag  |   GSM8k   | TruthfulQA |
|-------------------|:-----------:|:-----------:|:---------:|:-----------:|:-----------:|:---------:|:----------:|
| Llama-2-13b       |    30.53    |    28.66    | **31.34** |    50.90    |    28.91    | **5.91**  | **37.48**  |
| *Lt-Llama2-13b*   | ***36.42*** | ***54.50*** |  *26.01*  | ***61.72*** | ***40.61*** |  *0.45*   |  *35.23*   |

## Lt-Llama2 Model Family

| Model                  |                                       Link                                       |
|------------------------|:---------------------------------------------------------------------------------:|
| Lt-Llama2-7b           | [link](https://huggingface.co/neurotechnology/Lt-Llama-2-7b-hf)                  |
| Lt-Llama2-7b-instruct  | [link](https://huggingface.co/neurotechnology/Lt-Llama-2-7b-instruct-hf)         |
| *Lt-Llama2-13b*        | [link](https://huggingface.co/neurotechnology/Lt-Llama-2-13b-hf)                 |
| Lt-Llama2-13b-instruct | [link](https://huggingface.co/neurotechnology/Lt-Llama-2-13b-instruct-hf)        |

## Citation

```bibtex
@misc{nakvosas2024openllama2modellithuanian,
      title={Open Llama2 Model for the Lithuanian Language},
      author={Artūras Nakvosas and Povilas Daniušis and Vytas Mulevičius},
      year={2024},
      eprint={2408.12963},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.12963},
}
```
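
As a supplement to the usage example above, the snippet below sketches how the same model might be loaded in half precision on a GPU to reduce memory use. This is a minimal sketch, not an official recommendation from the model authors: `torch_dtype` and `device_map` are standard `transformers` arguments (the latter requires the `accelerate` package), and the sampling parameters are illustrative assumptions rather than tuned values.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("neurotechnology/Lt-Llama-2-13b-hf")

# Load in float16 to roughly halve memory use; assumes a CUDA device is available.
model = AutoModelForCausalLM.from_pretrained(
    "neurotechnology/Lt-Llama-2-13b-hf",
    torch_dtype=torch.float16,  # use the default dtype when running on CPU
    device_map="auto",          # requires the `accelerate` package
)

input_text = "Kartą gyveno senelis ir senelė "
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Sampling parameters here are illustrative, not tuned by the model authors.
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```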