---
license: mit
datasets:
- SNUH-HARI/MedicalLawQA
language:
- en
- ko
metrics:
- accuracy
- perplexity
base_model:
- UNIVA-Bllossom/DeepSeek-llama3.1-Bllossom-8B
library_name: transformers
tags:
- medical
- unsloth
- trl
- sft
---

# SNUH-HARI/DeepSeek-llama3.1-HARI-8B

## Model Description

**SNUH-HARI/DeepSeek-llama3.1-HARI-8B** is a fine-tuned version of **DeepSeek-llama3.1-Bllossom** with **8 billion parameters**, optimized for **healthcare, legal, and multilingual applications**. Developed by the **Healthcare AI Research Institute (HARI) at Seoul National University Hospital (SNUH)**, the model is trained on **medical-law and pseudonymized clinical data** to support **patient safety** and responsible AI in medicine.

- **Architecture:** Transformer-based large language model (LLM)
- **Languages:** English, Korean
- **Primary Domains:** Healthcare, legal, and general NLP
- **Use Cases:** Medical and legal question answering, clinical decision support, patient safety applications

## Training Details

- **Base Model:** UNIVA-Bllossom/DeepSeek-llama3.1-Bllossom-8B
- **Fine-Tuning Datasets:**
  - **MedicalLawQA**, curated from [Korea Legislation Research Institute](https://elaw.klri.re.kr/eng_service/main.do) data using GPT-4o-mini
  - **SNUH pseudonymized clinical notes** for real-world medical knowledge
- **Optimization:** Mixed precision (FP16) for efficiency
- **Compute Resources:** High-performance GPUs (e.g., NVIDIA H100 clusters)

## Intended Use

This model is designed for **research, healthcare AI, and legal AI applications**. It is particularly suitable for:

- **Medical and legal question answering**
- **Clinical decision support**
- **Healthcare policy and compliance**

## Limitations & Ethical Considerations

- **Not a replacement for medical professionals:** Outputs should be validated by qualified experts.
- **Potential biases:** Legal and medical knowledge is jurisdiction-specific; users should verify regional applicability.
- **Privacy compliance:** No personally identifiable information was used in training.

## Evaluation & Benchmarks

- **Perplexity:** TBD
- **Accuracy on legal-medical QA tasks:** TBD
- **Comparison with baseline models (e.g., Med-PaLM, GPT-4-turbo):** TBD

## How to Use

You can use the model via **Hugging Face Transformers**:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "SNUH-HARI/DeepSeek-llama3.1-HARI-8B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_text = "What are the legal requirements for prescribing narcotics in South Korea?"
# Pass input_ids and attention_mask together to avoid generation warnings.
inputs = tokenizer(input_text, return_tensors="pt")

output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## License

This model is released under the **MIT License**.

## Citation

If you use this model in your research, please cite:

```
@misc{SNUH-HARI-DeepSeek-llama3.1-HARI-8B,
  title={SNUH-HARI/DeepSeek-llama3.1-HARI-8B},
  author={Healthcare AI Research Institute (HARI) at Seoul National University Hospital (SNUH)},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/SNUH-HARI/DeepSeek-llama3.1-HARI-8B}
}
```
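The evaluation section above lists perplexity as a pending metric. As an illustrative aside (this sketch is not part of the model's tooling), perplexity is simply the exponential of the mean per-token negative log-likelihood:

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# A model that assigns every token probability 1/4 has per-token NLL ln(4),
# so its perplexity is 4 regardless of sequence length.
print(perplexity([math.log(4)] * 10))
```

Lower values indicate that the model assigns higher probability to the reference text.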