---
license: mit
datasets:
- SNUH-HARI/MedicalLawQA
language:
- en
- ko
metrics:
- accuracy
- perplexity
base_model:
- UNIVA-Bllossom/DeepSeek-llama3.1-Bllossom-8B
library_name: transformers
tags:
- medical
- unsloth
- trl
- sft
---

# SNUH-HARI/DeepSeek-llama3.1-HARI-8B

## Model Description

**SNUH-HARI/DeepSeek-llama3.1-HARI-8B** is a fine-tuned version of **DeepSeek-llama3.1-Bllossom** with **8 billion parameters**, optimized for **healthcare, legal, and multilingual applications**. Developed by the **Healthcare AI Research Institute (HARI) at Seoul National University Hospital (SNUH)**, the model is trained on **medical-law and pseudonymized clinical data** to support **patient safety** and responsible AI in medicine.

- **Architecture:** Transformer-based large language model (LLM)
- **Languages:** English, Korean
- **Primary Domains:** Healthcare, legal, and general NLP
- **Use Cases:** Medical and legal question answering, clinical decision support, patient safety applications

## Training Details

- **Base Model:** UNIVA-Bllossom/DeepSeek-llama3.1-Bllossom-8B
- **Fine-Tuning Datasets:**
  - **MedicalLawQA**, curated from [Korea Legislation Research Institute](https://elaw.klri.re.kr/eng_service/main.do) data using GPT-4o-mini
  - **SNUH pseudonymized clinical notes** for real-world medical knowledge
- **Optimization:** Mixed precision (FP16) for efficiency
- **Compute Resources:** High-performance GPUs (e.g., NVIDIA H100 clusters)

## Intended Use

This model is designed for **research, healthcare AI, and legal AI applications**. It is particularly suitable for:

- **Medical and legal question answering**
- **Clinical decision support**
- **Healthcare policy and compliance**

## Limitations & Ethical Considerations

- **Not a replacement for medical professionals:** Outputs should be validated by qualified experts.
- **Potential biases:** Legal and medical knowledge is jurisdiction-specific; users should verify regional applicability.
- **Privacy compliance:** No personally identifiable information was used in training.

## Evaluation & Benchmarks

- **Perplexity:** TBD
- **Accuracy on legal-medical QA tasks:** TBD
- **Comparison with baseline models (e.g., Med-PaLM, GPT-4-turbo):** TBD

## How to Use

You can use the model via **Hugging Face Transformers**:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "SNUH-HARI/DeepSeek-llama3.1-HARI-8B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_text = "What are the legal requirements for prescribing narcotics in South Korea?"
# Pass input_ids and attention_mask together to avoid generation warnings.
inputs = tokenizer(input_text, return_tensors="pt")

output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## License

This model is released under the **MIT License**.

## Citation

If you use this model in your research, please cite:

```
@misc{SNUH-HARI-DeepSeek-llama3.1-HARI-8B,
  title={SNUH-HARI/DeepSeek-llama3.1-HARI-8B},
  author={Healthcare AI Research Institute (HARI) at Seoul National University Hospital (SNUH)},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/SNUH-HARI/DeepSeek-llama3.1-HARI-8B}
}
```
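The evaluation section above lists perplexity as a pending metric. As an illustrative aside (this sketch is not part of the model's tooling), perplexity is simply the exponential of the mean per-token negative log-likelihood:

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# A model that assigns every token probability 1/4 has per-token NLL ln(4),
# so its perplexity is 4 regardless of sequence length.
print(perplexity([math.log(4)] * 10))
```

Lower values indicate that the model assigns higher probability to the reference text.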