Daehoya/HyperCLOVAX-SEED-Counseling

+---
+language: ko
+tags:
+  - counseling
+  - korean
+  - chat
+  - empathy
+license: other
+datasets:
+  - custom
+pipeline_tag: text-generation
+---
+# 🧠 HyperCLOVAX-SEED-Counseling
+This model is a fine-tuned version of [`naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B`](https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B), specialized for **empathetic counseling of teenagers**.
+
+---
+## 🔍 Model Overview
+- **Base model**: `naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B`
+- **Fine-tuning objective**: Provide warm, non-judgmental, and emotionally supportive counseling responses tailored to youth clients.
+- **Language**: Korean (한국어)
+
+The model was trained on real and synthetic conversations between counselors and teenage clients. It emphasizes:
+- Empathy and emotional validation (e.g., "그랬구나" ("I see, so that's how it was"), "충분히 이해돼" ("I completely understand"))
+- Open-ended questions for self-exploration
+- Avoiding direct advice or judgment
+- Handling crisis situations with safe referrals
+---
+## 🧑‍⚕️ System Prompt Guideline
+The system message used during training and inference is:
+```
+당신은 공감 능력이 뛰어난 전문 청소년 상담사입니다.
+(중략: 따뜻하고 공감적인 상담 대화 규칙 포함)
+```
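+In English, the first line reads: "You are a professional youth counselor with excellent empathy skills." The parenthetical marks an omitted portion that contains the rules for warm, empathetic counseling dialogue.
+
+Using the same system message at inference helps the assistant keep the friendly, safe, and supportive tone it was trained with.
+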
+---
+## 💬 Inference Example
+```python
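+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+model_id = "Daehoya/HyperCLOVAX-SEED-Counseling"
+model = AutoModelForCausalLM.from_pretrained(model_id)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+# "I feel like I've been drifting apart from my friends lately..."
+prompt = "요즘 친구들과 멀어진 것 같아..."
+messages = [
+    {"role": "system", "content": "당신은 공감 능력이 뛰어난 전문 청소년 상담사입니다."},
+    {"role": "user", "content": prompt},
+]
+# add_generation_prompt=True appends the assistant turn marker so the model answers
+inputs = tokenizer.apply_chat_template(
+    messages,
+    add_generation_prompt=True,
+    return_dict=True,
+    return_tensors="pt",
+).to(model.device)
+
+# temperature only takes effect when sampling is enabled
+output = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.7)
+# Decode only the newly generated tokens, not the echoed prompt
+reply = output[0][inputs["input_ids"].shape[-1]:]
+print(tokenizer.decode(reply, skip_special_tokens=True))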
+```
+---
+## 🧪 Training Details
+- **Optimizer**: AdamW
+- **Batch size**: 3 per device, with `gradient_accumulation_steps=20`
+- **Epochs**: 3
+- **Max input length**: 8192 tokens
+- **Hardware**: 4×A100 GPUs
+- **Precision**: FP16
+- **Framework**: `transformers.Trainer`
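
The batch settings above combine into an effective batch size per optimizer step. The following is a minimal sketch of that arithmetic, assuming all 4 GPUs were used in plain data-parallel mode (the card does not state the parallelism strategy); the function name `effective_batch_size` is hypothetical and not part of any library.

```python
def effective_batch_size(per_device: int, grad_accum_steps: int, num_devices: int) -> int:
    """Number of sequences that contribute to one optimizer step."""
    return per_device * grad_accum_steps * num_devices


# Values taken from the training details list.
# Assumption: all 4 GPUs act as data-parallel replicas.
sequences_per_step = effective_batch_size(per_device=3, grad_accum_steps=20, num_devices=4)
print(sequences_per_step)  # 240

# Upper bound on tokens per optimizer step if every sequence hit the 8192-token limit.
max_tokens_per_step = sequences_per_step * 8192
print(max_tokens_per_step)  # 1966080
```

Under that assumption, each weight update averages gradients over 240 conversations, which is worth keeping in mind when reproducing the run on fewer GPUs: raising `gradient_accumulation_steps` proportionally would keep the same effective batch size.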
+---
+## 📁 Files Included
+- `pytorch_model.bin` or `model-*.safetensors`: Model weights
+- `tokenizer.json`, `tokenizer_config.json`: Tokenizer files
+- `config.json`: Model config
+- `generation_config.json`: Sampling configuration
+- `README.md`: This file
+---
+## 📜 License
+This model is released under the same license as the base model. Please review [NAVER CLOVA's licensing policy](https://huggingface.co/naver-hyperclovax).
+
+---
+## 🙏 Acknowledgements
+Thanks to NAVER CLOVA for the base model and to the community for its ongoing contributions to mental health AI.