Commit 3a65dd4 by Daeho · 1 parent: e4539aa

Add model card for counseling model

Files changed (1)
  1. README.md +102 -0
README.md ADDED
@@ -0,0 +1,102 @@
+ ---
+ language: ko
+ tags:
+ - counseling
+ - korean
+ - chat
+ - empathy
+ license: other
+ datasets:
+ - custom
+ pipeline_tag: text-generation
+ ---
+
+ # 🧠 HyperCLOVAX-SEED-Counseling
+
+ This model is a fine-tuned version of [`naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B`](https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B), specialized for **empathetic counseling for teenagers**.
+
+ ---
+
+ ## 🔍 Model Overview
+
+ - **Base model**: `naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B`
+ - **Fine-tuning objective**: Provide warm, non-judgmental, and emotionally supportive counseling responses tailored to youth clients.
+ - **Language**: Korean (한국어)
+
+ The model was trained on real and synthetic conversations between counselors and teenage clients. It emphasizes:
+
+ - Empathy and emotional validation (e.g., "그랬구나", roughly "so that's how it was", and "충분히 이해돼", roughly "that makes complete sense")
+ - Open-ended questions for self-exploration
+ - Avoiding direct advice or judgment
+ - Handling crisis situations with safe referrals
+
+ ---
+
+ ## 🧑‍⚕️ System Prompt Guideline
+
+ The system message used during training and inference is:
+
+ ```
+ 당신은 공감 능력이 뛰어난 전문 청소년 상담사입니다.
+ (중략: 따뜻하고 공감적인 상담 대화 규칙 포함)
+ ```
+
+ In English: "You are a professional youth counselor with excellent empathy. (Omitted: rules for warm, empathetic counseling conversations.)"
+
+ Using the same prompt at inference keeps the assistant close to the friendly, safe, and supportive tone it was trained on.
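Because the model saw this system message during fine-tuning, it makes sense to place it first in every conversation. The following is a minimal sketch of assembling a multi-turn message list in the role/content chat format. `SYSTEM_PROMPT` and `build_messages` are hypothetical names, not part of this repository, and only the first sentence of the prompt is used because the remaining rules are elided in this card.

```python
from typing import Dict, List, Tuple

# First sentence of the system prompt quoted in this card
# ("You are a professional youth counselor with excellent empathy.").
# The remaining rules are elided in the card, so they are not reproduced here.
SYSTEM_PROMPT = "당신은 공감 능력이 뛰어난 전문 청소년 상담사입니다."


def build_messages(
    history: List[Tuple[str, str]],
    user_text: str,
    system_prompt: str = SYSTEM_PROMPT,
) -> List[Dict[str, str]]:
    """Return chat messages: system prompt, prior (user, assistant) turns, new user turn."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_turn, assistant_turn in history:
        messages.append({"role": "user", "content": user_turn})
        messages.append({"role": "assistant", "content": assistant_turn})
    messages.append({"role": "user", "content": user_text})
    return messages


# First turn of a new conversation: "I feel like I've drifted apart from my friends lately..."
messages = build_messages([], "요즘 친구들과 멀어진 것 같아...")
print([m["role"] for m in messages])  # → ['system', 'user']
```

The resulting list can be passed to `tokenizer.apply_chat_template`; appending each generated reply to `history` keeps earlier turns in context.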
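+
+ ---
+
+ ## 💬 Inference Example
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "Daehoya/HyperCLOVAX-SEED-Counseling"
+ model = AutoModelForCausalLM.from_pretrained(model_id)
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+ # User turn: "I feel like I've drifted apart from my friends lately..."
+ prompt = "요즘 친구들과 멀어진 것 같아..."
+
+ messages = [
+     # System turn: "You are a professional youth counselor with excellent empathy."
+     {"role": "system", "content": "당신은 공감 능력이 뛰어난 전문 청소년 상담사입니다."},
+     {"role": "user", "content": prompt},
+ ]
+
+ # add_generation_prompt=True appends the assistant header so the model
+ # answers instead of continuing the user turn.
+ inputs = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_dict=True,
+     return_tensors="pt",
+ ).to(model.device)
+
+ # temperature only takes effect when sampling is enabled.
+ output = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.7)
+
+ # Decode only the newly generated tokens, not the echoed prompt.
+ new_tokens = output[0][inputs["input_ids"].shape[-1]:]
+ print(tokenizer.decode(new_tokens, skip_special_tokens=True))
+ ```
+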
+ ---
+
+ ## 🧪 Training Details
+
+ - **Optimizer**: AdamW
+ - **Batch size**: 3 per device (gradient_accumulation_steps=20)
+ - **Epochs**: 3
+ - **Max input length**: up to 8192 tokens
+ - **Hardware**: 4×A100 GPUs
+ - **Precision**: FP16
+ - **Framework**: `transformers.Trainer`
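The list gives the per-device batch size and the accumulation steps but not the resulting effective batch size. The sketch below works out that arithmetic under one assumption the card does not confirm: plain data-parallel training in which each of the 4 GPUs processes its own per-device batch. `effective_batch_size` and `TRAINING_KWARGS` are hypothetical names. The dictionary only illustrates how the listed settings might map onto `transformers.TrainingArguments` keyword arguments, and it is not the author's actual configuration.

```python
def effective_batch_size(per_device: int, grad_accum_steps: int, num_devices: int) -> int:
    """Number of sequences contributing to one optimizer step under data parallelism."""
    return per_device * grad_accum_steps * num_devices


# Values taken from the Training Details list; using all 4 GPUs as
# data-parallel workers is an assumption.
print(effective_batch_size(per_device=3, grad_accum_steps=20, num_devices=4))  # → 240

# Hypothetical mapping of the listed settings onto TrainingArguments keywords.
TRAINING_KWARGS = {
    "per_device_train_batch_size": 3,
    "gradient_accumulation_steps": 20,
    "num_train_epochs": 3,
    "fp16": True,            # "Precision: FP16"
    "optim": "adamw_torch",  # "Optimizer: AdamW"; the exact AdamW variant is an assumption
}
```

Under that assumption, each optimizer step averages gradients over 240 sequences, which is the number to keep in mind when comparing learning rates against other fine-tuning runs.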
+
+ ---
+
+ ## 📁 Files Included
+
+ - `pytorch_model.bin` or `model-*.safetensors`: Model weights
+ - `tokenizer.json`, `tokenizer_config.json`: Tokenizer files
+ - `config.json`: Model config
+ - `generation_config.json`: Sampling configuration
+ - `README.md`: This file
+
+ ---
+
+ ## 📜 License
+
+ This model is released under the same license as the base model. Please review [NAVER CLOVA's licensing policy](https://huggingface.co/naver-hyperclovax).
+
+ ---
+
+ ## 🙏 Acknowledgements
+
+ Thanks to NAVER CLOVA for the base model and the community for ongoing contributions in mental health AI.