Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


note - bnb 8bits
- Model creator: https://huggingface.co/jinee/
- Original model: https://huggingface.co/jinee/note/
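Since this repository hosts the bitsandbytes 8-bit quantization, it can be loaded through the standard `transformers` quantization interface. A minimal sketch — the repo id is a placeholder, and `BitsAndBytesConfig` is the usual transformers entry point for bnb quantization:

~~~python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 8-bit bitsandbytes config, matching the "bnb 8bits" note above
quant_config = BitsAndBytesConfig(load_in_8bit=True)

def load_quantized(repo_id):
    # repo_id is a placeholder for this repository's id on the Hub
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, quantization_config=quant_config, device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    return model, tokenizer
~~~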


Original model description:
---
license: cc-by-nc-sa-4.0
language:
- en
tags:
- medical
---

# NOTE: Notable generation Of patient Text summaries through Efficient approach based on direct preference optimization

The discharge summary (DS) is a crucial document in the patient journey, as it encompasses all events from multiple visits: medications, varied imaging/laboratory tests, surgeries/procedures, and admission/discharge.
Providing a summary of the patient’s progress is crucial, as it significantly influences future care and planning.
Consequently, clinicians face the laborious and resource-intensive task of manually collecting, organizing, and combining all the necessary data for a DS.
Therefore, we propose NOTE, which stands for “Notable generation Of patient Text summaries through an Efficient approach based on direct preference optimization (DPO)”.
NOTE is based on MIMIC-III and summarizes a single hospitalization of a patient. Patient events are sequentially combined and used to generate a DS for each hospitalization.
To demonstrate the practical application of NOTE, we provide web-based demonstration software. In the future, we aim to deploy the software for actual use by clinicians in hospitals.
NOTE can be used to generate various summaries, not only discharge summaries, throughout a patient's journey, thereby alleviating the labor-intensive workload of clinicians and increasing efficiency.

## Model Description

- **Model type:** MistralForCausalLM
- **Language(s) (NLP):** English
- **License:** [CC-BY-NC-SA](https://creativecommons.org/licenses/by-nc-sa/4.0/)
- **Finetuned from model:** [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)

## Model Sources

- **Paper:** [NOTE](https://arxiv.org/abs/2402.11882)
- **Demo:** [NOTE-DEMO](https://huggingface.co/spaces/jinee/note-demo)
## Usage

~~~python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained("jinee/note", load_in_4bit=True, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("jinee/note")
tokenizer.padding_side = 'right'
tokenizer.add_eos_token = True
tokenizer.pad_token = tokenizer.eos_token

instruction = '''
As a doctor, you need to create a discharge summary based on input data.
Never change the dates or numbers in the input data and use them as is. And please follow the format below for your report.
Also, never make up information that is not in the input data, and write a report only with information that can be identified from the input data.

1. Patient information (SUBJECT_ID, HADM_ID, hospitalization and discharge date, hospitalization period, gender, date of birth, age, allergy)
2. Diagnostic information and past history (if applicable)
3. Surgery or procedure information
4. Significant medication administration during hospitalization and discharge medication history
5. Meaningful lab tests during hospitalization
6. Summary of significant text records/notes
7. Discharge outcomes and treatment plan
8. Overall summary of at least 500 characters in lines including the above contents
'''

torch.cuda.empty_cache()

def generation(model, tokenizer, input_data):
    # Build a text-generation pipeline around the quantized model.
    pipe = pipeline('text-generation',
                    model=model,
                    tokenizer=tokenizer,
                    torch_dtype=torch.bfloat16,
                    device_map='auto')

    sequences = pipe(
        f"[INST]{instruction}: {input_data} [/INST]",
        do_sample=True,
        max_new_tokens=1024,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
        num_return_sequences=1,
    )

    # Keep only the model's answer, i.e. everything after the [/INST] marker.
    text = sequences[0]['generated_text']
    start_index = text.find('[/INST]')
    if start_index != -1:
        return text[start_index + len('[/INST]'):]
    return "'[/INST]' marker not found."
~~~
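The final step of `generation` simply slices out everything after the `[/INST]` marker; that post-processing can be sketched and sanity-checked in isolation as a plain string operation:

~~~python
def extract_summary(generated_text):
    # Return only the model's answer, i.e. the text after the [/INST] marker.
    marker = '[/INST]'
    start_index = generated_text.find(marker)
    if start_index != -1:
        return generated_text[start_index + len(marker):]
    return "'[/INST]' marker not found."

print(extract_summary("[INST]instruction: events [/INST] Discharge summary: ..."))
# → " Discharge summary: ..."
~~~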


## Dataset

The model has been trained on [MIMIC-III](https://physionet.org/content/mimiciii/1.4/), a comprehensive and freely accessible de-identified medical database.
Access to this database requires a number of steps to obtain permission.


## Training and Hyper-parameters

### List of LoRA config
based on [Parameter-Efficient Fine-Tuning (PEFT)](https://github.com/huggingface/peft)

Parameter | SFT | DPO
:------:| :------:| :------:
r | 16 | 16
lora alpha | 16 | 16
lora dropout | 0.05 | 0.05
target | q, k, v, o, gate | q, k, v, o, gate

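The LoRA settings above translate directly into a PEFT `LoraConfig`. A sketch — the `*_proj` module names are an assumption based on Mistral's conventional layer naming for q, k, v, o, gate:

~~~python
from peft import LoraConfig

# LoRA hyper-parameters from the table above (identical for SFT and DPO).
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj"],
    task_type="CAUSAL_LM",
)
~~~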
### List of Training arguments
based on [Transformer Reinforcement Learning (TRL)](https://github.com/huggingface/trl)

Parameter | SFT | DPO
:------:| :------:| :------:
early stopping patience | 3 | 3
early stopping threshold | 0.0005 | 0.0005
train epochs | 20 | 3
per device train batch size | 4 | 1
per device eval batch size | 8 (default) | 1
optimizer | paged adamw 8bit | paged adamw 8bit
lr scheduler | cosine | cosine
warmup ratio | 0.3 | 0.1
gradient accumulation steps | 2 | 2
evaluation strategy | steps | steps
eval steps | 10 | 5

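The SFT-phase column above maps onto the `transformers` `TrainingArguments` consumed by TRL's trainers. A sketch against the Transformers 4.35 API listed below, with the output directory as a placeholder:

~~~python
from transformers import EarlyStoppingCallback, TrainingArguments

# SFT-phase settings from the table above (swap in the DPO column for the DPO phase).
sft_args = TrainingArguments(
    output_dir="note-sft",  # placeholder
    num_train_epochs=20,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.3,
    evaluation_strategy="steps",
    eval_steps=10,
)

# Early stopping from the table above, passed to the trainer as a callback.
early_stopping = EarlyStoppingCallback(
    early_stopping_patience=3,
    early_stopping_threshold=0.0005,
)
~~~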
### Experimental setup
- **OS**: Ubuntu 20.04 LTS
- **GPUs**: 2 NVIDIA GeForce RTX 3090
- **Python**: 3.8.10
- **PyTorch**: 2.0.1+cu118
- **Transformers**: 4.35.2

## Limitations

The model was limited in character count for comparison with the existing T5 model, but this limit is planned to be expanded in future research.
Additionally, further research on prompt engineering is needed, as the model can produce different results given the same instructions.
Most metrics for evaluating summarization and generation tasks were somewhat challenging to apply to our study; while we attempted to address this through the ChatGPT-4 Assistant API, future research will be based on feedback from clinicians.
## Non-commercial use
These models are available exclusively for research purposes and are not intended for commercial use.

<!-- ## Citation

**BibTeX:**
-->

## INMED DATA
INMED DATA is developing large language models (LLMs) specifically tailored for medical applications. For more information, please visit our website [TBD].