---
base_model: meta-llama/Llama-3.3-70B-Instruct
library_name: peft
license: mit
language:
- en
pipeline_tag: text-classification
tags:
- me
---
# Model Card for Model ID
Social determinants of health (SDoHs) are economic, social and personal
circumstances that affect or influence an individual's health status. This
model performs multilabel classification at the sentence level and was
supervised fine-tuned on the Amended dataset from the paper "Integration of
Large Language Models and Traditional Deep Learning for Social Determinants of
Health Prediction" ([arXiv]).
The model is intended for clinical text and assigns zero or more SDoH labels
to each sentence. Typical users of this model are clinical informatics
physicians or biomedical NLP researchers.
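
Because a sentence can carry zero or more labels, downstream evaluation code often wants a fixed-length vector per sentence. The following is a minimal sketch of that conversion, assuming the six label names used in the usage example of this card; the helper name `to_multi_hot` is hypothetical and not part of the model or any library.

```python
# Label order is an assumption made for this sketch; any fixed order works
# as long as it is used consistently.
LABELS = ['transportation', 'housing', 'relationship',
          'employment', 'support', 'parent']


def to_multi_hot(predicted, labels=LABELS):
    """Return a 0/1 vector with one position per label, in ``labels`` order."""
    unknown = set(predicted) - set(labels)
    if unknown:
        raise ValueError(f'unknown labels: {sorted(unknown)}')
    return [int(label in predicted) for label in labels]


# a sentence with two labels
print(to_multi_hot(['housing', 'transportation']))  # [1, 1, 0, 0, 0, 0]
# a sentence with no SDoH
print(to_multi_hot([]))  # [0, 0, 0, 0, 0, 0]
```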
## Model Details
The model was trained on the training and validation splits of the combined
MIMIC-III and synthetic datasets provided by [Guevara et al.].
- **Developed by:** Paul Landes
- **Funded by [optional]:** Center for Health Equity using Machine Learning &
Artificial Intelligence (CHEMA) postdoctoral funding award.
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model [optional]:** Llama 3.3 70B Instruct
## Usage
To use the model, run inference with the same prompt that was used for
supervised fine-tuning, then parse the generated text. The example below
creates a text-generation pipeline, generates a response from the fine-tuned
model, and parses that response into SDOH labels.
```python
import re
import torch
import transformers
# parse the LLM response
def parse_response(text):
    # try the stricter back-tick pattern first, then a looser fallback
    res_regs = (re.compile(r'(?:.*?`([a-z,` ]{3,}`))', re.DOTALL),
                re.compile(r'.*?[`#-]([a-z, \t\n\r]{3,}?)[`-].*', re.DOTALL))
    matched: str = ''
    for pat in res_regs:
        m: re.Match = pat.match(text)
        if m is not None:
            matched = m.group(1)
            break
    # keep only known labels that occur in the matched span
    return sorted(set(filter(lambda s: matched.find(s) > -1, labels)))
# the prompt and role used to supervised-fine tune the model
_PROMPT: str = """\
Classify sentences for social determinants of health (SDOH).
Definitions SDOHs are given with labels in back ticks:
* `housing`: The status of a patient’s housing is a critical SDOH, known to affect the outcome of treatment.
* `transportation`: This SDOH pertains to a patient’s inability to get to/from their healthcare visits.
* `relationship`: Whether or not a patient is in a partnered relationship is an abundant SDOH in the clinical notes.
* `parent`: This SDOH should be used for descriptions of a patient being a parent to at least one child who is a minor (under the age of 18 years old).
* `employment`: This SDOH pertains to expressions of a patient’s employment status. A sentence should be annotated as an Employment Status SDOH if it expresses if the patient is employed (a paid job), unemployed, retired, or a current student.
* `support`: This SDOH is a sentence describes a patient that is actively receiving care support, such as emotional, health, financial support. This support comes from family and friends but not health care professionals.
* `-`: If no SDOH is found.
Classify sentences for social determinants of health (SDOH) as a list labels in three back ticks. The sentence can be a member of multiple classes so output the labels that are mostly likely to be present.
### Sentence: {sent}
### SDOH labels:"""
role = 'You are a social determinants of health (SDOH) classifier.'
# output classes
labels = 'transportation housing relationship employment support parent'.split()
# example sentence
sent = 'Pt is homeless and has no car and has no parents or support'
# create a pipeline for inferencing
pipeline = transformers.pipeline(
    'text-generation',
    model='plandes/sdoh-llama-3-3-70b',
    model_kwargs={'torch_dtype': torch.bfloat16},
    device_map='auto')
# prompt used by the chat template
messages = [
    {'role': 'system', 'content': role},
    {'role': 'user', 'content': _PROMPT.format(sent=sent)}]
# run inference with the LLM
outputs = pipeline(
    messages,
    max_new_tokens=512,
    eos_token_id=[
        pipeline.tokenizer.eos_token_id,
        pipeline.tokenizer.convert_tokens_to_ids('<|eot_id|>'),
    ],
    pad_token_id=pipeline.tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.01)
# print the textual LLM output
output = outputs[0]['generated_text'][-1]['content']
print('model response:', output)
# print the parsed labels from the LLM output
print('labels:', parse_response(output))
```
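
The parsing step can be checked without loading the 70B model. The sketch below re-implements the same two regular expressions in a standalone function and runs it on hand-written strings. The function name `parse_labels` and the sample responses are assumptions made for illustration; they are not actual model output.

```python
import re

# output classes, as listed in the usage example
LABELS = 'transportation housing relationship employment support parent'.split()

# same patterns as the usage example: strict back-tick form, then a fallback
_PATTERNS = (
    re.compile(r'(?:.*?`([a-z,` ]{3,}`))', re.DOTALL),
    re.compile(r'.*?[`#-]([a-z, \t\n\r]{3,}?)[`-].*', re.DOTALL))


def parse_labels(text, labels=LABELS):
    """Return the sorted known labels found in the first matching span."""
    matched = ''
    for pat in _PATTERNS:
        m = pat.match(text)
        if m is not None:
            matched = m.group(1)
            break
    return sorted(label for label in labels if label in matched)


# hypothetical responses, written by hand for this sketch
print(parse_labels('```housing, transportation```'))  # ['housing', 'transportation']
print(parse_labels('`-`'))  # []
```

Note that labels are matched by substring search within the captured span, so the parser only returns names from the known label list and ignores any other text the model emits.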
## Citation [optional]
**BibTeX:**
[More Information Needed]
<!-- links -->
[Guevara et al.]: https://www.nature.com/articles/s41746-023-00970-0
[arXiv]: https://arxiv.org/pdf/2505.04655