Model Card for Model ID

Social determinants of health (SDoHs) are economic, social and personal circumstances that affect or influence an individual's health status. This model performs multilabel classification at the sentence level and was supervised fine-tuned on the Amended dataset from the paper "Integration of Large Language Models and Traditional Deep Learning for Social Determinants of Health Prediction" (arXiv).

The model is intended for use on clinical text, where it assigns zero or more SDoH labels to each sentence. Typical users are clinical informatics physicians and biomedical NLP researchers.
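
Because a sentence can carry zero or more labels, downstream code often needs a fixed-width representation of each prediction. The following is a minimal sketch, not part of the released model code, of a multi-hot encoding over the six output classes. The helper name `to_multi_hot` is hypothetical, and the label order is an assumption taken from the `labels` list in the Usage section.

```python
# Output classes, in the order used by the usage example (assumed here).
LABELS = ['transportation', 'housing', 'relationship',
          'employment', 'support', 'parent']


def to_multi_hot(sentence_labels, labels=LABELS):
    """Encode the labels predicted for one sentence as a 0/1 vector.

    An empty input (no SDoH found) yields an all-zero vector.
    """
    present = set(sentence_labels)
    unknown = present - set(labels)
    if unknown:
        # fail loudly rather than silently dropping an unexpected label
        raise ValueError(f'unknown labels: {sorted(unknown)}')
    return [1 if label in present else 0 for label in labels]


print(to_multi_hot(['housing', 'transportation']))  # [1, 1, 0, 0, 0, 0]
print(to_multi_hot([]))                             # [0, 0, 0, 0, 0, 0]
```

Vectors in this form can be compared position by position against gold annotations when scoring multilabel predictions.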

Model Details

The model was trained on the training and validation splits of the combined MIMIC-III and synthetic datasets provided by Guevara et al.

Developed by: Paul Landes
Funded by: Center for Health Equity using Machine Learning & Artificial Intelligence (CHEMA) postdoctoral funding award.
Language(s) (NLP): English
License: MIT
Finetuned from model: Llama 3.3 70B Instruct

Usage

To use the model, run inference with the same prompt that was used for supervised fine-tuning, then parse the generated text. The code below creates a text-generation pipeline, generates a response from the fine-tuned LLM, and parses that response into SDoH labels.

import re
import torch
import transformers


# parse the LLM response into a sorted list of SDOH labels
# (reads the module-level `labels` list defined further down)
def parse_response(text):
    res_regs = (re.compile(r'(?:.*?`([a-z,` ]{3,}`))', re.DOTALL),
                re.compile(r'.*?[`#-]([a-z, \t\n\r]{3,}?)[`-].*', re.DOTALL))
    matched: str = ''
    for pat in res_regs:
        m: re.Match = pat.match(text)
        if m is not None:
            matched = m.group(1)
            break
    return sorted(set(filter(lambda s: matched.find(s) > -1, labels)))


# the prompt and role used to supervised fine-tune the model
_PROMPT: str = """\
Classify sentences for social determinants of health (SDOH).

Definitions SDOHs are given with labels in back ticks:

* `housing`: The status of a patient’s housing is a critical SDOH, known to affect the outcome of treatment.

* `transportation`: This SDOH pertains to a patient’s inability to get to/from their healthcare visits.

* `relationship`: Whether or not a patient is in a partnered relationship is an abundant SDOH in the clinical notes.

* `parent`: This SDOH should be used for descriptions of a patient being a parent to at least one child who is a minor (under the age of 18 years old).

* `employment`: This SDOH pertains to expressions of a patient’s employment status. A sentence should be annotated as an Employment Status SDOH if it expresses if the patient is employed (a paid job), unemployed, retired, or a current student.

* `support`: This SDOH is a sentence describes a patient that is actively receiving care support, such as emotional, health, financial support.  This support comes from family and friends but not health care professionals.

* `-`: If no SDOH is found.

Classify sentences for social determinants of health (SDOH) as a list labels in three back ticks. The sentence can be a member of multiple classes so output the labels that are mostly likely to be present.

### Sentence: {sent}
### SDOH labels:"""
role = 'You are a social determinants of health (SDOH) classifier.'

# output classes
labels = 'transportation housing relationship employment support parent'.split()

# example sentence
sent = 'Pt is homeless and has no car and has no parents or support'

# create a pipeline for inferencing
pipeline = transformers.pipeline(
    'text-generation',
    model='plandes/sdoh-llama-3-3-70b',
    model_kwargs={'torch_dtype': torch.bfloat16},
    device_map='auto')

# prompt used by the chat template
messages = [
    {'role': 'system', 'content': role},
    {'role': 'user', 'content': _PROMPT.format(sent=sent)}]

# run inference with the LLM
outputs = pipeline(
    messages,
    max_new_tokens=512,
    eos_token_id=[
        pipeline.tokenizer.eos_token_id,
        pipeline.tokenizer.convert_tokens_to_ids('<|eot_id|>'),
    ],
    pad_token_id=pipeline.tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.01)

# print the textual LLM output
output = outputs[0]['generated_text'][-1]['content']
print('model response:', output)

# print the parsed labels from the LLM output
print('labels:', parse_response(output))
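
The `parse_response` function above keeps any label that occurs as a substring of the matched text, so a response such as `parents` in back ticks would still yield `parent`. As an alternative for readers who want exact matches only, here is a minimal sketch of a token-based parser. It is not the author's method: the function name `extract_labels` and the assumption that the model always wraps its labels in back ticks are illustrative only.

```python
import re

# Output classes, copied from the usage example above.
LABELS = ('transportation', 'housing', 'relationship',
          'employment', 'support', 'parent')


def extract_labels(text, labels=LABELS):
    """Return the sorted labels that appear as whole tokens inside
    back-tick delimited spans of the model response."""
    # every run of text enclosed by one or more back ticks on each side
    spans = re.findall(r'`+([^`]*)`+', text)
    tokens = set()
    for span in spans:
        # split the span into lowercase alphabetic tokens
        tokens.update(re.findall(r'[a-z]+', span))
    # keep only tokens that are exactly one of the known labels
    return sorted(tokens & set(labels))


print(extract_labels('```housing, transportation```'))
print(extract_labels('`-`'))
```

The first call prints `['housing', 'transportation']` and the second prints `[]`. Text outside back ticks is ignored, so a response that omits them produces no labels under this sketch.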

Citation

BibTeX:

[More Information Needed]
