Much cleaner API + FR paper + ack
README.md CHANGED
@@ -38,11 +38,11 @@ Clinical Mosaic is a transformer-based language model built on the Mosaic BERT a
 
 ## Model Details
 - **Developed by:** Sifal Klioui, Sana Sellami, and Youssef Trardi (Aix-Marseille Univ, LIS, CNRS, Marseille, France)
-- **Funded by:** PICOMALE project (AMIDEX)
+- **Funded by:** PICOMALE project (AMIDEX), under the direction of the CEDRE
 - **Base Model:** Mosaic BERT
 - **License:** MIMIC Data Use Agreement (requires compliance with original DUA)
 - **Repository:** [PatientTrajectoryForecasting](https://github.com/MostHumble/PatientTrajectoryForecasting)
-- **Paper:** *Patient Trajectory Prediction: Integrating Clinical Notes with Transformers*
+- **Paper:** *Patient Trajectory Prediction: Integrating Clinical Notes with Transformers* [[FR](https://editions-rnti.fr/?inprocid=1002990), [EN: to be added]()]
 
 ## Uses
 
@@ -78,18 +78,11 @@ Install the Hugging Face Transformers library and load the model as follows:
 ### For embeddings generation:
 
 ```python
-
-
-tokenizer =
-
-
-ClincalMosaic = AutoModel.from_pretrained(
-    'Sifal/ClinicalMosaic',
-    config=config,
-    torch_dtype='auto',
-    trust_remote_code=True,
-    device_map="auto"
-)
+# Load model directly
+from transformers import AutoTokenizer, AutoModel
+
+tokenizer = AutoTokenizer.from_pretrained("Sifal/ClinicalMosaic", trust_remote_code=True)
+ClincalMosaic = AutoModel.from_pretrained("Sifal/ClinicalMosaic", trust_remote_code=True)
 
 # Example usage
 clinical_text = "..."
@@ -101,18 +94,12 @@ last_layer_embeddings = ClincalMosaic(**inputs, output_all_encoded_layers=False)
 ### For sequence classification:
 
 ```python
-from transformers import AutoModelForSequenceClassification,
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
 
-tokenizer =
-config = BertConfig.from_pretrained('Sifal/ClinicalMosaic') # the config needs to be passed in
-
-# Set the hidden size and number of labels:
-config.num_labels = 4
-config.hidden_size = 768
+tokenizer = AutoTokenizer.from_pretrained('Sifal/ClinicalMosaic')
 
 ClassifierClincalMosaic = AutoModelForSequenceClassification.from_pretrained(
     'Sifal/ClinicalMosaic',
-    config=config,
     torch_dtype='auto',
     trust_remote_code=True,
     device_map="auto"
@@ -185,13 +172,22 @@ The model demonstrates robust performance on clinical natural language inference
 
 ## Acknowledgments
 
-We would like to thank
+We would like to thank **LIS** (Laboratoire d'Informatique et Systèmes, Aix-Marseille University) for providing the GPU resources necessary for pretraining and for conducting extensive experiments. Additionally, we acknowledge **CEDRE** (CEntre de formation et de soutien aux Données de la REcherche, Programme 2 du projet France 2030 IDeAL) for supporting early-stage experiments and hosting part of the computational infrastructure.
 
 ## Citation
 
 **BibTeX:**
 
-
+```bibtex
+@article{RNTI/papers/1002990,
+  author  = {Sifal Klioui and Sana Sellami and Youssef Trardi},
+  title   = {Prédiction de la trajectoire du patient : Intégration des notes cliniques aux transformers},
+  journal = {Revue des Nouvelles Technologies de l'Information},
+  volume  = {Extraction et Gestion des Connaissances, RNTI-E-41},
+  year    = {2025},
+  pages   = {135-146}
+}
+```
 
 ## More Information