Update README.md
README.md
CHANGED
@@ -19,7 +19,7 @@ datasets:
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-This model is part of [this](https://
+This model is part of [this](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G) publication. It is used for translating chemical synthesis procedures given in natural language (en) to "action graphs", i.e., a simple markup language listing synthesis actions from a pre-defined controlled vocabulary along with the process parameters.
 
 ## Model Details
 
@@ -29,7 +29,7 @@ The model was fine-tuned on a dataset containing chemical synthesis procedures f
 
 
 - **Developed by:** Bastian Ruehle
-- **Funded by:** [Federal Institute for Materials Research and Testing (BAM)](www.bam.de)
+- **Funded by:** [Federal Institute for Materials Research and Testing (BAM)](https://www.bam.de)
 - **Model type:** LED (Longformer Encoder-Decoder)
 - **Language(s) (NLP):** en
 - **License:** [MIT](https://opensource.org/license/mit)
@@ -40,11 +40,11 @@ The model was fine-tuned on a dataset containing chemical synthesis procedures f
 <!-- Provide the basic links for the model. -->
 
 - **Repository:** The repository accompanying this model can be found [here](https://github.com/BAMresearch/MAPz_at_BAM/tree/main/Minerva-Workflow-Generator)
-- **Paper:** The papers accompanying this model can be found [here](https://
+- **Paper:** The papers accompanying this model can be found [here](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G) and [here](https://pubs.acs.org/doi/full/10.1021/acsnano.4c17504)
 
 ## Uses
 
-The model is integrated into a [node editor app](https://
+The model is integrated into a [node editor app](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G) for generating workflows from synthesis procedures given in natural language for the Self-Driving Lab platform [Minerva](https://pubs.acs.org/doi/full/10.1021/acsnano.4c17504).
 
 ### Direct Use
 
@@ -56,7 +56,7 @@ Even though it is not the intended way of using the model, it can be used "stand
 
 <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
 
-The model was intended to be used with the [node editor app](https://
+The model was intended to be used with the [node editor app](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G) for the Self-Driving Lab platform [Minerva](https://pubs.acs.org/doi/full/10.1021/acsnano.4c17504).
 
 ### Out-of-Scope Use
 
@@ -98,9 +98,9 @@ if __name__ == '__main__':
     rawtext = """<Insert your Synthesis Procedure here>"""
 
     # model_id = 'bruehle/BigBirdPegasus_Llama'
-
+    model_id = 'bruehle/LED-Base-16384_Llama'  # or use any of the other models
     # model_id = 'bruehle/BigBirdPegasus_Chemtagger'
-    model_id = 'bruehle/LED-Base-16384_Chemtagger'
+    # model_id = 'bruehle/LED-Base-16384_Chemtagger'
 
     if 'BigBirdPegasus' in model_id:
         max_length = 512
@@ -128,7 +128,7 @@ Models were trained on A100-80GB GPUs for 885’225 steps (5 epochs) on the trai
 
 #### Preprocessing
 
-More information on data pre- and postprocessing can be found [here](https://
+More information on data pre- and postprocessing can be found [here](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G).
 
 
 #### Training Hyperparameters
@@ -145,7 +145,7 @@ More information on data pre- and postprocessing can be found [here](https://che
 
 <!-- This should link to a Dataset Card if possible. -->
 
-Example outputs for experimental procedures from the domains of materials science, organic chemistry, inorganic chemistry, and a patent that were not part of the training or evaluation dataset can be found [here](https://
+Example outputs for experimental procedures from the domains of materials science, organic chemistry, inorganic chemistry, and a patent that were not part of the training or evaluation dataset can be found [here](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G).
 
 ## Technical Specifications
 
@@ -155,7 +155,7 @@ Longformer Encoder-Decoder Model for Text2Text/Seq2Seq Generation.
 
 ### Compute Infrastructure
 
-Trained on HPC GPU nodes of the [Federal Institute for Materials Research and Testing (BAM)](www.bam.de).
+Trained on HPC GPU nodes of the [Federal Institute for Materials Research and Testing (BAM)](https://www.bam.de).
 
 #### Hardware
 
@@ -171,13 +171,13 @@ Python 3.12
 
 **BibTeX:**
 
-@article{Ruehle_2025, title={Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs}, DOI={10.
+@article{Ruehle_2025, title={Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs}, DOI={10.1039/D5DD00063G}, journal={Digital Discovery}, author={Ruehle, Bastian}, year={2025}}
 
 @article{doi:10.1021/acsnano.4c17504, author = {Zaki, Mohammad and Prinz, Carsten and Ruehle, Bastian}, title = {A Self-Driving Lab for Nano- and Advanced Materials Synthesis}, journal = {ACS Nano}, volume = {19}, number = {9}, pages = {9029-9041}, year = {2025}, doi = {10.1021/acsnano.4c17504}, note ={PMID: 39995288}, URL = {https://doi.org/10.1021/acsnano.4c17504}, eprint = {https://doi.org/10.1021/acsnano.4c17504}}
 
 **APA:**
 
-Ruehle, B. (2025). Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs.
+Ruehle, B. (2025). Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs. Digital Discovery. doi:10.1039/D5DD00063G
 
 Zaki, M., Prinz, C. & Ruehle, B. (2025). A Self-Driving Lab for Nano- and Advanced Materials Synthesis. ACS Nano, 19(9), 9029-9041. doi:10.1021/acsnano.4c17504
 
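
Note on the example-script hunk (`@@ -98,9 +98,9 @@`): the diff only shows the default `model_id` switching to the LED Llama variant and the `max_length = 512` branch for BigBirdPegasus models. The rest of the script is not visible here, so the following is only a minimal sketch of how such a selection might be wired up. The helper names, the `default_max_length` placeholder value, and the use of `max_length` as a generation limit are assumptions, not taken from the README; the `transformers` calls are the standard `Auto*` seq2seq classes.

```python
def pick_max_length(model_id: str, default_max_length: int = 1024) -> int:
    """Return a length limit for the given model id.

    Only the 512 limit for BigBirdPegasus models appears in the diff;
    `default_max_length` is a hypothetical placeholder for other models.
    """
    if 'BigBirdPegasus' in model_id:
        return 512
    return default_max_length


def generate_action_graph(rawtext: str, model_id: str) -> str:
    """Translate one synthesis procedure with a seq2seq checkpoint (assumed usage)."""
    # Imported lazily so pick_max_length works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(rawtext, return_tensors='pt', truncation=True)
    # Assumption: the limit is applied to the generated sequence length.
    output_ids = model.generate(**inputs, max_length=pick_max_length(model_id))
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


if __name__ == '__main__':
    # Model ids as listed in the diff; no download happens in this loop.
    for candidate in ('bruehle/LED-Base-16384_Llama', 'bruehle/BigBirdPegasus_Llama'):
        print(candidate, pick_max_length(candidate))
```

Keeping the length choice in a small pure function means switching between the commented-out `model_id` lines does not require touching the generation code.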