bruehle committed
Commit 7b90e4f · verified · 1 Parent(s): a6e0b41

Update README.md

Files changed (1)
  1. README.md +12 -12
README.md CHANGED
@@ -19,7 +19,7 @@ datasets:

  <!-- Provide a quick summary of what the model is/does. -->

- This model is part of [this](https://chemrxiv.org/engage/chemrxiv/article-details/67adc1fc81d2151a0244de56) publication. It is used for translating chemical synthesis procedures given in natural language (en) to "action graphs", i.e., a simple markup language listing synthesis actions from a pre-defined controlled vocabulary along with the process parameters.

  ## Model Details
 
@@ -29,7 +29,7 @@ The model was fine-tuned on a dataset containing chemical synthesis procedures f


  - **Developed by:** Bastian Ruehle
- - **Funded by:** [Federal Institute for Materials Research and Testing (BAM)](www.bam.de)
  - **Model type:** LED (Longformer Encoder-Decoder)
  - **Language(s) (NLP):** en
  - **License:** [MIT](https://opensource.org/license/mit)
@@ -40,11 +40,11 @@ The model was fine-tuned on a dataset containing chemical synthesis procedures f
  <!-- Provide the basic links for the model. -->

  - **Repository:** The repository accompanying this model can be found [here](https://github.com/BAMresearch/MAPz_at_BAM/tree/main/Minerva-Workflow-Generator)
- - **Paper:** The papers accompanying this model can be found [here](https://chemrxiv.org/engage/chemrxiv/article-details/67adc1fc81d2151a0244de56) and [here](https://pubs.acs.org/doi/full/10.1021/acsnano.4c17504)

  ## Uses

- The model is integrated into a [node editor app](https://chemrxiv.org/engage/chemrxiv/article-details/67adc1fc81d2151a0244de56) for generating workflows from synthesis procedures given in natural language for the Self-Driving Lab platform [Minerva](https://pubs.acs.org/doi/full/10.1021/acsnano.4c17504).

  ### Direct Use
 
@@ -56,7 +56,7 @@ Even though it is not the intended way of using the model, it can be used "stand

  <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

- The model was intended to be used with the [node editor app](https://chemrxiv.org/engage/chemrxiv/article-details/67adc1fc81d2151a0244de56) for the Self-Driving Lab platform [Minerva](https://pubs.acs.org/doi/full/10.1021/acsnano.4c17504).

  ### Out-of-Scope Use
 
@@ -98,9 +98,9 @@ if __name__ == '__main__':
  rawtext = """<Insert your Synthesis Procedure here>"""

  # model_id = 'bruehle/BigBirdPegasus_Llama'
- # model_id = 'bruehle/LED-Base-16384_Llama'
  # model_id = 'bruehle/BigBirdPegasus_Chemtagger'
- model_id = 'bruehle/LED-Base-16384_Chemtagger' # or use any of the other models

  if 'BigBirdPegasus' in model_id:
      max_length = 512
@@ -128,7 +128,7 @@ Models were trained on A100-80GB GPUs for 885’225 steps (5 epochs) on the trai

  #### Preprocessing

- More information on data pre- and postprocessing can be found [here](https://chemrxiv.org/engage/chemrxiv/article-details/67adc1fc81d2151a0244de56).


  #### Training Hyperparameters
@@ -145,7 +145,7 @@ More information on data pre- and postprocessing can be found [here](https://che

  <!-- This should link to a Dataset Card if possible. -->

- Example outputs for experimental procedures from the domains of materials science, organic chemistry, inorganic chemistry, and a patent that were not part of the training or evaluation dataset can be found [here](https://chemrxiv.org/engage/chemrxiv/article-details/67adc1fc81d2151a0244de56).

  ## Technical Specifications
 
@@ -155,7 +155,7 @@ Longformer Encoder-Decoder Model for Text2Text/Seq2Seq Generation.

  ### Compute Infrastructure

- Trained on HPC GPU nodes of the [Federal Institute for Materials Research and Testing (BAM)](www.bam.de).

  #### Hardware
 
@@ -171,13 +171,13 @@ Python 3.12

  **BibTeX:**

- @article{Ruehle_2025, title={Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs}, DOI={10.26434/chemrxiv-2025-0p7xx}, journal={ChemRxiv}, author={Ruehle, Bastian}, year={2025}}

  @article{doi:10.1021/acsnano.4c17504, author = {Zaki, Mohammad and Prinz, Carsten and Ruehle, Bastian}, title = {A Self-Driving Lab for Nano- and Advanced Materials Synthesis}, journal = {ACS Nano}, volume = {19}, number = {9}, pages = {9029-9041}, year = {2025}, doi = {10.1021/acsnano.4c17504}, note ={PMID: 39995288}, URL = {https://doi.org/10.1021/acsnano.4c17504}, eprint = {https://doi.org/10.1021/acsnano.4c17504}}

  **APA:**

- Ruehle, B. (2025). Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs. ChemRxiv. doi:10.26434/chemrxiv-2025-0p7xx

  Zaki, M., Prinz, C. & Ruehle, B. (2025). A Self-Driving Lab for Nano- and Advanced Materials Synthesis. ACS Nano, 19(9), 9029-9041. doi:10.1021/acsnano.4c17504
 
 
@@ -19,7 +19,7 @@ datasets:

  <!-- Provide a quick summary of what the model is/does. -->

+ This model is part of [this](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G) publication. It is used for translating chemical synthesis procedures given in natural language (en) to "action graphs", i.e., a simple markup language listing synthesis actions from a pre-defined controlled vocabulary along with the process parameters.

  ## Model Details
 
 
@@ -29,7 +29,7 @@ The model was fine-tuned on a dataset containing chemical synthesis procedures f


  - **Developed by:** Bastian Ruehle
+ - **Funded by:** [Federal Institute for Materials Research and Testing (BAM)](https://www.bam.de)
  - **Model type:** LED (Longformer Encoder-Decoder)
  - **Language(s) (NLP):** en
  - **License:** [MIT](https://opensource.org/license/mit)
 
@@ -40,11 +40,11 @@ The model was fine-tuned on a dataset containing chemical synthesis procedures f
  <!-- Provide the basic links for the model. -->

  - **Repository:** The repository accompanying this model can be found [here](https://github.com/BAMresearch/MAPz_at_BAM/tree/main/Minerva-Workflow-Generator)
+ - **Paper:** The papers accompanying this model can be found [here](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G) and [here](https://pubs.acs.org/doi/full/10.1021/acsnano.4c17504)

  ## Uses

+ The model is integrated into a [node editor app](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G) for generating workflows from synthesis procedures given in natural language for the Self-Driving Lab platform [Minerva](https://pubs.acs.org/doi/full/10.1021/acsnano.4c17504).

  ### Direct Use
 
 
@@ -56,7 +56,7 @@ Even though it is not the intended way of using the model, it can be used "stand

  <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

+ The model was intended to be used with the [node editor app](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G) for the Self-Driving Lab platform [Minerva](https://pubs.acs.org/doi/full/10.1021/acsnano.4c17504).

  ### Out-of-Scope Use
 
 
@@ -98,9 +98,9 @@ if __name__ == '__main__':
  rawtext = """<Insert your Synthesis Procedure here>"""

  # model_id = 'bruehle/BigBirdPegasus_Llama'
+ model_id = 'bruehle/LED-Base-16384_Llama' # or use any of the other models
  # model_id = 'bruehle/BigBirdPegasus_Chemtagger'
+ # model_id = 'bruehle/LED-Base-16384_Chemtagger'

  if 'BigBirdPegasus' in model_id:
      max_length = 512
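
The snippet in this hunk picks a checkpoint by commenting model IDs in and out, then branches on the name to set `max_length`. A minimal sketch of the same selection as a function might look like the following. The helper name `generation_settings` and its `other_max_length` parameter are hypothetical and not part of the README. Only the 512 limit for the BigBirdPegasus checkpoints comes from the diff; the branch for the LED checkpoints is not shown in this hunk, so its value is left to the caller.

```python
# Model IDs listed in the README usage snippet.
KNOWN_MODEL_IDS = (
    'bruehle/BigBirdPegasus_Llama',
    'bruehle/LED-Base-16384_Llama',
    'bruehle/BigBirdPegasus_Chemtagger',
    'bruehle/LED-Base-16384_Chemtagger',
)


def generation_settings(model_id, other_max_length):
    """Return the model ID together with the max_length to use for it.

    The 512 limit for BigBirdPegasus is taken from the README snippet.
    `other_max_length` is a caller-supplied placeholder (an assumption),
    because the branch for the LED models is not visible in the diff.
    """
    if model_id not in KNOWN_MODEL_IDS:
        raise ValueError(f"Unknown model ID: {model_id!r}")
    if 'BigBirdPegasus' in model_id:
        max_length = 512
    else:
        max_length = other_max_length
    return {'model_id': model_id, 'max_length': max_length}


# 1024 is an illustrative value only, not the setting used in the README.
settings = generation_settings('bruehle/LED-Base-16384_Llama', other_max_length=1024)
print(settings)
```

Validating the ID against the known list catches typos before any download is attempted, which the comment-toggling approach in the README does not do.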
 
@@ -128,7 +128,7 @@ Models were trained on A100-80GB GPUs for 885’225 steps (5 epochs) on the trai

  #### Preprocessing

+ More information on data pre- and postprocessing can be found [here](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G).


  #### Training Hyperparameters
 
@@ -145,7 +145,7 @@ More information on data pre- and postprocessing can be found [here](https://che

  <!-- This should link to a Dataset Card if possible. -->

+ Example outputs for experimental procedures from the domains of materials science, organic chemistry, inorganic chemistry, and a patent that were not part of the training or evaluation dataset can be found [here](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G).

  ## Technical Specifications
 
 
@@ -155,7 +155,7 @@ Longformer Encoder-Decoder Model for Text2Text/Seq2Seq Generation.

  ### Compute Infrastructure

+ Trained on HPC GPU nodes of the [Federal Institute for Materials Research and Testing (BAM)](https://www.bam.de).

  #### Hardware
 
 
@@ -171,13 +171,13 @@ Python 3.12

  **BibTeX:**

+ @article{Ruehle_2025, title={Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs}, DOI={10.1039/D5DD00063G}, journal={Digital Discovery}, author={Ruehle, Bastian}, year={2025}}

  @article{doi:10.1021/acsnano.4c17504, author = {Zaki, Mohammad and Prinz, Carsten and Ruehle, Bastian}, title = {A Self-Driving Lab for Nano- and Advanced Materials Synthesis}, journal = {ACS Nano}, volume = {19}, number = {9}, pages = {9029-9041}, year = {2025}, doi = {10.1021/acsnano.4c17504}, note ={PMID: 39995288}, URL = {https://doi.org/10.1021/acsnano.4c17504}, eprint = {https://doi.org/10.1021/acsnano.4c17504}}

  **APA:**

+ Ruehle, B. (2025). Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs. Digital Discovery. doi:10.1039/D5DD00063G

  Zaki, M., Prinz, C. & Ruehle, B. (2025). A Self-Driving Lab for Nano- and Advanced Materials Synthesis. ACS Nano, 19(9), 9029-9041. doi:10.1021/acsnano.4c17504