DongkiKim committed on
Commit 007076b · verified · 1 Parent(s): 0620c79

Update README.md

Files changed (1):
  1. README.md +52 -3

README.md CHANGED
@@ -1,3 +1,52 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ datasets:
+ - DongkiKim/Mol-LLaMA-Instruct
+ language:
+ - en
+ base_model:
+ - meta-llama/Llama-3.1-8B-Instruct
+ tags:
+ - biology
+ - chemistry
+ - medical
+ ---
+
+ # Mol-Llama-3-8B-Instruct
+ [[Project Page](https://mol-llama.github.io/)] [[Paper](https://arxiv.org/abs/2502.13449)] [[GitHub](https://github.com/DongkiKim95/Mol-LLaMA)]
+
+ This repo contains the weights of Mol-LLaMA, including the LoRA weights and projectors, based on [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).
+
+ ## Architecture
+ ![image.png](architecture.png)
+ 1) Molecular encoders: Pretrained 2D encoder ([MoleculeSTM](https://huggingface.co/chao1224/MoleculeSTM)) and 3D encoder ([Uni-Mol](https://huggingface.co/dptech/Uni-Mol-Models))
+ 2) Blending Module: Combines complementary information from the 2D and 3D encoders via cross-attention (see the sketch below)
+ 3) Q-Former: Embeds molecular representations into query tokens, based on [SciBERT](https://huggingface.co/allenai/scibert_scivocab_uncased)
+ 4) LoRA: Adapters for fine-tuning the LLM
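+
+ To make the blending step concrete, here is a minimal, self-contained sketch of how cross-attention can fuse the two views. The module name, dimensions, and residual-plus-norm layout are illustrative assumptions, not the repo's actual implementation:
+
+ ```python
+ # Hypothetical sketch: 2D tokens attend to 3D tokens (names/dims assumed).
+ import torch
+ import torch.nn as nn
+
+ class BlendingModule(nn.Module):
+     def __init__(self, dim=512, num_heads=8):
+         super().__init__()
+         self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
+         self.norm = nn.LayerNorm(dim)
+
+     def forward(self, feats_2d, feats_3d):
+         # Queries come from the 2D view; keys/values come from the 3D view,
+         # so each 2D token is enriched with geometric context.
+         blended, _ = self.cross_attn(feats_2d, feats_3d, feats_3d)
+         return self.norm(feats_2d + blended)  # residual connection + norm
+
+ feats_2d = torch.randn(1, 32, 512)  # e.g. tokens from the 2D encoder
+ feats_3d = torch.randn(1, 32, 512)  # e.g. tokens from the 3D encoder
+ out = BlendingModule()(feats_2d, feats_3d)  # -> shape (1, 32, 512)
+ ```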
+
+ ## Training Dataset
+
+ Mol-LLaMA is trained on [Mol-LLaMA-Instruct](https://huggingface.co/datasets/DongkiKim/Mol-LLaMA-Instruct) to learn the fundamental characteristics of molecules, along with reasoning ability and explainability.
+
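+ A quick way to inspect the data with the `datasets` library (the split name and record fields below are assumptions; check the dataset card for the actual schema):
+
+ ```python
+ from datasets import load_dataset
+
+ ds = load_dataset("DongkiKim/Mol-LLaMA-Instruct")
+ print(ds)              # available splits and sizes
+ print(ds["train"][0])  # one instruction example (assumes a "train" split)
+ ```
+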
+ ## How to Use
+
+ Please check out [the example inference code](https://github.com/DongkiKim95/Mol-LLaMA/blob/master/playground.py) in the GitHub repo.
+
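+ If you only want the language backbone plus the LoRA adapter from this repo, a minimal sketch using `transformers` and `peft` might look like the following. The repo id and the assumption that the adapter is stored in PEFT format are unverified, and the molecular encoders, Blending Module, and Q-Former are not loaded this way; use playground.py for full inference:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+
+ base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
+ tokenizer = AutoTokenizer.from_pretrained(base_id)
+ base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
+
+ # Assumes this repo's LoRA weights are in PEFT format (unverified).
+ model = PeftModel.from_pretrained(base, "DongkiKim/Mol-Llama-3-8B-Instruct")
+ model.eval()
+ ```
+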
+ ## Citation
+
+ If you find our model useful, please consider citing our work.
+ ```bibtex
+ @misc{kim2025molllama,
+       title={Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model},
+       author={Dongki Kim and Wonbin Lee and Sung Ju Hwang},
+       year={2025},
+       eprint={2502.13449},
+       archivePrefix={arXiv},
+       primaryClass={cs.LG}
+ }
+ ```
+
+ ## Acknowledgements
+
+ We appreciate [LLaMA](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), [3D-MoLM](https://huggingface.co/Sihangli/3D-MoLM), [MoleculeSTM](https://huggingface.co/chao1224/MoleculeSTM), [Uni-Mol](https://huggingface.co/dptech/Uni-Mol-Models), and [SciBERT](https://huggingface.co/allenai/scibert_scivocab_uncased) for their open-source contributions.