---
library_name: peft
license: mit
datasets:
- multi_nli
language:
- en
metrics:
- spearmanr
---
 
# AnglE📐: Angle-optimized Text Embeddings

> It is Angle 📐, not Angel 👼.

🔥 A New SOTA Model for Semantic Textual Similarity!

GitHub: https://github.com/SeanLee97/AnglE

<a href="https://arxiv.org/abs/2309.12871">
    <img src="https://img.shields.io/badge/Arxiv-2309.12871-yellow.svg?style=flat-square" alt="https://arxiv.org/abs/2309.12871" />
</a>

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/angle-optimized-text-embeddings/semantic-textual-similarity-on-sick-r-1)](https://paperswithcode.com/sota/semantic-textual-similarity-on-sick-r-1?p=angle-optimized-text-embeddings)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/angle-optimized-text-embeddings/semantic-textual-similarity-on-sts16)](https://paperswithcode.com/sota/semantic-textual-similarity-on-sts16?p=angle-optimized-text-embeddings)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/angle-optimized-text-embeddings/semantic-textual-similarity-on-sts15)](https://paperswithcode.com/sota/semantic-textual-similarity-on-sts15?p=angle-optimized-text-embeddings)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/angle-optimized-text-embeddings/semantic-textual-similarity-on-sts14)](https://paperswithcode.com/sota/semantic-textual-similarity-on-sts14?p=angle-optimized-text-embeddings)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/angle-optimized-text-embeddings/semantic-textual-similarity-on-sts13)](https://paperswithcode.com/sota/semantic-textual-similarity-on-sts13?p=angle-optimized-text-embeddings)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/angle-optimized-text-embeddings/semantic-textual-similarity-on-sts12)](https://paperswithcode.com/sota/semantic-textual-similarity-on-sts12?p=angle-optimized-text-embeddings)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/angle-optimized-text-embeddings/semantic-textual-similarity-on-sts-benchmark)](https://paperswithcode.com/sota/semantic-textual-similarity-on-sts-benchmark?p=angle-optimized-text-embeddings)

**📝 Training Details:**

We fine-tuned AnglE-LLaMA on 4 RTX 3090 Ti GPUs (24GB each); the training script is as follows:

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 --master_port=1234 train_angle.py \
--task NLI-STS --save_dir ckpts/NLI-STS-angle-llama-7b \
--w2 35 --learning_rate 2e-4 --maxlen 45 \
--lora_r 32 --lora_alpha 32 --lora_dropout 0.1 \
--save_steps 200 --batch_size 160 --seed 42 --do_eval 0 --load_kbit 4 --gradient_accumulation_steps 4 --epochs 1
```
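
For reference, the `--lora_r`, `--lora_alpha`, and `--lora_dropout` flags correspond to a PEFT LoRA configuration roughly like the sketch below. The `target_modules` choice is an assumption (a common setting for LLaMA-family models), since `train_angle.py` is not shown here:

```python
from peft import LoraConfig

# A sketch of the LoRA setup implied by the training flags above.
# NOTE: target_modules is an assumption; q_proj/k_proj/v_proj/o_proj is a
# common choice for LLaMA-family models, but train_angle.py may differ.
lora_config = LoraConfig(
    r=32,              # --lora_r
    lora_alpha=32,     # --lora_alpha
    lora_dropout=0.1,  # --lora_dropout
    target_modules=['q_proj', 'k_proj', 'v_proj', 'o_proj'],
    task_type='CAUSAL_LM',
)
```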

The evaluation script is as follows:

```bash
CUDA_VISIBLE_DEVICES=0,1 python eval.py \
    --load_kbit 16 \
    --model_name_or_path NousResearch/Llama-2-7b-hf \
    --lora_weight SeanLee97/angle-llama-7b-nli-20231027
```
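
Results on the STS benchmarks are reported as Spearman correlation (the `spearmanr` metric declared in the frontmatter). As a minimal sketch of how that score relates model outputs to gold labels (using `scipy`, with made-up numbers for illustration):

```python
from scipy.stats import spearmanr

# Made-up values for illustration: cosine similarities predicted by the model
# and the human-annotated gold scores for the same sentence pairs.
predicted = [0.92, 0.31, 0.75, 0.10]
gold = [4.8, 1.2, 3.9, 0.5]

# Spearman correlation is rank-based, so the two scales need not match.
score, _ = spearmanr(predicted, gold)
print(f'Spearman correlation: {score:.4f}')
```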

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig

peft_model_id = 'SeanLee97/angle-llama-7b-nli-20231027'
config = PeftConfig.from_pretrained(peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
# Load the base LLaMA model in bfloat16, then attach the LoRA weights.
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path).bfloat16().cuda()
model = PeftModel.from_pretrained(model, peft_model_id).cuda()

def decorate_text(text: str):
    # Wrap the input in the prompt template used during training;
    # the trailing double quote is part of the template.
    return f'Summarize sentence "{text}" in one word:"'

inputs = 'hello world!'
tok = tokenizer([decorate_text(inputs)], return_tensors='pt')
for k, v in tok.items():
    tok[k] = v.cuda()
# The embedding is the last hidden state of the final token.
vec = model(output_hidden_states=True, **tok).hidden_states[-1][:, -1].float().detach().cpu().numpy()
print(vec)
```
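
The snippet above produces a single sentence embedding. For semantic textual similarity, embed two sentences and compare them; a minimal sketch (the `encode` helper is hypothetical and just wraps the steps above):

```python
import numpy as np

def encode(text: str) -> np.ndarray:
    # Hypothetical helper wrapping the steps shown above.
    tok = tokenizer([decorate_text(text)], return_tensors='pt')
    tok = {k: v.cuda() for k, v in tok.items()}
    hidden = model(output_hidden_states=True, **tok).hidden_states[-1]
    return hidden[:, -1].float().detach().cpu().numpy()[0]

a = encode('The weather is nice today.')
b = encode('It is sunny outside.')
# Cosine similarity between the two embeddings.
print(float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))))
```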