kasadin committed · Commit 0f67cd8 · verified · 1 Parent(s): df2c201

Update README.md

Files changed (1)
  1. README.md +31 -1
README.md CHANGED
@@ -65,7 +65,9 @@ YiXin-Distill-Qwen-72B was benchmarked against multiple models, including QwQ-32
65
 
66
  YiXin-Distill-Qwen-72B demonstrates significant improvements across mathematical reasoning and general knowledge tasks.
67
 
68
- ## Quickstart
69
 
70
  ```python
71
  from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -97,6 +99,34 @@ generated_ids = [
97
  response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
98
  ```
99
 
100
  ## Limitations
101
 
102
  Despite its strong performance, YiXin-Distill-Qwen-72B has certain limitations:
 
65
 
66
  YiXin-Distill-Qwen-72B demonstrates significant improvements across mathematical reasoning and general knowledge tasks.
67
 
68
+ ## How to Run Locally
69
+
70
+ ### Hugging Face Transformers
71
 
72
  ```python
73
  from transformers import AutoModelForCausalLM, AutoTokenizer
 
99
  response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
100
  ```
101
 
102
+ ### vLLM or SGLang
103
+
104
+ For example, you can start an OpenAI-compatible service with [vLLM](https://github.com/vllm-project/vllm):
105
+
106
+ ```shell
107
+ vllm serve YiXin-AILab/YiXin-Distill-Qwen-72B --tensor-parallel-size 4 --max-model-len 32768 --enforce-eager
108
+ ```
109
+
110
+ You can also start a service with [SGLang](https://github.com/sgl-project/sglang):
111
+
112
+ ```bash
113
+ python3 -m sglang.launch_server --model YiXin-AILab/YiXin-Distill-Qwen-72B --trust-remote-code --tp 4 --port 8000
114
+ ```
115
+
116
+ You can then query the chat API with:
117
+
118
+ ```bash
119
+ curl http://localhost:8000/v1/chat/completions \
120
+ -H "Content-Type: application/json" \
121
+ -d '{
122
+ "model": "YiXin-AILab/YiXin-Distill-Qwen-72B",
123
+ "messages": [
124
+ {"role": "system", "content": "You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step."},
125
+ {"role": "user", "content": "8+8=?"}
126
+ ]
127
+ }'
128
+ ```
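The same request can also be sent from Python. The following is a minimal sketch using only the standard library, not part of the committed README. It assumes one of the servers above is listening on `localhost:8000` and returns an OpenAI-style response body; the helper names `build_chat_request` and `ask` and the 600-second timeout are illustrative choices.

```python
import json
import urllib.request

# Assumption: a vLLM or SGLang server started as shown above is
# reachable on localhost:8000 and serves an OpenAI-style chat endpoint.
BASE_URL = "http://localhost:8000/v1"
MODEL = "YiXin-AILab/YiXin-Distill-Qwen-72B"
SYSTEM_PROMPT = (
    "You are a helpful and harmless assistant. "
    "You are Qwen developed by Alibaba. "
    "You should think step-by-step."
)


def build_chat_request(question, base_url=BASE_URL):
    """Build (without sending) the POST request for /chat/completions."""
    payload = {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, timeout=600):
    """Send the request and return the assistant's reply text.

    Assumes the response follows the OpenAI chat-completions layout,
    with the reply under choices[0].message.content.
    """
    request = build_chat_request(question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, `print(ask("8+8=?"))` would print the model's reply. Building the request separately from sending it makes it possible to inspect the payload without a live server.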
129
+
130
  ## Limitations
131
 
132
  Despite its strong performance, YiXin-Distill-Qwen-72B has certain limitations: