Commit eafad19 (verified) · Parent: 99c27b9
syntheticbot committed: Update README.md
Files changed (1): README.md (+24, -10)
README.md CHANGED
@@ -3,13 +3,13 @@ license: apache-2.0
 ---
 
 
-# syntheticbot/Qwen-VL-7B-ocr
+# syntheticbot/ocr-qwen
 
 
 
 ## Introduction
 
-syntheticbot/Qwen-VL-7B-ocr is a fine-tuned model for Optical Character Recognition (OCR) tasks, derived from the base model [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct). This model is engineered for high accuracy in extracting text from images, including documents and scenes containing text.
+syntheticbot/ocr-qwen is a fine-tuned model for Optical Character Recognition (OCR) tasks, derived from the base model [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct). This model is engineered for high accuracy in extracting text from images, including documents and scenes containing text.
 
 #### Key Enhancements for OCR:
 
@@ -41,7 +41,7 @@ pip install git+https://github.com/huggingface/transformers accelerate
 
 ## Quickstart
 
-The following examples illustrate the use of syntheticbot/Qwen-VL-7B-ocr with 🤗 Transformers and `qwen_vl_utils` for OCR applications.
+The following examples illustrate the use of syntheticbot/ocr-qwen with 🤗 Transformers and `qwen_vl_utils` for OCR applications.
 
 ```
 pip install git+https://github.com/huggingface/transformers accelerate
@@ -61,12 +61,12 @@ from qwen_vl_utils import process_vision_info
 import torch
 
 model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
-    "syntheticbot/Qwen-VL-7B-ocr",
+    "syntheticbot/ocr-qwen",
     torch_dtype="auto",
     device_map="auto"
 )
 
-processor = AutoProcessor.from_pretrained("syntheticbot/Qwen-VL-7B-ocr")
+processor = AutoProcessor.from_pretrained("syntheticbot/ocr-qwen")
 
 messages = [
     {
@@ -114,11 +114,11 @@ import torch
 import json
 
 model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
-    "syntheticbot/Qwen-VL-7B-ocr",
+    "syntheticbot/ocr-qwen",
     torch_dtype="auto",
     device_map="auto"
 )
-processor = AutoProcessor.from_pretrained("syntheticbot/Qwen-VL-7B-ocr")
+processor = AutoProcessor.from_pretrained("syntheticbot/ocr-qwen")
 
 
 messages = [
@@ -215,7 +215,7 @@ print("Extracted Texts (Batch):\n", output_texts)
 
 
 ### 🤖 ModelScope
-For users in mainland China, ModelScope is recommended. Use `snapshot_download` for checkpoint management. Adapt model names to `syntheticbot/Qwen-VL-7B-ocr` in ModelScope implementations.
+For users in mainland China, ModelScope is recommended. Use `snapshot_download` for checkpoint management. Adapt model names to `syntheticbot/ocr-qwen` in ModelScope implementations.
 
 
 ### More Usage Tips for OCR
@@ -223,7 +223,21 @@ For users in mainland China, ModelScope is recommended. Use `snapshot_download`
 Input images support local files, URLs, and base64 encoding.
 
 ```python
-messages = [ { "role": "user", "content": [ {"type": "image", "image": "http://path/to/your/document_image.jpg"}, {"type": "text", "text": "Extract the text from this image URL."}, ], }]
+messages = [
+    {
+        "role": "user",
+        "content": [
+            {
+                "type": "image",
+                "image": "http://path/to/your/document_image.jpg"
+            },
+            {
+                "type": "text",
+                "text": "Extract the text from this image URL."
+            },
+        ],
+    }
+]
 ```
 #### Image Resolution for OCR Accuracy
 
@@ -233,7 +247,7 @@ Higher resolution images typically improve OCR accuracy, especially for small te
 min_pixels = 512 * 28 * 28
 max_pixels = 2048 * 28 * 28
 processor = AutoProcessor.from_pretrained(
-    "syntheticbot/Qwen-VL-7B-ocr",
+    "syntheticbot/ocr-qwen",
     min_pixels=min_pixels, max_pixels=max_pixels
 )
 ```
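
Reviewer note: the README's usage tips say input images support local files, URLs, and base64 encoding, but the diff only shows the URL form. A minimal sketch of the other two forms, assuming the Qwen2.5-VL / `qwen_vl_utils` conventions of `file://` URIs for local paths and `data:image;base64,` data URIs for inline payloads (all paths and the image bytes below are placeholders):

```python
import base64

# Placeholder bytes standing in for a real JPEG file you would read from disk.
fake_jpeg_bytes = b"\xff\xd8\xff\xe0fake"
b64 = base64.b64encode(fake_jpeg_bytes).decode("utf-8")

def ocr_message(image_ref: str) -> list:
    """Build the single-turn OCR message structure used in the README."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_ref},
                {"type": "text", "text": "Extract the text from this image."},
            ],
        }
    ]

# Local file: referenced with a file:// URI (a hypothetical path).
messages_local = ocr_message("file:///path/to/document_image.jpg")

# Base64: the image payload inlined as a data URI.
messages_b64 = ocr_message(f"data:image;base64,{b64}")
```

Either `messages` list can then be passed through `processor.apply_chat_template` and `process_vision_info` exactly as in the Quickstart.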
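
On the resolution hunk: the `min_pixels`/`max_pixels` values are area budgets expressed in 28×28 vision patches (512 and 2048 patches, i.e. 401,408 and 1,605,632 pixels). A simplified sketch of the kind of area clamping such limits imply; the processor's actual resizing logic may differ, and `clamp_area` is a hypothetical helper for illustration only:

```python
import math

MIN_PIXELS = 512 * 28 * 28    # 401,408 px  (512 patches of 28x28)
MAX_PIXELS = 2048 * 28 * 28   # 1,605,632 px (2048 patches of 28x28)

def clamp_area(width: int, height: int,
               min_pixels: int = MIN_PIXELS,
               max_pixels: int = MAX_PIXELS) -> tuple:
    """Scale (width, height) so the image area falls inside
    [min_pixels, max_pixels], rounding each side to a multiple of 28.
    Simplified illustration, not the processor's exact algorithm."""
    area = width * height
    if area > max_pixels:
        scale = math.sqrt(max_pixels / area)   # shrink oversized images
    elif area < min_pixels:
        scale = math.sqrt(min_pixels / area)   # upscale tiny images
    else:
        scale = 1.0
    w = max(28, round(width * scale / 28) * 28)
    h = max(28, round(height * scale / 28) * 28)
    return w, h
```

Raising `max_pixels` preserves more detail for small text at the cost of more vision tokens per image; lowering it trades accuracy for speed and memory.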