Commit ca4c80f (verified) by hassenhamdi, parent: a727116

Update README.md

Files changed (1): README.md (+53 -0)

README.md
@@ -15,3 +15,56 @@ tags:
  - Original model: [ibm-granite/granite-vision-3.1-2b-preview](https://huggingface.co/ibm-granite/granite-vision-3.1-2b-preview)
  - precision: 4-bit
 
+ ## Setup
+ - You can run the quantized model with the following steps:
+
+ - Check the requirements of the original model. In particular, verify your Python, CUDA, and transformers versions (a quick check is sketched below).
+
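+ A minimal version check (an optional sketch, not part of the original instructions):
+ ```python
+ # Print the versions the requirements refer to: Python, CUDA (via torch), transformers.
+ import sys
+ import torch
+ import transformers
+
+ print("python:", sys.version.split()[0])
+ print("torch:", torch.__version__, "cuda:", torch.version.cuda, "available:", torch.cuda.is_available())
+ print("transformers:", transformers.__version__)
+ ```
+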
+ - Make sure you have installed the quantization-related packages (the version specifier is quoted so the shell does not treat `>=` as a redirect):
+ ```bash
+ pip install 'bitsandbytes>=0.39.0'
+ pip install --upgrade accelerate transformers
+ ```
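+
+ If the install succeeded, the stack should import cleanly (an optional sanity check, not from the original README):
+ ```python
+ # Confirm the quantization-related packages are importable and print their versions.
+ import bitsandbytes, accelerate, transformers
+ print(bitsandbytes.__version__, accelerate.__version__, transformers.__version__)
+ ```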
+
+ - Load and run the model:
+ ```python
+ from transformers import AutoProcessor, AutoModelForVision2Seq
+ from huggingface_hub import hf_hub_download
+ import torch
+
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+
+ model_path = 'hassenhamdi/granite-vision-3.1-2b-preview-4bit'
+ # The checkpoint is already 4-bit quantized, so let from_pretrained place it
+ # on the target device instead of calling .to(device) afterwards.
+ model = AutoModelForVision2Seq.from_pretrained(model_path, trust_remote_code=True, device_map=device)
+ processor = AutoProcessor.from_pretrained('ibm-granite/granite-vision-3.1-2b-preview')
+
+ # prepare image and text prompt, using the appropriate prompt template;
+ # the example image ships with the original model repo
+ img_path = hf_hub_download(repo_id='ibm-granite/granite-vision-3.1-2b-preview', filename='example.png')
+
+ conversation = [
+     {
+         "role": "user",
+         "content": [
+             {"type": "image", "url": img_path},
+             {"type": "text", "text": "What is the highest scoring model on ChartQA and what is its score?"},
+         ],
+     },
+ ]
+ inputs = processor.apply_chat_template(
+     conversation,
+     add_generation_prompt=True,
+     tokenize=True,
+     return_dict=True,
+     return_tensors="pt"
+ ).to(device)
+
+ # autoregressively complete prompt
+ output = model.generate(**inputs, max_new_tokens=100)
+ print(processor.decode(output[0], skip_special_tokens=True))
+ ```
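+
+ Note: 4-bit bitsandbytes checkpoints generally cannot be moved with `.to()` after loading, which is why the snippet above passes `device_map` to `from_pretrained` and only moves the inputs to `device`.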
+
+ ## Configurations
+ - The configuration info is in `config.json`.
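+
+ To inspect the stored quantization settings (a minimal sketch; it assumes `config.json` carries a `quantization_config` entry, as bitsandbytes-quantized checkpoints typically do):
+ ```python
+ from transformers import AutoConfig
+
+ # Load only the configuration and print its quantization section.
+ config = AutoConfig.from_pretrained('hassenhamdi/granite-vision-3.1-2b-preview-4bit', trust_remote_code=True)
+ print(config.quantization_config)
+ ```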