saishshinde15 committed
Commit 7136b54 · verified · 1 Parent(s): b295001

Update README.md

Files changed (1)
  1. README.md +63 -0
README.md CHANGED
@@ -20,3 +20,66 @@ pipeline_tag: image-text-to-text
  - **Developed by:** saishshinde15
  - **License:** apache-2.0
  - **Finetuned from model:** llama-3.2-11b-vision-instruct
+
+ ## How to use this model
+ - Use Unsloth for faster model downloads and faster inference; you can also use Hugging Face's `transformers` library (a sketch of that route follows the example below).
+
+ ```python
+ from unsloth import FastVisionModel
+ from PIL import Image
+ import requests
+ from transformers import TextStreamer
+
+ # Load the fine-tuned model and tokenizer
+ model, tokenizer = FastVisionModel.from_pretrained(
+     model_name="saishshinde15/VisionAI",  # the fine-tuned model
+     load_in_4bit=False,  # set to False for 16-bit LoRA
+ )
+
+ # Switch the model into inference mode
+ FastVisionModel.for_inference(model)
+
+ # Load the image from a URL
+ url = 'your image url'
+ image = Image.open(requests.get(url, stream=True).raw)
+
+ # Define the instruction and user query
+ instruction = (
+     "You are an expert in answering questions related to the image provided: "
+     "Answer the user's questions accurately by referring to the image."
+ )
+ query = "What is this image about?"
+
+ # Create the chat message structure
+ messages = [
+     {"role": "user", "content": [
+         {"type": "image"},
+         {"type": "text", "text": instruction},
+         {"type": "text", "text": query},
+     ]}
+ ]
+
+ # Build the prompt with the tokenizer's chat template
+ input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
+
+ # Tokenize the image and prompt together
+ inputs = tokenizer(
+     image,
+     input_text,
+     add_special_tokens=False,
+     return_tensors="pt",
+ ).to("cuda")
+
+ # Stream generated tokens to stdout, skipping the prompt
+ text_streamer = TextStreamer(tokenizer, skip_prompt=True)
+
+ # Generate the response
+ _ = model.generate(
+     **inputs,
+     streamer=text_streamer,
+     max_new_tokens=128,
+     use_cache=True,
+     temperature=1.5,
+     min_p=0.1,
+ )
+ ```
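The bullet above also mentions plain Hugging Face `transformers` as an alternative. Below is a minimal sketch of that route. It assumes the repository holds merged 16-bit Llama 3.2 Vision weights, so the standard `MllamaForConditionalGeneration` and `AutoProcessor` classes apply; if the repo only stores LoRA adapters, you would additionally need `peft` on top of the base model. Neither is confirmed by the README.

```python
# Hedged sketch: loading the same checkpoint with plain transformers.
# Assumes a standard (merged) Llama 3.2 Vision checkpoint; not confirmed
# by this README.
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "saishshinde15/VisionAI"
model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

url = "your image url"  # placeholder, as in the Unsloth example
image = Image.open(requests.get(url, stream=True).raw)

messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "What is this image about?"},
    ]}
]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)

# Greedy decoding for brevity; the Unsloth example's temperature=1.5 with
# min_p=0.1 matches the sampling settings used in Unsloth's notebooks.
output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

The structure mirrors the Unsloth example above: the same chat-message layout and the same image-plus-text tokenization, with only the loader classes differing.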