Can't reproduce given example (no meaningful output)
#8 opened by pzarzycki
I tried the model with the provided example, but it doesn't generate any meaningful response:
model_id = "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"
...
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt"
).to('cuda')
output = model.generate(**inputs, max_new_tokens=250)
print(processor.decode(output[0][2:], skip_special_tokens=True))
Output:

What is shown in this image? Name all the objects!assistant
The object is the object is the object is the object is the object is the object is the object is the object is the object ....
Hey! I tried the demo code on the main branch and it worked for me. Which version of transformers are you using? Can you run generation by applying the chat template with tokenize=False and then calling the processor directly?
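Something like this, as a minimal sketch (it reuses model, processor and messages from your snippet; loading the image with requests/PIL and the COCO sample URL are assumptions on my side, swap in your own image):

import requests
import torch
from PIL import Image

# Render the chat template to a plain prompt string instead of token ids
prompt = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

# Load the image manually and let the processor combine text and pixels
image = Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)
inputs = processor(images=image, text=prompt, return_tensors="pt").to(0, torch.float16)

output = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(processor.decode(output[0][2:], skip_special_tokens=True))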
Here is the full code I used to check:
import requests
from PIL import Image
import torch
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration
model_id = "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).to(0)
processor = AutoProcessor.from_pretrained(model_id)

conversation = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What are these?"},
            {"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
        ],
    },
]

inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_tensors="pt",
    return_dict=True,
).to(0, torch.float16)
output = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(processor.decode(output[0][2:], skip_special_tokens=True))
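One difference from your snippet worth noting: the inputs here are moved and cast with .to(0, torch.float16), so the pixel values match the half-precision model, whereas a plain .to('cuda') moves tensors without changing their dtype. If this demo runs fine for you but your original code still loops, please post your transformers version (e.g. python -c "import transformers; print(transformers.__version__)") so we can dig further.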