Unexpected PixtralProcessor use with Mistral-Small-3.1 on vLLM – text-only use case
#61 · by AbasKhan · opened
Hi team,

I'm encountering an issue when running the mistralai/Mistral-Small-3.1-24B-Instruct-2503 model on vLLM (latest version). Even though my usage is text-only, the engine attempts to invoke PixtralProcessor with image inputs, which leads to the following error:

AttributeError: 'MistralTokenizer' object has no attribute 'init_kwargs' ... RuntimeError: Failed to apply PixtralProcessor on data={'text': '[IMG]', 'images': [...]}
Context
- Model: mistralai/Mistral-Small-3.1-24B-Instruct-2503
- Setup: vLLM inference with LLM(model=model_name, tokenizer_mode="mistral")
- Task: simple chat completion, no multimodal inputs
- Goal: just use the model for basic conversational text
Observations
- When tokenizer_mode="mistral" is set, vLLM seems to treat the model like the Pixtral (multimodal) variant.
- The issue disappears if we use tokenizer_mode="auto".
- If this is the expected behaviour, maybe the documentation should be updated to make it more clear.
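For reference, here is a minimal sketch of the setup that triggers the error and the workaround, assuming a machine with enough GPU memory for the 24B model; the prompt and sampling parameters are illustrative, not taken from the failing run:

```python
from vllm import LLM, SamplingParams

model_name = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"

# Fails for us: tokenizer_mode="mistral" routes even text-only chat
# through PixtralProcessor and raises the errors quoted above.
# llm = LLM(model=model_name, tokenizer_mode="mistral")

# Workaround: tokenizer_mode="auto" avoids the multimodal processing path.
llm = LLM(model=model_name, tokenizer_mode="auto")

# Plain text-only chat completion (example prompt, no images anywhere).
outputs = llm.chat(
    [{"role": "user", "content": "Hello, how are you?"}],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```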