Unexpected PixtralProcessor use with Mistral-Small-3.1 on vLLM β€” text-only use case

#61
by AbasKhan - opened

Hi team,

I'm encountering an issue when running the mistralai/Mistral-Small-3.1-24B-Instruct-2503 model on vLLM (latest version). Even though my usage is text-only, the engine is attempting to invoke PixtralProcessor with image inputs, which leads to the following error:
```
AttributeError: 'MistralTokenizer' object has no attribute 'init_kwargs'
...
RuntimeError: Failed to apply PixtralProcessor on data={'text': '[IMG]', 'images': [...]}
```

🧠 Context

  • Model: mistralai/Mistral-Small-3.1-24B-Instruct-2503
  • Setup: vLLM inference with LLM(model=model_name, tokenizer_mode="mistral")
  • Task: Simple chat completion, no multimodal inputs
  • Goal: Just use the model for basic conversational text
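The setup above can be sketched as a minimal repro. This is a sketch under the report's assumptions (vLLM installed, a GPU available, the model name as given); the `RUN_MODEL` flag is a hypothetical guard added here so the snippet can be read without a GPU, and the request is plain text with no image content anywhere:

```python
# Minimal text-only repro for the PixtralProcessor error described above.
MODEL = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"

# Plain chat-completion request -- no multimodal inputs.
messages = [
    {"role": "user", "content": "Give me a one-sentence summary of vLLM."},
]

RUN_MODEL = False  # flip to True on a machine with vLLM and a GPU

if RUN_MODEL:
    from vllm import LLM, SamplingParams

    # tokenizer_mode="mistral" triggers the PixtralProcessor error;
    # tokenizer_mode="auto" works for this text-only use case.
    llm = LLM(model=MODEL, tokenizer_mode="mistral")
    outputs = llm.chat(messages, SamplingParams(max_tokens=64))
    print(outputs[0].outputs[0].text)
```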

πŸ’‘ Observations

  1. When tokenizer_mode="mistral" is set, vLLM seems to treat the model like the Pixtral (multimodal) variant.
  2. The issue disappears if we use tokenizer_mode="auto".
  3. If this is the expected behaviour, maybe the documentation should be updated to make this clearer.