Add tool calling template for HF format

#63
by Frrosta - opened

Using this template, one can serve the model in vLLM in the HF format and also use tool calling. For this to work, one first needs to save the Jinja template from here to its own file (for example by loading this JSON in Python and then dumping the content of the "chat_template" key to a new file, as sketched below) and then serve the model with the command:
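
A minimal sketch of the extraction step, assuming the template lives under the "chat_template" key of the repo's tokenizer_config.json (the exact file name is an assumption; the discussion only says "this json"):

```python
# Sketch: dump the chat template from the tokenizer config to its own file,
# so it can be passed to vLLM via --chat-template.
import json

# Assumption: the JSON file from this repo that holds the "chat_template" key.
with open("tokenizer_config.json") as f:
    config = json.load(f)

# Write the raw Jinja template to a standalone file.
with open("tool_call_template.jinja", "w") as f:
    f.write(config["chat_template"])
```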

vllm serve mistralai/Mistral-Small-3.1-24B-Instruct-2503 --chat-template <path-to-jinja-template> --tool-call-parser mistral --enable-auto-tool-choice

When calling the server, one needs to set the sampling parameter skip_special_tokens to False (see https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#id5) so that vLLM's mistral tool parser can correctly parse the tool calls.
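
For illustration, a minimal client-side sketch of such a request with the openai Python client; vLLM accepts extra sampling parameters like skip_special_tokens through extra_body, and the get_weather tool here is purely hypothetical:

```python
# Sketch: call the vLLM OpenAI-compatible server with tool calling enabled
# and skip_special_tokens=False passed as an extra sampling parameter.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Hypothetical tool definition, only for illustration.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    # vLLM forwards unknown fields in extra_body as sampling parameters.
    extra_body={"skip_special_tokens": False},
)
print(response.choices[0].message.tool_calls)
```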

I was only able to test this using the unsloth BnB quantized version of the model, as my GPU is too small, but I presume this should work here as well.

I tried setting skip_special_tokens to False but got the following error on vLLM:
skip_special_tokens=False is not supported for Mistral tokenizers.

If you use the mistral tokenizer, tool calling should work out of the box, as suggested in the example command in the model card:

vllm serve mistralai/Mistral-Small-3.1-24B-Instruct-2503 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --limit_mm_per_prompt 'image=10' --tensor-parallel-size 2

This chat template plus the suggested setting only applies when the model is loaded in the Hugging Face format with the default tokenizer. I also tried loading the mistral tokenizer with the Hugging Face model, but I ran into some issues there (I don't recall precisely what, though).

Worked for me using Mistral-Small-3.1-24B-Instruct-2503-FP8-dynamic. Thank you!
