pipeline_tag: image-text-to-text
---

Checkpoint of Mistral-Small-3.1-24B-Instruct-2503 with FP8 per-tensor quantization in the Mistral format.

Please run it with vLLM like so:
```
vllm serve nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --limit_mm_per_prompt 'image=10'
```
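Once the server is up, vLLM exposes an OpenAI-compatible API at `/v1/chat/completions`. Below is a minimal sketch of a multimodal chat request; the question, image URL, and local endpoint are placeholder assumptions, not part of this checkpoint.

```python
import json

# Sketch of an OpenAI-compatible multimodal chat request for the server
# started above. The question and image URL are placeholders; the endpoint
# (http://localhost:8000/v1/chat/completions) is vLLM's default.
payload = {
    "model": "nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What trend does this chart show?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/chart.png"},
                },
            ],
        }
    ],
    "temperature": 0.0,
}

# Send with any HTTP client once the server is running, e.g.:
#   requests.post("http://localhost:8000/v1/chat/completions", json=payload)
print(json.dumps(payload, indent=2)[:40])
```

Note the `--limit_mm_per_prompt 'image=10'` flag above caps each request at ten images.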

Evaluations against the unquantized baseline on ChartQA:
```
vllm serve mistralai/Mistral-Small-3.1-24B-Instruct-2503 --tokenizer_mode mistral --config_format mistral --load_format mistral
python -m eval.run eval_vllm --model_name mistralai/Mistral-Small-3.1-24B-Instruct-2503 --url http://0.0.0.0:8000 --output_dir output/ --eval_name "chartqa"
Querying model: 100%|██████████████████████████████████████████████████████████████████████| 2500/2500 [07:37<00:00, 5.47it/s]
================================================================================
Metrics:
{
    "explicit_prompt_relaxed_correctness": 0.8604,
    "anywhere_in_answer_relaxed_correctness": 0.8604
}
================================================================================

vllm serve nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8 --tokenizer_mode mistral --config_format mistral --load_format mistral
python -m eval.run eval_vllm --model_name nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8 --url http://0.0.0.0:8000 --output_dir output/ --eval_name "chartqa"
Querying model: 100%|██████████████████████████████████████████████████████████████████████| 2500/2500 [06:37<00:00, 6.28it/s]
================================================================================
Metrics:
{
    "explicit_prompt_relaxed_correctness": 0.8596,
    "anywhere_in_answer_relaxed_correctness": 0.86
}
================================================================================
```
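A quick sanity check of the numbers in the logs above (the it/s figures come from the tqdm lines and reflect client-side querying speed, not a rigorous serving benchmark):

```python
# Sanity check of the ChartQA numbers reported above.
baseline_acc = 0.8604  # explicit_prompt_relaxed_correctness, unquantized
fp8_acc = 0.8596       # explicit_prompt_relaxed_correctness, FP8

baseline_ips = 5.47    # it/s from the baseline querying run
fp8_ips = 6.28         # it/s from the FP8 querying run

acc_drop = baseline_acc - fp8_acc      # absolute accuracy drop
acc_recovery = fp8_acc / baseline_acc  # fraction of baseline accuracy retained
speedup = fp8_ips / baseline_ips       # relative querying throughput

print(f"accuracy drop:     {acc_drop:.4f}")
print(f"accuracy recovery: {acc_recovery:.2%}")
print(f"throughput:        {speedup:.2f}x")
```

The FP8 checkpoint recovers over 99.9% of the baseline's ChartQA accuracy while querying roughly 15% faster in these runs.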

# Original Model Card for Mistral-Small-3.1-24B-Instruct-2503

Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) **adds state-of-the-art vision understanding** and enhances **long context capabilities up to 128k tokens** without compromising text performance.
With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks.