Model Card for Model ID
summary
This is a finetuned version of Qwen2.5-VL-3B-Instruct, focusing on the task img2latex.
The model is finetuned on the dataset OleehyO/latex-formulas with 2 epochs to enhance latex ocr capability, and one epoch on linxy/LaTeX-OCR to regulate the model's output.
This work is inspired by prithivMLmods/Qwen2-VL-OCR-2B-Instruct.
evaluation
model | metric | value |
---|---|---|
prithivMLmods/Qwen2-VL-OCR-2B-Instruct (bf16) | rouge-l: f1-score | 0.88 |
CER | 0.24 | |
etherealgemini/Qwen2_5-VL-OCR-3B-Instruct (bf16) | rouge-l: f1-score | 0.91 |
CER | 0.21 | |
The improvement probably comes from:
- model's upgrade, for sure...?
- larger dataset: 100K -> 550K
There is an even MUCH larger dataset OleehyO/latex-formulas-80M, but my computing resources are limited.
- Downloads last month
- 19
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
🙋
Ask for provider support
Model tree for BooMarshmello/Qwen2.5-VL-OCR-3B-Instruct
Base model
Qwen/Qwen2.5-VL-3B-Instruct