Model Card for Model ID

summary

This is a finetuned version of Qwen2.5-VL-3B-Instruct, focusing on the task img2latex.

The model is finetuned on the dataset OleehyO/latex-formulas with 2 epochs to enhance latex ocr capability, and one epoch on linxy/LaTeX-OCR to regulate the model's output.

This work is inspired by prithivMLmods/Qwen2-VL-OCR-2B-Instruct.

evaluation

model metric value
prithivMLmods/Qwen2-VL-OCR-2B-Instruct (bf16) rouge-l: f1-score 0.88
CER 0.24
etherealgemini/Qwen2_5-VL-OCR-3B-Instruct (bf16) rouge-l: f1-score 0.91
CER 0.21

The improvement probably comes from:

  1. model's upgrade, for sure...?
  2. larger dataset: 100K -> 550K

There is an even MUCH larger dataset OleehyO/latex-formulas-80M, but my computing resources are limited.

Downloads last month
19
Safetensors
Model size
3.75B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for BooMarshmello/Qwen2.5-VL-OCR-3B-Instruct

Finetuned
(147)
this model

Datasets used to train BooMarshmello/Qwen2.5-VL-OCR-3B-Instruct