Model Card for Model ID

summary

This is a finetuned version of Qwen2.5-VL-3B-Instruct, focusing on the task img2latex.

The model is finetuned on the dataset OleehyO/latex-formulas with 2 epochs to enhance latex ocr capability, and one epoch on linxy/LaTeX-OCR to regulate the model's output.

This work is inspired by prithivMLmods/Qwen2-VL-OCR-2B-Instruct.

evaluation

model	metric	value
prithivMLmods/Qwen2-VL-OCR-2B-Instruct (bf16)	rouge-l: f1-score	0.88
	CER	0.24
etherealgemini/Qwen2_5-VL-OCR-3B-Instruct (bf16)	rouge-l: f1-score	0.91
	CER	0.21

The improvement probably comes from:

model's upgrade, for sure...?
larger dataset: 100K -> 550K

There is an even MUCH larger dataset OleehyO/latex-formulas-80M, but my computing resources are limited.

BooMarshmello
/

Qwen2.5-VL-OCR-3B-Instruct

Model Card for Model ID

summary

evaluation

Model tree for BooMarshmello/Qwen2.5-VL-OCR-3B-Instruct

Datasets used to train BooMarshmello/Qwen2.5-VL-OCR-3B-Instruct