LLaVA-Med v1.5 (based on Mistral-7B-Instruct-v0.2)

LLaVA-Med (Large Language and Vision Assistant for bioMedicine) is an open-source large vision-language model adapted for biomedical applications. Built upon LLaVA and enhanced through curriculum learning, LLaVA-Med is fine-tuned specifically for open-ended biomedical question answering tasks.

This release supports reproducibility of the corresponding paper, which demonstrates improved performance on biomedical VQA benchmarks such as PathVQA and VQA-RAD.

πŸ“Œ Note: For the original model weights, refer to microsoft/llava-med-v1.5-mistral-7b.


πŸ”¬ Experimental Usage in the Libra Repository

This model checkpoint is intended for experimental use and can be tested directly within the Libra repository.
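
As a rough, non-authoritative sketch, the snippet below shows how such a test might look once the Libra repository is cloned and installed. It assumes an evaluation helper `libra_eval` exposed from `libra.eval`; the function name, argument names, and the `conv_mode` value are assumptions modelled on LLaVA-style evaluation scripts, so check the Libra README for the exact interface.

```python
# Sketch only: the names below are assumptions, not a confirmed API.
from libra.eval import libra_eval  # assumed evaluation entry point

answer = libra_eval(
    model_path="X-iZhang/libra-llava-med-v1.5-mistral-7b",
    image_file=["path/to/medical_image.jpg"],        # local image(s) to query
    query="What abnormalities are visible in this image?",
    conv_mode="mistral_instruct",                    # assumed conversation template
    temperature=0.2,
    max_new_tokens=256,
)
print(answer)
```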

Key Modification

To enable the re-trained vision encoder during inference, ensure the following configuration is applied:

```json
"unfreeze_mm_vision_tower": true
```

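If the flag is missing or set to false in a local copy of the checkpoint, a small patch script such as the one below can enable it before inference. This is only a sketch: it assumes the flag lives in the checkpoint's config.json (as is typical for LLaVA-style checkpoints), and the local path is a placeholder.

```python
import json
from pathlib import Path

# Placeholder path to a local copy of the checkpoint
# (e.g. obtained via huggingface_hub.snapshot_download).
config_path = Path("checkpoints/libra-llava-med-v1.5-mistral-7b/config.json")

config = json.loads(config_path.read_text())

# Enable the re-trained vision encoder for inference.
if not config.get("unfreeze_mm_vision_tower", False):
    config["unfreeze_mm_vision_tower"] = True
    config_path.write_text(json.dumps(config, indent=2))
    print(f"Enabled unfreeze_mm_vision_tower in {config_path}")
```
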
πŸ“š Learn More

For a deeper dive into the methodology, theoretical insights, and performance benchmarks of the Libra framework, please refer to the Libra repository and the accompanying paper.


License

This model inherits the license of its base language model, mistralai/Mistral-7B-Instruct-v0.2 (Apache 2.0).

Model size: 7.57B parameters Β· Tensor type: BF16 Β· Format: Safetensors