Qwen2.5-3B-R1-MedicalReasoner LoRA Adapter
This repository contains the LoRA adapter weights and configuration for Qwen2.5-3B-R1-MedicalReasoner, a clinical reasoning language model fine-tuned with GRPO (Group Relative Policy Optimization). The adapter packages that fine-tuning as a small set of low-rank weights that customize the base model's behavior for clinical reasoning tasks.
Overview
- Adapter Name: Qwen2.5-3B-R1-MedicalReasoner LoRA Adapter
- Purpose: To adapt the base model (Qwen2.5-3B-R1-MedicalReasoner) using Low-Rank Adaptation (LoRA) without updating the full model weights.
- Use Case: Suited to users who want to fine-tune, experiment with, or deploy the clinical reasoning model through parameter-efficient customization.
Key Features
- Parameter-Efficient Adaptation: LoRA trains only a small number of additional low-rank parameters, making further fine-tuning cheap in both time and compute (see the sketch after this list).
- Seamless Integration: Loads onto the base model with standard Unsloth and vLLM tooling.
- Optimized for Clinical Reasoning: The adapter reinforces chain-of-thought generation and improves the clarity of diagnostic reasoning outputs.
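To make the efficiency claim concrete, here is a minimal sketch of the parameter arithmetic behind LoRA. The layer dimensions and rank below are hypothetical placeholders, not the values used by this adapter (those live in the adapter configuration):
# For a d_in x d_out weight matrix, full fine-tuning trains d_in * d_out
# parameters, while a rank-r LoRA update W + B @ A trains only
# r * (d_in + d_out) parameters for the factors B (d_in x r) and A (r x d_out).
d_in, d_out, r = 2048, 2048, 16  # hypothetical layer size and rank
full_params = d_in * d_out
lora_params = r * (d_in + d_out)
print(f"full: {full_params:,}  lora: {lora_params:,}  ratio: {lora_params / full_params:.1%}")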
How to Use
Integration with Base Model
To download and load the LoRA adapter into Qwen2.5-3B-R1-MedicalReasoner:
from huggingface_hub import snapshot_download
from unsloth import FastLanguageModel
# Download the adapter weights:
lora_path = snapshot_download("iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter")
print("LoRA adapter downloaded to:", lora_path)
# Load base model:
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="iimran/Qwen2.5-3B-R1-MedicalReasoner",
    load_in_4bit=False,
    fast_inference=True,
)
# Load the LoRA adapter:
model.load_lora(lora_path)
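With the adapter in place, generation goes through Unsloth's vLLM-backed fast_generate. The following is a minimal inference sketch; the example question, chat-template call, and sampling settings are illustrative assumptions, not a documented prompt format:
from vllm import SamplingParams
# Hypothetical clinical question, for illustration only.
messages = [{"role": "user", "content": "A 54-year-old presents with crushing "
             "chest pain radiating to the left arm. What is the most likely diagnosis?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
sampling_params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=1024)
output = model.fast_generate(
    [prompt],
    sampling_params=sampling_params,
    lora_request=model.load_lora(lora_path),  # apply the downloaded adapter
)[0].outputs[0].text
print(output)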
Fine-Tuning and Experimentation
This adapter was originally developed and fine-tuned using GRPO with customized reward functions to enhance chain-of-thought reasoning. Researchers who wish to further optimize the behavior of the clinical reasoning model with targeted adaptations can start from these adapter weights.
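The exact reward functions used for this adapter are not published in this repository. As a starting point, a format-style reward in the signature expected by trl's GRPOTrainer (which Unsloth's GRPO workflow builds on) could look like the sketch below; the <reasoning>/<answer> tags are a hypothetical output format, not this model's documented one:
import re
# Hypothetical format reward: 1.0 when the completion wraps its chain of
# thought in <reasoning>...</reasoning> followed by <answer>...</answer>,
# 0.0 otherwise. Tag names are illustrative assumptions.
FORMAT_PATTERN = re.compile(r"<reasoning>.*?</reasoning>\s*<answer>.*?</answer>", re.DOTALL)
def format_reward(completions, **kwargs):
    # GRPOTrainer passes a batch of completion strings and expects one score each.
    return [1.0 if FORMAT_PATTERN.search(c) else 0.0 for c in completions]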
Installation Requirements
- Python Version: 3.8 or higher
- Dependencies:
- unsloth
- vLLM
- huggingface-hub
- Other dependencies required by the base model and LoRA integration
Install the required packages using pip:
pip install unsloth vllm huggingface-hub
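After installing, a quick sanity check confirms the dependencies resolve and a CUDA GPU is visible (the vLLM-backed fast-inference path assumes one). This snippet is a generic check, not part of this repository:
from importlib.metadata import version
import torch
# Print installed versions of the core dependencies and confirm a GPU is visible.
for pkg in ("unsloth", "vllm", "huggingface-hub"):
    print(pkg, version(pkg))
print("CUDA available:", torch.cuda.is_available())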
Citation
If you use the LoRA adapter in your work, please cite:
@misc{Qwen2.5-3B-R1-MedicalReasoner-lora-adapter,
  author    = {Imran Sarwar and Muhammad Rouf Mustafa},
  title     = {Qwen2.5-3B-R1-MedicalReasoner LoRA Adapter},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter}
}
Contributing
Contributions to the LoRA adapter are welcome. If you have improvements for:
- Adapter performance or efficiency
- Documentation updates
- Additional experiments or fine-tuning strategies
Please open an issue or submit a pull request.
Disclaimer
This LoRA adapter is provided for research and educational purposes. It is intended to be used in combination with the Qwen2.5-3B-R1-MedicalReasoner base model. As with the base model, clinical outputs should be validated by qualified healthcare professionals before being used in any medical decision-making.