Qwen2.5-3B-R1-MedicalReasoner LoRA Adapter

This repository contains the LoRA adapter weights and configuration for Qwen2.5-3B-R1-MedicalReasoner, a clinical reasoning language model fine-tuned with GRPO (Group Relative Policy Optimization). The adapter captures the task-specific parameter updates, so the base model's behavior on clinical reasoning tasks can be customized without retraining the full model.

Overview

  • Adapter Name: Qwen2.5-3B-R1-MedicalReasoner LoRA Adapter
  • Purpose: Adapts the base model (Qwen2.5-3B-R1-MedicalReasoner) through Low-Rank Adaptation (LoRA) without updating the full model weights.
  • Use Case: Ideal for users who want to fine-tune, experiment with, or deploy the clinical reasoning model using parameter-efficient adaptations.

Key Features

  • Parameter-Efficient Adaptation: LoRA trains only a small number of additional parameters, so further fine-tuning is fast and light on memory (see the configuration sketch after this list).
  • Seamless Integration: Loads on top of the base model using the tooling provided by Unsloth and vLLM.
  • Optimized for Clinical Reasoning: The adapter reinforces chain-of-thought generation and improves the clarity of diagnostic reasoning outputs.
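
A LoRA adapter of this kind is defined by a small set of hyperparameters: the rank of the low-rank update matrices, a scaling factor, and the list of projection layers the adapter attaches to. The sketch below shows what such a configuration looks like in the PEFT library; the values shown are illustrative assumptions only, and the actual settings for this adapter are recorded in its adapter_config.json.

from peft import LoraConfig

# Illustrative LoRA hyperparameters (assumed values, NOT read from this
# adapter -- the real settings live in adapter_config.json):
lora_config = LoraConfig(
    r=16,                # rank of the low-rank update matrices
    lora_alpha=32,       # scaling factor applied to the update
    lora_dropout=0.0,    # dropout on the adapter path
    target_modules=[     # projection layers the adapter attaches to
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)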

How to Use

Integration with Base Model

To download and load the LoRA adapter into Qwen2.5-3B-R1-MedicalReasoner:

from huggingface_hub import snapshot_download
from unsloth import FastLanguageModel

# Download the adapter weights:
lora_path = snapshot_download("iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter")
print("LoRA adapter downloaded to:", lora_path)

# Load base model:
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="iimran/Qwen2.5-3B-R1-MedicalReasoner",
    load_in_4bit=False,
    fast_inference=True
)

# Load the LoRA adapter; with fast_inference enabled, Unsloth returns a
# LoRA request handle that is passed to model.fast_generate at inference:
lora_request = model.load_lora(lora_path)
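
Once loaded, the adapter is applied at generation time. The following is a minimal sketch of inference through Unsloth's vLLM-backed fast_generate path; the prompt text and sampling settings are illustrative assumptions, not values prescribed by this repository.

from vllm import SamplingParams

# Build a chat-formatted prompt (illustrative clinical question):
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "A 54-year-old presents with crushing "
      "chest pain radiating to the left arm. What is the most likely "
      "diagnosis?"}],
    tokenize=False,
    add_generation_prompt=True,
)

# Generate with the adapter applied via the LoRA request handle:
output = model.fast_generate(
    prompt,
    sampling_params=SamplingParams(temperature=0.7, max_tokens=512),
    lora_request=lora_request,
)[0].outputs[0].text
print(output)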

Fine-Tuning and Experimentation

This adapter was originally trained with GRPO using customized reward functions that reinforce chain-of-thought reasoning. Researchers who want to further adapt the clinical reasoning model with targeted modifications can resume training from these adapter weights rather than starting from scratch.
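
As a concrete illustration, GRPO reward functions score each sampled completion and the trainer reinforces higher-scoring ones relative to their group. The sketch below follows the reward-function interface used by TRL's GRPOTrainer (which Unsloth builds on) and rewards completions that follow a <reasoning>/<answer> chain-of-thought format. The tag names and scores are assumptions for illustration, not the reward functions actually used to train this adapter.

import re

# Hypothetical format reward: encourage completions that wrap their
# chain of thought in <reasoning>...</reasoning> and the final diagnosis
# in <answer>...</answer>. (Illustrative only; the actual reward
# functions used for this adapter are not published here.)
FORMAT_PATTERN = re.compile(
    r"<reasoning>.+?</reasoning>\s*<answer>.+?</answer>", re.DOTALL
)

def format_reward(completions, **kwargs):
    """Return one score per completion: 1.0 if the chain-of-thought
    format is followed, 0.0 otherwise."""
    return [1.0 if FORMAT_PATTERN.search(c) else 0.0 for c in completions]

In practice several such functions (format, answer correctness, length) are combined, and GRPO normalizes the summed rewards within each group of sampled completions.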

Installation Requirements

  • Python Version: 3.8 or higher
  • Dependencies:
    • unsloth
    • vLLM
    • huggingface-hub
    • Other dependencies required by the base model and LoRA integration

Install the required packages using pip:

pip install unsloth vllm huggingface-hub

Citation

If you use the LoRA adapter in your work, please cite:

@misc{Qwen2.5-3B-R1-MedicalReasoner-lora-adapter,
  author = {Imran Sarwar and Muhammad Rouf Mustafa},
  title = {Qwen2.5-3B-R1-MedicalReasoner LoRA Adapter},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter}
}

Contributing

Contributions to the LoRA adapter are welcome. Areas of interest include:

  • Adapter performance or efficiency
  • Documentation updates
  • Additional experiments or fine-tuning strategies

To contribute, please open an issue or submit a pull request.

Disclaimer

This LoRA adapter is provided for research and educational purposes. It is intended to be used in combination with the Qwen2.5-3B-R1-MedicalReasoner base model. As with the base model, clinical outputs should be validated by qualified healthcare professionals before being used in any medical decision-making.
