---
license: apache-2.0
tags:
- unsloth
- biology
- medical
- LoRA
- trl
- text-generation-inference
datasets:
- FreedomIntelligence/medical-o1-reasoning-SFT
language:
- en
widget:
- text: |
    What is scoliosis?
  output:
    text: "πŸ§‘β€βš•οΈ: Scoliosis is defined as an unnatural curvature of the spine that occurs in a side-to-side pattern. It typically affects the thoracic and lumbar spine and causes the spinal column to appear more curved than usual. This condition is usually progressive, meaning it won't go away on its own. It's usually diagnosed by an X-ray or physical exam. While scoliosis can happen at any age, it is more common in adolescents, particularly girls. Treatment options include bracing, physical therapy, surgery, or a combination of these. Managing scoliosis can help with postural changes, pain, and breathing issues, among other things."
base_model:
- unsloth/mistral-7b-bnb-4bit
pipeline_tag: text-generation
library_name: adapter-transformers
---

# Mistral-7B Medical QA Model

A specialized medical question-answering model built on Mistral-7B and fine-tuned on the [FreedomIntelligence/medical-o1-reasoning-SFT](https://huggingface.co/datasets/FreedomIntelligence/medical-o1-reasoning-SFT) dataset.

## Model Description

This model is a LoRA adaptation of Mistral-7B, fine-tuned to provide accurate and informative answers to medical questions. It is optimized with Unsloth for efficient training and inference.

## Inference Instructions

To use this model:

```python
!pip install unsloth
```

```python
from unsloth import FastLanguageModel
import torch

# Define the Alpaca prompt template
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input_text}

### Response:
{output}"""

# Load the fine-tuned model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Subh775/mistral-7b-medical-o1-ft",
    max_seq_length=2048,
    load_in_4bit=True
)

# Enable optimized inference mode for faster generation
FastLanguageModel.for_inference(model)
```
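For a quick single-turn check before the interactive loop below, a prompt can be formatted with the Alpaca template and passed to `model.generate` directly. This is a minimal sketch reusing the objects loaded above; the example question and sampling settings are illustrative, not part of the original card:

```python
# Single-turn example: fill the Alpaca template with one question and generate an answer
prompt = alpaca_prompt.format(
    instruction="Please answer the following medical question.",
    input_text="What are the common symptoms of iron-deficiency anemia?",  # illustrative question
    output=""
)

inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, top_p=0.9, do_sample=True)

# Keep only the text generated after the "### Response:" marker
answer = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0].split("### Response:")[-1].strip()
print(answer)
```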
For an interactive, multi-turn session, the following loop keeps the conversation in `chat_history` and feeds the accumulated context back to the model on every turn:

```python
# Function to handle the chat loop with memory
def chat():
    print("Chat with mistral-7b-medical-o1-ft! Type '\\q' or 'quit' to stop.\n")

    chat_history = ""  # Store the conversation history

    while True:
        # Get user input
        user_input = input("➀ ")

        # Exit condition
        if user_input.lower() in ['\\q', 'quit']:
            print("\nExiting the chat. Goodbye πŸ©ΊπŸ‘!")
            print("✨" + "=" * 27 + "✨\n")
            break

        # Append the current input to chat history with instruction formatting
        prompt = alpaca_prompt.format(
            instruction="Please answer the following medical question.",
            input_text=user_input,
            output=""
        )
        chat_history += prompt + "\n"

        # Tokenize combined history and move to GPU
        inputs = tokenizer([chat_history], return_tensors="pt").to("cuda")

        # Generate output with configured parameters
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            temperature=0.7,
            top_p=0.9,
            num_return_sequences=1,
            do_sample=True,
            no_repeat_ngram_size=2
        )

        # Decode and keep only the text after the latest "### Response:" marker
        decoded_output = tokenizer.batch_decode(outputs, skip_special_tokens=True)
        clean_output = decoded_output[0].split('### Response:')[-1].strip()

        # Add the response to chat history
        chat_history += clean_output + "\n"

        # Display the response
        print(f"\nπŸ§‘β€βš•οΈ: {clean_output}\n")

# Start the chat
chat()
```

## Training

This model was fine-tuned on the [FreedomIntelligence/medical-o1-reasoning-SFT](https://huggingface.co/datasets/FreedomIntelligence/medical-o1-reasoning-SFT) dataset, which contains approximately 50,000 high-quality medical question-answer pairs; a 19,704-sample subset was used for this run. Training used Unsloth for optimization and LoRA for parameter-efficient fine-tuning (an illustrative training-setup sketch is included at the end of this card).

### Key Features

- **Base Model:** [unsloth/mistral-7b-bnb-4bit](https://huggingface.co/unsloth/mistral-7b-bnb-4bit)
- **Fine-Tuning Objective:** Adaptation for structured, step-by-step medical reasoning tasks.
- **Training Dataset:** 19,704 samples from the **medical-o1-reasoning-SFT** dataset.
- **Tools Used:**
  - **Unsloth:** Accelerates training by about 2x.
  - **4-bit Quantization:** Reduces model memory usage.
  - **LoRA Adapters:** Enable parameter-efficient fine-tuning.
- **Training Time:** 38 minutes, 57 seconds for 1 epoch.
- **Final logged step and training loss:**
  - **Step:** 60
  - **Training Loss:** 1.1607

## Limitations

* This model provides general medical information and should not be used as a substitute for professional medical advice.
* The model's knowledge is limited to its training data and may not include the latest medical research.
* The model is not clinically validated and should not be used for diagnosis or treatment decisions.

## License

This model inherits the license from the base Mistral-7B model.

## Citations

```bibtex
@misc{mistral-7b-medical-o1-ft,
  author       = {Subh775},
  title        = {Mistral-7B Medical QA Model},
  year         = {2025},
  publisher    = {HuggingFace},
  journal      = {HuggingFace Repository},
  howpublished = {\url{https://huggingface.co/Subh775/mistral-7b-medical-o1-ft}}
}
```
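## Appendix: Training Setup Sketch

The Training section above describes an Unsloth + LoRA fine-tuning run on a 4-bit base model. The block below is a minimal sketch of what such a setup typically looks like, following the standard Unsloth SFT workflow; it is not the author's actual script. The LoRA hyperparameters, batch sizes, learning rate, dataset config name (`en`), and field names (`Question`, `Response`) are assumptions and may differ from what was really used. Depending on the installed `trl` version, some `SFTTrainer` arguments may instead belong in an `SFTConfig`.

```python
from unsloth import FastLanguageModel
import torch
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Same Alpaca-style template the card uses at inference time
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input_text}

### Response:
{output}"""

# Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters for parameter-efficient fine-tuning
# (r, alpha and target modules are illustrative defaults, not the card's values)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
)

# Load the dataset and render each example into the prompt template
# (the "en" config and the "Question"/"Response" field names are assumptions)
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train")

def format_example(example):
    text = alpaca_prompt.format(
        instruction="Please answer the following medical question.",
        input_text=example["Question"],
        output=example["Response"],
    ) + tokenizer.eos_token
    return {"text": text}

dataset = dataset.map(format_example)

# Supervised fine-tuning with trl; hyperparameters are illustrative
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,  # matches the 60 steps reported above
        learning_rate=2e-4,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=10,
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)

trainer.train()
```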