---
license: apache-2.0
tags:
- unsloth
- biology
- medical
- LoRA
- trl
- text-generation-inference
datasets:
- FreedomIntelligence/medical-o1-reasoning-SFT
language:
- en
widget:
- text: |
    What is scoliosis?
  output:
    text: "πŸ§‘β€βš•οΈ: Scoliosis is defined as an unnatural curvature of the spine that occurs in a side-to-side pattern. It typically affects the thoracic and lumbar spine and causes the spinal column to appear more curved than usual. This condition is usually progressive, meaning it won't go away on its own. It's usually diagnosed by an X-ray or physical exam. While scoliosis can happen at any age, it is more common in adolescents, particularly girls. Treatment options include bracing, physical therapy, surgery, or a combination of these. Managing scoliosis can help with postural changes, pain, and breathing issues, among other things."
base_model:
- unsloth/mistral-7b-bnb-4bit
pipeline_tag: text-generation
library_name: adapter-transformers
---

# Mistral-7B Medical QA Model

A specialized medical question-answering model built on Mistral-7B and fine-tuned on the [FreedomIntelligence/medical-o1-reasoning-SFT](https://huggingface.co/datasets/FreedomIntelligence/medical-o1-reasoning-SFT) dataset.

## Model Description

This model is a LoRA adaptation of Mistral-7B, fine-tuned to provide accurate and informative answers to medical questions. It is optimized with Unsloth for efficient training and inference.

## Inference Instructions

To use this model:

```python
!pip install unsloth
```

```python
from unsloth import FastLanguageModel
import torch

# Define the Alpaca prompt template
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input_text}

### Response:
{output}"""

# Load the fine-tuned model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Subh775/mistral-7b-medical-o1-ft",
    max_seq_length=2048,
    load_in_4bit=True
)

# Enable optimized inference mode for faster generation
FastLanguageModel.for_inference(model)
```
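For a quick single-turn check before the interactive loop below, a prompt can be formatted with the Alpaca template and passed to `model.generate` directly. This is a minimal sketch reusing the objects loaded above; the example question and sampling settings are illustrative, not part of the original card:

```python
# Single-turn example: fill the Alpaca template with one question and generate an answer
prompt = alpaca_prompt.format(
    instruction="Please answer the following medical question.",
    input_text="What are the common symptoms of iron-deficiency anemia?",  # illustrative question
    output=""
)

inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, top_p=0.9, do_sample=True)

# Keep only the text generated after the "### Response:" marker
answer = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0].split("### Response:")[-1].strip()
print(answer)
```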
For an interactive, multi-turn session, the following loop keeps the conversation in `chat_history` and feeds the accumulated context back to the model on every turn:

```python
# Function to handle the chat loop with memory
def chat():
    print("Chat with mistral-7b-medical-o1-ft! Type '\\q' or 'quit' to stop.\n")

    chat_history = ""  # Store the conversation history

    while True:
        # Get user input
        user_input = input("➀ ")

        # Exit condition
        if user_input.lower() in ['\\q', 'quit']:
            print("\nExiting the chat. Goodbye πŸ©ΊπŸ‘!")
            print("✨" + "=" * 27 + "✨\n")
            break

        # Append the current input to chat history with instruction formatting
        prompt = alpaca_prompt.format(
            instruction="Please answer the following medical question.",
            input_text=user_input,
            output=""
        )
        chat_history += prompt + "\n"

        # Tokenize combined history and move to GPU
        inputs = tokenizer([chat_history], return_tensors="pt").to("cuda")

        # Generate output with configured parameters
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            temperature=0.7,
            top_p=0.9,
            num_return_sequences=1,
            do_sample=True,
            no_repeat_ngram_size=2
        )

        # Decode and keep only the text after the latest "### Response:" marker
        decoded_output = tokenizer.batch_decode(outputs, skip_special_tokens=True)
        clean_output = decoded_output[0].split('### Response:')[-1].strip()

        # Add the response to chat history
        chat_history += clean_output + "\n"

        # Display the response
        print(f"\nπŸ§‘β€βš•οΈ: {clean_output}\n")

# Start the chat
chat()
```

## Training

This model was fine-tuned on the [FreedomIntelligence/medical-o1-reasoning-SFT](https://huggingface.co/datasets/FreedomIntelligence/medical-o1-reasoning-SFT) dataset, which contains approximately 50,000 high-quality medical question-answer pairs; a 19,704-sample subset was used for this run. Training used Unsloth for optimization and LoRA for parameter-efficient fine-tuning (an illustrative training-setup sketch is included at the end of this card).

### Key Features

- **Base Model:** [unsloth/mistral-7b-bnb-4bit](https://huggingface.co/unsloth/mistral-7b-bnb-4bit)
- **Fine-Tuning Objective:** Adaptation for structured, step-by-step medical reasoning tasks.
- **Training Dataset:** 19,704 samples from the **medical-o1-reasoning-SFT** dataset.
- **Tools Used:**
  - **Unsloth:** Accelerates training by about 2x.
  - **4-bit Quantization:** Reduces model memory usage.
  - **LoRA Adapters:** Enable parameter-efficient fine-tuning.
- **Training Time:** 38 minutes, 57 seconds for 1 epoch.
- **Final logged step and training loss:**
  - **Step:** 60
  - **Training Loss:** 1.1607

## Limitations

* This model provides general medical information and should not be used as a substitute for professional medical advice.
* The model's knowledge is limited to its training data and may not include the latest medical research.
* The model is not clinically validated and should not be used for diagnosis or treatment decisions.

## License

This model inherits the license from the base Mistral-7B model.

## Citations

```bibtex
@misc{mistral-7b-medical-o1-ft,
  author       = {Subh775},
  title        = {Mistral-7B Medical QA Model},
  year         = {2025},
  publisher    = {HuggingFace},
  journal      = {HuggingFace Repository},
  howpublished = {\url{https://huggingface.co/Subh775/mistral-7b-medical-o1-ft}}
}
```
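## Appendix: Training Setup Sketch

The Training section above describes an Unsloth + LoRA fine-tuning run on a 4-bit base model. The block below is a minimal sketch of what such a setup typically looks like, following the standard Unsloth SFT workflow; it is not the author's actual script. The LoRA hyperparameters, batch sizes, learning rate, dataset config name (`en`), and field names (`Question`, `Response`) are assumptions and may differ from what was really used. Depending on the installed `trl` version, some `SFTTrainer` arguments may instead belong in an `SFTConfig`.

```python
from unsloth import FastLanguageModel
import torch
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Same Alpaca-style template the card uses at inference time
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input_text}

### Response:
{output}"""

# Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters for parameter-efficient fine-tuning
# (r, alpha and target modules are illustrative defaults, not the card's values)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
)

# Load the dataset and render each example into the prompt template
# (the "en" config and the "Question"/"Response" field names are assumptions)
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train")

def format_example(example):
    text = alpaca_prompt.format(
        instruction="Please answer the following medical question.",
        input_text=example["Question"],
        output=example["Response"],
    ) + tokenizer.eos_token
    return {"text": text}

dataset = dataset.map(format_example)

# Supervised fine-tuning with trl; hyperparameters are illustrative
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,  # matches the 60 steps reported above
        learning_rate=2e-4,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=10,
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)

trainer.train()
```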