---
library_name: transformers
datasets:
- lavita/ChatDoctor-HealthCareMagic-100k
base_model:
- google/gemma-2-2b-it
---

# Model Card for Gemma-2-2b-it Fine-Tuned on ChatDoctor-HealthCareMagic-100k

## Model Details

### Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub: google/gemma-2-2b-it fine-tuned for medical question answering on the ChatDoctor-HealthCareMagic-100k dataset. Parts of this model card were automatically generated.

- **Developed by:** Arash Nicoomanesh
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Decoder-only causal language model, fine-tuned with QLoRA (quantized base model + LoRA adapters)
- **Language(s) (NLP):** English
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it)

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

The model was fine-tuned on patient-doctor conversations and may generate plausible-sounding but incorrect medical information. It is not a substitute for professional medical advice, diagnosis, or treatment.

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.
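The snippet below is a minimal inference sketch rather than code from the original training notebook; `model_id` is a placeholder for this repository's id on the Hub.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<this-repo-id>"  # placeholder: replace with this model's Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,  # Gemma 2 is best run in bfloat16
)

# Gemma 2 instruction-tuned checkpoints ship a chat template,
# so format the prompt through the tokenizer.
messages = [
    {"role": "user", "content": "I have had a dry cough for two weeks. What could be causing it?"}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

If the training run saved only LoRA adapters rather than merged weights, load the base model first and attach the adapter with `peft.PeftModel.from_pretrained` instead.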
## Training Details

### Training Data

The model was fine-tuned on [lavita/ChatDoctor-HealthCareMagic-100k](https://huggingface.co/datasets/lavita/ChatDoctor-HealthCareMagic-100k), a collection of roughly 100k real patient-doctor exchanges from HealthCareMagic. A shuffled 3,000-sample subset was used for this run (see Preprocessing below).

### Training Procedure

The base model is loaded in quantized form and paired with its fast tokenizer:

```python
from transformers import Gemma2ForCausalLM, GemmaTokenizerFast

base_model = "google/gemma-2-2b-it"

# `bnb_config` (a bitsandbytes quantization config) and `attn_implementation`
# are defined earlier in the training script.
model = Gemma2ForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
    attn_implementation=attn_implementation,
)

tokenizer = GemmaTokenizerFast.from_pretrained(
    base_model,
    padding_side="right",
    truncation_side="right",
    trust_remote_code=True,
)
```

#### Preprocessing [optional]

A 3,000-sample subset is drawn from the dataset and cleaned of URLs and site-name artifacts:

```python
import re
from datasets import load_dataset

dataset_name = "lavita/ChatDoctor-HealthCareMagic-100k"

dataset = load_dataset(dataset_name, split="all", cache_dir="./cache")
dataset = dataset.shuffle(seed=42).select(range(3000))  # use 3k samples for a quicker demo

# Cleaning function to strip unwanted artifacts from the raw text
def clean_text(text):
    text = re.sub(r'\b(?:www\.[^\s]+|http\S+)', '', text)  # remove URLs
    text = re.sub(r'\b(?:Chat Doctor(?:\.com)?(?:\.in)?|www\.(?:google|yahoo)\S*)', '', text)  # remove site names
    text = re.sub(r'\s+', ' ', text)  # collapse repeated whitespace
    return text.strip()

# The cleaned dataset is later formatted into a "text" column and split into
# train/test sets (e.g. dataset.train_test_split(test_size=0.1)) before training.
```

#### Training Hyperparameters

```python
from transformers import TrainingArguments
from trl import SFTTrainer

training_args = TrainingArguments(
    output_dir=new_model,  # output directory name, defined earlier in the script
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,
    optim="paged_adamw_32bit",
    num_train_epochs=1,
    eval_strategy="steps",
    eval_steps=200,
    save_steps=500,
    logging_steps=1,
    warmup_steps=10,
    logging_strategy="steps",
    learning_rate=2e-4,
    fp16=True,
    bf16=False,
    group_by_length=True,
    report_to="wandb",
    load_best_model_at_end=False,  # do not reload the best checkpoint at the end
)

# Supervised fine-tuning trainer; `peft_config` holds the LoRA settings
# defined earlier in the script.
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    peft_config=peft_config,
    max_seq_length=512,
    dataset_text_field="text",  # column holding the formatted prompt/response text
    tokenizer=tokenizer,
    args=training_args,
    packing=False,
)
```

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

Training and evaluation metrics were logged to Weights & Biases: run [noble-hill-29](https://wandb.ai/anicomanesh/Fine-tune%20Gemma-2-2b-it%20on%20Medical%20Dataset/runs/06xd9vvz) in the project [Fine-tune Gemma-2-2b-it on Medical Dataset](https://wandb.ai/anicomanesh/Fine-tune%20Gemma-2-2b-it%20on%20Medical%20Dataset).

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

See the W&B run linked above for the training and evaluation loss curves.

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

Gemma 2 (2B parameters, instruction-tuned) decoder-only transformer, fine-tuned with a causal language-modeling objective using LoRA adapters on a quantized base model.

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]