Dakitari-Instruct Medical Language Model
A specialized language model for medical text generation and question answering, trained on PubMed abstracts and medical QA datasets.
Model Description
- Model Type: Transformer-based Language Model
- Language: English
- License: MIT
- Training Data: PubMed abstracts + Medical QA pairs
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("path/to/dakitari-instruct")
model = AutoModelForCausalLM.from_pretrained("path/to/dakitari-instruct")
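A minimal generation sketch, continuing from the snippet above and assuming the checkpoint loads with the PyTorch AutoModelForCausalLM class shown there; the prompt format and decoding settings are illustrative assumptions, not settings documented by this repository:
# Ask a medical question and generate an answer
prompt = "Question: What are common symptoms of type 2 diabetes?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))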
Training Data
This model is trained on:
- PubMed abstracts (biomedical literature)
- Medical question-answer pairs
For detailed dataset information, see dataset_card.md.
Features
- Medical text generation
- Medical question answering
- Research assistance
- Built on transformer architecture
- Trained on PubMed abstracts and medical Q&A datasets
Requirements
- Python 3.8+
- TensorFlow 2.13+
- Transformers library
- Other dependencies listed in requirements.txt
Installation
- Clone the repository:
git clone https://github.com/elijahnzeli1/dakitari-instruct.git
cd dakitari-instruct
- Create a virtual environment (recommended):
python -m venv venv
# For Windows
.\venv\Scripts\activate
# For macOS/Linux
source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
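After installation, a quick sanity check that the core dependencies import correctly (assuming TensorFlow and the Transformers library are pinned in requirements.txt, as listed under Requirements):
python -c "import tensorflow as tf, transformers; print(tf.__version__, transformers.__version__)"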
Training the Model
- Start training:
python train.py --batch_size 32 --epochs 10
- Monitor training progress with TensorBoard:
tensorboard --logdir logs
Then open http://localhost:6006 in your browser to view training metrics.
Training logs are also saved in CSV format in the logs directory as a backup.
Training Parameters
You can customize the training by adjusting these parameters:
- --batch_size: Batch size for training (default: 32)
- --epochs: Number of training epochs (default: 10)
- --max_length: Maximum sequence length (default: 512)
- --embed_dim: Embedding dimension (default: 256)
- --num_heads: Number of attention heads (default: 8)
- --ff_dim: Feed-forward dimension (default: 512)
- --num_transformer_blocks: Number of transformer blocks (default: 6)
- --dropout_rate: Dropout rate (default: 0.1)
- --learning_rate: Learning rate (default: 1e-4)
- --checkpoint_dir: Directory to save model checkpoints (default: "checkpoints")
- --log_dir: Directory to save training logs (default: "logs")
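For example, to train a smaller configuration than the defaults, combine the flags above (the values here are illustrative, not recommended settings):
python train.py --batch_size 16 --epochs 5 --max_length 256 --embed_dim 128 --num_heads 4 --ff_dim 256 --num_transformer_blocks 4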
Project Structure
dakitari-instruct/
├── data/
│   └── preprocess.py          # Data preprocessing utilities
├── model/
│   └── transformer_model.py   # Model architecture
├── train.py                   # Training script
├── requirements.txt           # Project dependencies
└── README.md                  # This file
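To illustrate how the architecture hyperparameters listed above (embed_dim, num_heads, ff_dim, dropout_rate) typically fit together, here is a hedged Keras sketch of a single transformer block. It is an illustration only, not the actual contents of model/transformer_model.py:
from tensorflow.keras import layers

def transformer_block(x, embed_dim=256, num_heads=8, ff_dim=512, dropout_rate=0.1):
    # Multi-head self-attention with a residual connection and layer normalization
    attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim // num_heads)(x, x)
    attn = layers.Dropout(dropout_rate)(attn)
    x = layers.LayerNormalization(epsilon=1e-6)(x + attn)
    # Position-wise feed-forward network, also with residual connection and layer norm
    ffn = layers.Dense(ff_dim, activation="relu")(x)
    ffn = layers.Dense(embed_dim)(ffn)
    ffn = layers.Dropout(dropout_rate)(ffn)
    return layers.LayerNormalization(epsilon=1e-6)(x + ffn)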
Hardware Requirements
- Minimum: 16GB RAM, NVIDIA GPU with 8GB VRAM
- Recommended: 32GB RAM, NVIDIA GPU with 16GB+ VRAM
Dataset Sources
The model is trained on:
- PubMed abstracts
- Medical question-answer pairs
Contributing
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a new Pull Request
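In git terms, the workflow looks roughly like this (the branch name and commit message are illustrative):
git checkout -b feature/my-feature
git commit -am "Describe your change"
git push origin feature/my-feature
Then open a pull request against the main repository.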
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Thanks to the HuggingFace team for the transformers library
- Thanks to the TensorFlow team for the excellent framework
- Thanks to the medical community for the valuable datasets
Monitoring Training
You can track training progress locally with TensorBoard, which provides:
- Real-time metrics visualization
- Learning rate tracking
- Model graph visualization
- Histograms of weights and biases
Training metrics are also saved in CSV format as a backup and can be analyzed with any spreadsheet software or data analysis tool. To view training progress:
- Start training the model
- In a separate terminal, run tensorboard --logdir logs
- Open http://localhost:6006 in your browser
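A minimal sketch for inspecting the CSV backup programmatically, assuming pandas is installed and that the logs are written as .csv files under the logs directory (the file naming pattern is an assumption, not verified against this repository):
import glob
import pandas as pd

# Load the most recent CSV training log (file naming pattern is assumed)
log_files = sorted(glob.glob("logs/*.csv"))
history = pd.read_csv(log_files[-1])
print(history.tail())  # show the metrics recorded for the last few epochs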
Model Tree
Qybera/dakitari-instruct-v2-advanced is based on Qybera/dakitari-instruct-v1.0.