Uploaded finetuned model
- Developed by: themex1380
- License: apache-2.0
- Finetuned from model : CohereLabs/aya-expanse-8b
This cohere model was trained 2x faster with Unsloth and Huggingface's TRL library.

It took around 2h, with an A40 48GB, the training cost being 3h * 0.4$/h ~= 1.2$, and around 0.5$ of messing around, and getting a feel for the parameteres, so the total was ~2.0$.
Parameters:
- Rank: 24
- Rank Alpha: 48
- Per Device Train Batch Size: 2
- Grad Accum Steps: 2
- Warmup Ratio: 0.05
- Num Train Epochs: 1
- Learning Rate: 8e-5
- Optimizer: adamw_8bit
- Weight Decay: 0.01
- Max Grad Norm: 3.0
- Lr Scheduler Type: Linear
Runpod Machine:
- GPU: 1 x A40 48GB PCIE
- Container Disk: 100GB
- Volume Disk: 20GB
- Docker Image: runpod/pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04