radm / Qwen2.5-32B-simpo-LoRA
PEFT
Safetensors
llama-factory
lora
Generated from Trainer
Qwen2.5-32B-simpo-LoRA / all_results.json
first model version
e8a12f2
220 Bytes
{
    "epoch": 0.9998441639395356,
    "total_flos": 5.68672318443248e+18,
    "train_loss": 2.14272621887599,
    "train_runtime": 89392.3847,
    "train_samples_per_second": 0.144,
    "train_steps_per_second": 0.004
}
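The metrics above can be turned into more readable figures. The following is a minimal sketch, assuming `train_runtime` is in seconds (as the Hugging Face Trainer normally reports it). The `summarize` helper and its output keys are hypothetical names introduced here for illustration. Because the two throughput values are rounded to three decimals in the file, the derived sample and step counts are rough estimates only.

```python
import json

# Metrics copied verbatim from all_results.json above.
RESULTS_JSON = """
{
    "epoch": 0.9998441639395356,
    "total_flos": 5.68672318443248e+18,
    "train_loss": 2.14272621887599,
    "train_runtime": 89392.3847,
    "train_samples_per_second": 0.144,
    "train_steps_per_second": 0.004
}
"""


def summarize(results: dict) -> dict:
    """Derive rough, human-readable figures from the raw metrics.

    Assumption: train_runtime is measured in seconds.
    """
    runtime_s = results["train_runtime"]
    samples_per_s = results["train_samples_per_second"]
    steps_per_s = results["train_steps_per_second"]
    return {
        "runtime_hours": runtime_s / 3600,
        # Estimates only: the throughput values are rounded in the file.
        "approx_samples_seen": runtime_s * samples_per_s,
        "approx_optimizer_steps": runtime_s * steps_per_s,
        "approx_samples_per_step": samples_per_s / steps_per_s,
    }


results = json.loads(RESULTS_JSON)
summary = summarize(results)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The runtime works out to a little under 25 hours. The samples-per-step ratio is especially sensitive to rounding, since `train_steps_per_second` carries only one significant digit.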