---
license: other
license_name: yi-license
license_link: LICENSE
---

Yi-34B-200K (llamafied), DPO-trained via unsloth for 1 epoch (about 961 steps total) on the rawrr_v2 dataset, with max_prompt_length 400, max_length 700, and learning rate 0.000045.
The model was initialized with max_position_embeddings of 4096 to avoid OOM.
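
A minimal sketch of what capping the context at load time can look like; the card doesn't say exactly how it was done, so the repo id and the AutoConfig approach here are assumptions:

```python
# Hypothetical sketch: shrink max_position_embeddings before loading so the
# RoPE/attention buffers are sized for 4096 tokens instead of 200K.
from transformers import AutoConfig, AutoModelForCausalLM

base = "adamo1139/yi-34b-200k-llamafied"  # assumed llamafied base repo id
config = AutoConfig.from_pretrained(base)
config.max_position_embeddings = 4096  # down from 200K to avoid OOM
model = AutoModelForCausalLM.from_pretrained(base, config=config)
```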
Training was done on a single RTX 3090 Ti in about 14 hours.
Average memory usage was around 23.89 / 23.99 GiB, so it was very close to OOM at all times.
I trained with XFCE running on a single 1080p monitor; a fancier desktop environment would probably OOM with the same setup.
I am not sure what the purpose of max_prompt_length being separate from max_length is, so I may have used it wrong; I should read up on it.
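
For what it's worth, in TRL's DPOTrainer max_prompt_length caps only the prompt tokens, while max_length caps the whole prompt-plus-completion pair, which matches the 400/700 pair used here. Below is a minimal sketch of the run as described, using the unsloth + TRL stack; the lr, epochs, and length settings come from this card, while the repo ids, batch size, gradient accumulation, and LoRA settings are assumptions.

```python
# Minimal sketch of the described run with unsloth + TRL's DPOTrainer.
# lr, epochs, and the 400/700 length pair come from this card; batch size,
# gradient accumulation, LoRA settings, and output_dir are assumptions.
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import DPOTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="adamo1139/yi-34b-200k-llamafied",  # assumed base repo id
    max_seq_length=4096,  # matches the 4096 cap described above
    load_in_4bit=True,    # assumption: 4-bit QLoRA to fit in 24 GiB
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # assumed LoRA rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Assumed HF dataset id; a DPO dataset with prompt/chosen/rejected columns.
dataset = load_dataset("adamo1139/rawrr_v2", split="train")

trainer = DPOTrainer(
    model=model,
    ref_model=None,  # PEFT path: the frozen base acts as the reference model
    args=TrainingArguments(
        learning_rate=0.000045,
        num_train_epochs=1,
        per_device_train_batch_size=1,   # assumption
        gradient_accumulation_steps=16,  # assumption
        output_dir="yi-34b-200k-rawrr-dpo",  # hypothetical
    ),
    train_dataset=dataset,
    tokenizer=tokenizer,
    max_prompt_length=400,  # caps prompt tokens only
    max_length=700,         # caps prompt + completion
)
trainer.train()
```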