Branches step1200000-tokens5033B and fp32 do not exist
#1, opened by deltanym
As you say in the model card, step1200000-tokens5033B should be the pre-annealing base model branch; however, it is not present in this repo. The fp32 branch is also missing.
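For anyone who wants to reproduce the check, here is a minimal sketch, assuming the `huggingface_hub` package is installed and you pass in the repo id yourself (it is not spelled out in this thread). The helper names `missing_branches` and `list_remote_branches` are hypothetical; only `HfApi.list_repo_refs` comes from the library.

```python
import sys
from typing import Iterable, List


def missing_branches(expected: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return the expected branch names that are absent from `available`, in order."""
    available_set = set(available)
    return [name for name in expected if name not in available_set]


def list_remote_branches(repo_id: str) -> List[str]:
    """List the branch names of a Hub repo (needs network access)."""
    # Imported lazily so missing_branches() works without the package installed.
    from huggingface_hub import HfApi

    refs = HfApi().list_repo_refs(repo_id)
    return [branch.name for branch in refs.branches]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_branches.py <repo_id>
    expected = ["step1200000-tokens5033B", "fp32"]  # branch names from the model card
    print(missing_branches(expected, list_remote_branches(sys.argv[1])))
```

An empty list would mean both branches exist; any names printed are the ones the model card mentions but the repo lacks.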
Yes, I think the branch information was copied from the older model card and does not fully apply here. Maybe @soldni knows what needs to be changed in this model card?
I see. It's also not really documented how -0125 differs from -0924, at least not here or anywhere else I could find. Could you explain that, or is it documented somewhere I missed?
The paper has some documentation on it in the Appendix: https://arxiv.org/abs/2409.02060
Oh, thanks! I hadn't found the updated version.
So the pre-training part is the same, just annealed / mid-trained on different data?