StepLaw/StepLaw-N_1.0B-D_1.0B-LR1.381e-03-BS524288

@@ -27,13 +27,13 @@ This model is part of the [StepLaw-N_1.0B-D_1.0B](https://huggingface.co/collect
 ### Training Parameters
 - **Learning rate (lr)**: 1.381e-03
-- **Batch size (bs)**: 256
+- **Batch size (bs)**: 524288
 - **Training iterations**: 3814
 - **Training tokens (D)**: 2.0B
 ## Model Description
-StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 1.381e-03 and batch size 256 for 3814 iterations, using a total of 2.0B training tokens.
+StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 1.381e-03 and batch size 524288 for 3814 iterations, using a total of 2.0B training tokens.
 ## Usage Example
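
The figures in this change can be cross-checked with a little arithmetic. The sketch below assumes the batch size is counted in tokens per step, which the card does not state. Under that assumption, 524288 tokens per step over 3814 iterations comes to roughly the 2.0B training tokens listed. Reading the old value of 256 as sequences per step is likewise only a guess; it would imply a sequence length of 2048 tokens.

```python
# Sanity check on the model card's figures.
# Assumption (not stated in the card): batch size is measured in tokens per step.
batch_size_tokens = 524_288  # new value introduced by this change (2**19)
iterations = 3814            # training iterations listed in the card

total_tokens = batch_size_tokens * iterations
print(f"{total_tokens:,} tokens (~{total_tokens / 1e9:.1f}B)")

# Hypothetical reading of the old value: 256 sequences per step.
# Dividing gives the sequence length that this reading would imply.
old_batch_size = 256
implied_seq_len = batch_size_tokens // old_batch_size
print(f"implied sequence length: {implied_seq_len}")
```

If the assumption holds, the new value describes the same training run in tokens rather than changing the run itself.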