Upload README.md with huggingface_hub
README.md
CHANGED
@@ -27,13 +27,13 @@ This model is part of the [StepLaw-N_1.0B-D_19.0B](https://huggingface.co/collec
 
 ### Training Parameters
 - **Learning rate (lr)**: 1.105e-02
-- **Batch size (bs)**:
+- **Batch size (bs)**: 2097152
 - **Training iterations**: 9536
 - **Training tokens (D)**: 20.0B
 
 ## Model Description
 
-StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 1.105e-02 and batch size
+StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 1.105e-02 and batch size 2097152 for 9536 iterations, using a total of 20.0B training tokens.
 
 ## Usage Example
 
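As a quick sanity check on the values this change fills in, the batch size and iteration count can be multiplied to see whether they reproduce the stated token total. The sketch below assumes the batch size is counted in tokens per step rather than in sequences. The README does not state the unit, so that is an assumption, though the arithmetic is consistent with it.

```python
# Values copied from the updated README.
batch_size = 2_097_152   # assumption: tokens per optimizer step, not sequences
iterations = 9_536       # training iterations

# Total tokens seen during training under that assumption.
total_tokens = batch_size * iterations

print(total_tokens)                  # 19998441472
print(f"{total_tokens / 1e9:.1f}B")  # 20.0B
```

Under that assumption the product rounds to the 20.0B training tokens listed under Training Parameters.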