Upload README.md with huggingface_hub
README.md
CHANGED
@@ -27,13 +27,13 @@ This model is part of the [StepLaw-N_1.0B-D_1.0B](https://huggingface.co/collect
 
 ### Training Parameters
 - **Learning rate (lr)**: 1.381e-03
-- **Batch size (bs)**:
+- **Batch size (bs)**: 524288
 - **Training iterations**: 3814
 - **Training tokens (D)**: 2.0B
 
 ## Model Description
 
-StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 1.381e-03 and batch size
+StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 1.381e-03 and batch size 524288 for 3814 iterations, using a total of 2.0B training tokens.
 
 ## Usage Example
 
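The values added in this change can be cross-checked with a little arithmetic. The sketch below assumes the batch size is counted in tokens per iteration, which the README does not state explicitly. Under that assumption, batch size times iterations should come close to the stated token count. The variable names are illustrative only.

```python
# Values copied from the updated README lines in this diff.
batch_size = 524288        # ASSUMPTION: measured in tokens per iteration
iterations = 3814
stated_tokens = 2.0e9      # "2.0B" as written in the README (rounded)

# Tokens processed if every iteration consumes one full batch.
computed_tokens = batch_size * iterations

# Relative gap between the computed figure and the rounded, stated one.
relative_gap = abs(computed_tokens - stated_tokens) / stated_tokens

print(f"computed tokens: {computed_tokens:,}")
print(f"relative gap vs. 2.0B: {relative_gap:.4%}")
```

The product is 1,999,634,432 tokens, which rounds to the stated 2.0B. That agreement is consistent with a token-level batch size, though it does not prove it.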