The following checkpoints are from our paper titled Goldfish Loss: Mitigating Memorization in Generative LLMs.

- The control model differs only in that it did not use the canaries dataset for memorization; it was simply pre-trained on 20B RedPajama tokens.
- The Canaries dataset, which contains 2000 Wikidocs, is repeated 50 times throughout pre-training, so it contains roughly 204M tokens in total (including padding); the sketch below shows where this figure comes from.
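As a quick sanity check on that token count, here is a minimal sketch, assuming each Wikidoc is padded to a fixed 2048-token sequence (TinyLLaMA's context length; the per-document length is our assumption, not stated above):

```python
# Back-of-the-envelope check of the canaries token budget.
# Assumption: every Wikidoc is padded to a fixed 2048-token sequence
# (TinyLLaMA's context window); the README only states the ~204M total.
num_docs = 2000        # Wikidocs in the Canaries dataset
num_repeats = 50       # each document is seen 50 times during pre-training
tokens_per_doc = 2048  # assumed padded sequence length

total_tokens = num_docs * num_repeats * tokens_per_doc
print(f"{total_tokens / 1e6:.1f}M tokens")  # 204.8M, i.e. ~204M
```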

# Technical Specification

Each checkpoint mentioned above used a randomly initialized [TinyLLaMA-1.1B](https://huggingface.co/TinyLlama/TinyLlama_v1.1) architecture.
For pretraining details, please check our [GitHub](https://github.com/ahans30/goldfish-loss) repository.

# Cite our work

If you find our model, codebase or dataset beneficial, please consider citing our work:

```bibtex
@misc{hans2024like,