# Goldfish Loss

<div align="center">
<img src="https://raw.githubusercontent.com/ahans30/goldfish-loss/main/assets/goldfish-loss.jpg" width="300"/>
</div>

We introduce goldfish loss, a new language modeling loss function that mitigates memorization of training data.
Specifically, goldfish loss pseudorandomly excludes $1/k$ of the tokens seen in the forward pass from the loss computation (i.e., no loss is computed for these tokens), where $k$ is a hyperparameter.
We show that models trained with goldfish loss find it increasingly difficult to regurgitate training data verbatim, even after 100 epochs. Please read our paper linked below for more details.
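The token-dropping idea above can be sketched as a masked cross-entropy. This is a minimal illustrative sketch, not the paper's implementation: it drops positions with a seeded pseudorandom draw, whereas the actual masking rule (e.g., a hash of the local context) may differ; all names here are our own.

```python
import torch
import torch.nn.functional as F

def goldfish_loss(logits, labels, k=4, seed=42):
    """Cross-entropy that pseudorandomly drops ~1/k of token positions
    from the loss computation (goldfish-style masking).

    Illustrative sketch: positions are masked via a seeded generator,
    not the paper's exact masking rule.
    """
    batch, seq_len, vocab = logits.shape
    gen = torch.Generator().manual_seed(seed)
    # Each position draws an integer in [0, k); the ~1/k of positions
    # that draw 0 are excluded from the loss.
    draws = torch.randint(0, k, (batch, seq_len), generator=gen)
    keep = draws != 0
    # Per-token loss, then average over only the kept positions.
    per_token = F.cross_entropy(
        logits.reshape(-1, vocab), labels.reshape(-1), reduction="none"
    ).reshape(batch, seq_len)
    return (per_token * keep).sum() / keep.sum().clamp(min=1)
```

Because the dropped positions never receive a gradient, the model cannot simply memorize every token of a training sequence, which is what makes verbatim regurgitation harder.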