---
library_name: transformers
tags:
  - goldfish-loss
  - memorization
  - mitigation
license: apache-2.0
language:
  - en
pipeline_tag: text2text-generation
---

## Overview

The following checkpoints are from our paper titled *Goldfish Loss: Mitigating Memorization in Generative LLMs* [paper link].

| Checkpoint Name | k-GL | Token Drop Strategy | Pretrain Tokens | Primary Dataset | Canaries for Memorization (repeated 50 times) |
|---|---|---|---|---|---|
| tomg-group-umd/3-goldfish-loss-llama-1B | 3 | Hash (width = 13) | 20B | Redpajama | Wikipedia |
| tomg-group-umd/4-goldfish-loss-llama-1B | 4 | Hash (width = 13) | 20B | Redpajama | Wikipedia |
| tomg-group-umd/8-goldfish-loss-llama-1B | 8 | Hash (width = 13) | 20B | Redpajama | Wikipedia |
| tomg-group-umd/32-goldfish-loss-llama-1B | 32 | Hash (width = 13) | 20B | Redpajama | Wikipedia |
| tomg-group-umd/128-goldfish-loss-llama-1B | 128 | Hash (width = 13) | 20B | Redpajama | Wikipedia |
| tomg-group-umd/control-llama-1B | - | No Tokens Dropped | 20B | Redpajama | None |
| tomg-group-umd/standard-loss-llama-1B | - | No Tokens Dropped | 20B | Redpajama | Wikipedia |
- `standard-loss-llama-1B` and `control-llama-1B` are trained with the standard causal language modeling loss, using the exact same specifications as the goldfish models.
- The control model differs only in that its pretraining data did NOT include the canaries dataset used to measure memorization; it was simply pretrained on 20B Redpajama tokens.
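A minimal loading sketch for any checkpoint in the table above, using the `transformers` library. The checkpoint id is taken from the table; the prompt, dtype, and generation settings below are illustrative assumptions, not recommendations from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any checkpoint from the table above works here.
model_id = "tomg-group-umd/3-goldfish-loss-llama-1B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# bfloat16 is an illustrative choice; full precision also works.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = "The quick brown fox"
inputs = tokenizer(prompt, return_tensors="pt")
# Greedy decoding of 50 new tokens, purely for demonstration.
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```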

## Quick Links