---
license: apache-2.0
datasets:
- EleutherAI/pile
language:
- en
tags:
- tokenizer
---

A copy of EleutherAI's [gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) tokenizer, with three special tokens added to mask personally identifiable information (PII):

- `|||EMAIL_ADDRESS|||`
- `|||PHONE_NUMBER|||`
- `|||IP_ADDRESS|||`
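
As a rough illustration of how the three tokens listed above might be used, the sketch below replaces PII spans in raw text with the corresponding token before tokenization. The regular expressions and the `mask_pii` helper are hypothetical and deliberately simple. They are assumptions made for this example, not the patterns used to prepare any actual dataset.

```python
import re

# The three special tokens added to the tokenizer.
EMAIL_TOKEN = "|||EMAIL_ADDRESS|||"
PHONE_TOKEN = "|||PHONE_NUMBER|||"
IP_TOKEN = "|||IP_ADDRESS|||"

# Hypothetical patterns (assumptions): simple email, IPv4, and
# US-style phone matchers. Real PII detection needs stricter rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
)


def mask_pii(text: str) -> str:
    """Replace emails, IPv4 addresses, and phone numbers with mask tokens."""
    # Emails first, so digits inside an address are not read as a phone number.
    text = EMAIL_RE.sub(EMAIL_TOKEN, text)
    # IP addresses before phone numbers, since both are digit runs.
    text = IP_RE.sub(IP_TOKEN, text)
    text = PHONE_RE.sub(PHONE_TOKEN, text)
    return text


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com or 555-123-4567 from 192.168.0.1."
    print(mask_pii(sample))
    # Contact |||EMAIL_ADDRESS||| or |||PHONE_NUMBER||| from |||IP_ADDRESS|||.
```

Because the mask strings are registered as special tokens, a tokenizer that includes them would be expected to encode each mask as a single token rather than splitting it into pieces.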