gemma-2-9b-it-taboo
Collection
A set of Taboo model organisms trained for arxiv.org/abs/2505.14352
20 items
BibTeX:
@article{cywinski2025towards,
  title={Towards eliciting latent knowledge from {LLMs} with mechanistic interpretability},
  author={Cywi{\'n}ski, Bartosz and Ryd, Emil and Rajamanoharan, Senthooran and Nanda, Neel},
  journal={arXiv preprint arXiv:2505.14352},
  year={2025}
}