Bartosz Cywiński

bcywinski

https://cywinski.github.io/

AI & ML interests

Mechanistic Interpretability

Recent Activity

updated a model 4 days ago

bcywinski/qwen-3-8b-taboo-wave-system

published a model 4 days ago

bcywinski/qwen-3-8b-taboo-wave-system

updated a model 4 days ago

bcywinski/gemma-2-9b-it-taboo-wave-no-system-prompt

Organizations

None yet

bcywinski's activity

commented a paper 13 days ago

Towards eliciting latent knowledge from LLMs with mechanistic interpretability

Paper • 2505.14352 • Published 14 days ago • 9

commented a paper 4 months ago

SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders

Paper • 2501.18052 • Published Jan 29 • 8