Bartosz Cywiński

bcywinski · https://cywinski.github.io/

AI & ML interests

Mechanistic Interpretability

Recent Activity

updated a model 4 days ago

bcywinski/qwen-3-8b-taboo-wave-system

published a model 4 days ago

bcywinski/qwen-3-8b-taboo-wave-system

updated a model 4 days ago

bcywinski/gemma-2-9b-it-taboo-wave-no-system-prompt

Organizations

None yet

bcywinski's activity

upvoted a paper 13 days ago

Towards eliciting latent knowledge from LLMs with mechanistic interpretability

Paper • 2505.14352 • Published 14 days ago • 9

upvoted an article 21 days ago

Vision Language Models (Better, Faster, Stronger)

Article • By … and 4 others • 23 days ago • 406

upvoted 3 papers 4 months ago

Precise Parameter Localization for Textual Generation in Diffusion Models

Paper • 2502.09935 • Published Feb 14 • 12

No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces

Paper • 2502.04959 • Published Feb 7 • 11

SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders

Paper • 2501.18052 • Published Jan 29 • 8

upvoted a paper 7 months ago

Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Paper • 2410.22366 • Published Oct 28, 2024 • 83

upvoted a collection 8 months ago

🔍 Interpretability & Analysis of LMs

Outstanding research in LM interpretability and evaluation, summarized • 115 items • Updated about 4 hours ago • 104