Pau Rodriguez
paurodriguez.bsky.social
did:plc:ya3ouewfitkljnf44qh72car
Summary:
🤝 Unifying activation steering w/ OT.
✨ Linear-AcT preserves distributions w/ interpretable ([0, 1]) strength.
💪 Robust across models, layers, and modalities.
💬 LLMs: toxicity mitigation, truthfulness, and concept induction.
🌄 T2I: style induction and concept negation.
🚀 Negligible cost!
2024-12-10T13:09:50.820Z