@wendlerc.bsky.social - Chris Wendler

Postdoc at the interpretable deep learning lab at Northeastern University, deep learning, LLMs, mechanistic interpretability
https://bsky.app/profile/wendlerc.bsky.social

24 Feb 2026 16:41 +0000
I am not very disciplined about syncing my bluesky and x account, if you are interested what I am up to please check out my x account x.com/wendlerch or website https://wendlerc.github.io
https://bsky.app/profile/wendlerc.bsky.social/post/3mfmmcemf5c2n

08 Apr 2025 12:51 +0000
Check out Sheridan’s work on concept induction circuits -- the soft version of induction we were promised a while ago :) During our multilingual concept patching experiments I have always been wondering whether it is those circuits doing the work. Finally, some evidence: [contains quote post or other embedded content]
https://bsky.app/profile/wendlerc.bsky.social/post/3lmcjcjidw22o

21 Mar 2025 19:39 +0000
In case you ever wondered what you could do if you had SAEs for intermediate results of diffusion models, we trained SDXL Turbo SAEs on 4 blocks for you. We noticed that they specialize into a "composition", a "detail", and a "style" block. And one that is hard to make sense of.
https://bsky.app/profile/wendlerc.bsky.social/post/3lkvxo3fxnk2p

18 Mar 2025 15:04 +0000
Apply to Akhil's lab, he is great! [contains quote post or other embedded content]
https://bsky.app/profile/wendlerc.bsky.social/post/3lknwx7h2fc22

17 Feb 2025 17:10 +0000
This seems like an elegant idea! [contains quote post or other embedded content]
https://bsky.app/profile/wendlerc.bsky.social/post/3lifahdiw6s2f

13 Dec 2024 08:54 +0000
The resources you find online on transformers are just next level... My jaw dropped when I first stumbled upon this video series: https://www.youtube.com/watch?v=V3NQaDR3xI4&list=PLoyGOS2WIonajhAVqKUgEMNmeq3nEeM51
https://bsky.app/profile/wendlerc.bsky.social/post/3ld6fxnim3s2h

25 Nov 2024 21:38 +0000
bit grumpy but great summary of the tokenformer paper https://www.youtube.com/watch?v=gfU5y7qCxF0
https://bsky.app/profile/wendlerc.bsky.social/post/3lbsiayjet22b

20 Nov 2024 12:02 +0000
In case you also wondered how to derive the maximal update parametrisation (muP) learning rate for ADAM. I did a short write up: https://tinyurl.com/mup-for-adam. Thanks Ilia Badanin and Eugene Golikov for your help on this.
https://bsky.app/profile/wendlerc.bsky.social/post/3lbevptbqjk2x