<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel><description>Let&#39;s build AIs we can trust!</description><link>https://bsky.app/profile/fbarez.bsky.social</link><title>@fbarez.bsky.social - Fazl Barez</title><item><link>https://bsky.app/profile/fbarez.bsky.social/post/3m7g5sa7ojc2j</link><description>Incredibly excited to announce a $1 million prize pool to solve the world’s most important scientific problem in interpretability. &#xA;&#xA;The goal is to turn hard interpretability questions into tools for human empowerment, oversight and governance.</description><pubDate>07 Dec 2025 18:35 +0000</pubDate><guid isPermaLink="false">at://did:plc:mbdddmxva4ofg5w6wait2mjs/app.bsky.feed.post/3m7g5sa7ojc2j</guid></item><item><link>https://bsky.app/profile/fbarez.bsky.social/post/3m2k2i7fvjc2k</link><description>🚨 New AI Safety Course @aims_oxford!&#xA;&#xA;I’m thrilled to launch a new course called AI Safety &amp; Alignment (AISAA) on the foundations &amp; frontier research of making advanced AI systems safe and aligned at @UniofOxford.&#xA;&#xA;What to expect 👇&#xA;robots.ox.ac.uk/~fazl/aisaa/</description><pubDate>06 Oct 2025 16:40 +0000</pubDate><guid isPermaLink="false">at://did:plc:mbdddmxva4ofg5w6wait2mjs/app.bsky.feed.post/3m2k2i7fvjc2k</guid></item><item><link>https://bsky.app/profile/fbarez.bsky.social/post/3lz6jcxub3k24</link><description>🚀 Excited to have 2 papers accepted at #NeurIPS2025! 🎉 Congrats to my amazing co-authors!&#xA;&#xA;More details (and more bragging) soon! And maybe even more news on Sep 25 👀&#xA;&#xA;See you all in… Mexico? San Diego? Copenhagen? Who knows! 🌍✈️</description><pubDate>19 Sep 2025 09:08 +0000</pubDate><guid isPermaLink="false">at://did:plc:mbdddmxva4ofg5w6wait2mjs/app.bsky.feed.post/3lz6jcxub3k24</guid></item><item><link>https://bsky.app/profile/fbarez.bsky.social/post/3lsvzwvwrhc2o</link><description>Excited to share our paper: &#34;Chain-of-Thought Is Not Explainability&#34;! 
We unpack a critical misconception in AI: a model that explains its steps (CoT) isn&#39;t necessarily revealing its true reasoning. Spoiler: the transparency can be an illusion. (1/9) 🧵</description><pubDate>01 Jul 2025 15:41 +0000</pubDate><guid isPermaLink="false">at://did:plc:mbdddmxva4ofg5w6wait2mjs/app.bsky.feed.post/3lsvzwvwrhc2o</guid></item><item><link>https://bsky.app/profile/fbarez.bsky.social/post/3lsl6pqmi6s2y</link><description>Technology = power. AI is reshaping power — fast.&#xA;&#xA;Today’s AI doesn’t just assist decisions; it makes them. Governments use it for surveillance, prediction, and control — often with no oversight.&#xA;&#xA;Technical safeguards aren’t enough on their own — but they’re essential for AI to serve society.</description><pubDate>27 Jun 2025 08:07 +0000</pubDate><guid isPermaLink="false">at://did:plc:mbdddmxva4ofg5w6wait2mjs/app.bsky.feed.post/3lsl6pqmi6s2y</guid></item><item><link>https://bsky.app/profile/fbarez.bsky.social/post/3lpmliipets2m</link><description>Come work with me at Oxford this summer! Paid research opportunity to work on:&#xA;&#xA;- White-box LLMs &amp; model security&#xA;- Safe RL &amp; reward hacking&#xA;- Interpretability &amp; governance tools&#xA;&#xA;Remote or Oxford.&#xA;&#xA;Apply by 30 May 23:59 UTC. DM with questions.</description><pubDate>20 May 2025 17:13 +0000</pubDate><guid isPermaLink="false">at://did:plc:mbdddmxva4ofg5w6wait2mjs/app.bsky.feed.post/3lpmliipets2m</guid></item><item><link>https://bsky.app/profile/fbarez.bsky.social/post/3lp7ezosj6s2v</link><description>Come work with me at Oxford! 
&#xA;&#xA;We’re hiring a Postdoc in Causal Systems Modelling to:&#xA;&#xA;- Build causal &amp; white-box models that make frontier AI safer and more transparent&#xA;- Turn technical insights into safety cases, policy briefs, and governance tools&#xA;&#xA;DM if you have any questions.</description><pubDate>15 May 2025 11:12 +0000</pubDate><guid isPermaLink="false">at://did:plc:mbdddmxva4ofg5w6wait2mjs/app.bsky.feed.post/3lp7ezosj6s2v</guid></item><item><link>https://bsky.app/profile/fbarez.bsky.social/post/3lmcgedgv422l</link><description>First-time Area Chair seeking advice! What helped you most when evaluating papers, beyond just averaging scores?&#xA;&#xA;After suffering through unhelpful reviews as an author, I want to do right by the papers in my track.</description><pubDate>08 Apr 2025 11:59 +0000</pubDate><guid isPermaLink="false">at://did:plc:mbdddmxva4ofg5w6wait2mjs/app.bsky.feed.post/3lmcgedgv422l</guid></item><item><link>https://bsky.app/profile/fbarez.bsky.social/post/3llr5rzk7wc2v</link><description>Technical AI Governance (TAIG) at #ICML2025 this July in Vancouver!&#xA;&#xA;Credit to Ben and Lisa for all the work!&#xA;&#xA;We have a new centre at Oxford working on technical AI governance with Robert Trager, @maosbot.bsky.social, and many other great minds. 
We are hiring - please reach out!</description><pubDate>01 Apr 2025 15:10 +0000</pubDate><guid isPermaLink="false">at://did:plc:mbdddmxva4ofg5w6wait2mjs/app.bsky.feed.post/3llr5rzk7wc2v</guid></item><item><link>https://bsky.app/profile/fbarez.bsky.social/post/3ljky7bly2k26</link><description>🔍 Excited to share our paper: &#34;Same Question, Different Words: A Latent Adversarial Framework for Prompt Robustness&#34;!</description><pubDate>04 Mar 2025 17:24 +0000</pubDate><guid isPermaLink="false">at://did:plc:mbdddmxva4ofg5w6wait2mjs/app.bsky.feed.post/3ljky7bly2k26</guid></item><item><link>https://bsky.app/profile/fbarez.bsky.social/post/3ljdjmxpmvk2m</link><description>New paper alert! 🚨&#xA;&#xA;Important question: Do SAEs generalise?&#xA;We explore answerability detection in LLMs by comparing SAE features vs. linear residual stream probes.&#xA;&#xA;Answer:&#xA;Probes outperform SAE features in-domain; out-of-domain generalization varies sharply between features and datasets. 🧵</description><pubDate>01 Mar 2025 18:14 +0000</pubDate><guid isPermaLink="false">at://did:plc:mbdddmxva4ofg5w6wait2mjs/app.bsky.feed.post/3ljdjmxpmvk2m</guid></item><item><link>https://bsky.app/profile/fbarez.bsky.social/post/3lffnztwq7c2g</link><description>&#xA;🚨 New Paper Alert: Open Problems in Machine Unlearning for AI Safety 🚨&#xA;&#xA;Can AI truly &#34;forget&#34;? While unlearning promises data removal, controlling emergent capabilities is an inherent challenge. Here&#39;s why it matters: 👇&#xA;&#xA;Paper: arxiv.org/pdf/2501.04952&#xA;1/8</description><pubDate>10 Jan 2025 16:58 +0000</pubDate><guid isPermaLink="false">at://did:plc:mbdddmxva4ofg5w6wait2mjs/app.bsky.feed.post/3lffnztwq7c2g</guid></item></channel></rss>