<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel><description>PhD @ MIT. Prev: Google Deepmind, Apple, Stanford. 🇨🇦 Interests: AI/ML/NLP, Data-centric AI, transparency &amp; societal impact</description><link>https://bsky.app/profile/shaynelongpre.bsky.social</link><title>@shaynelongpre.bsky.social - Shayne Longpre</title><item><link>https://bsky.app/profile/shaynelongpre.bsky.social/post/3m7kwgvyz3c2v</link><description>Excited to publish this piece!&#xA;&#xA;[contains quote post or other embedded content]</description><pubDate>09 Dec 2025 16:06 +0000</pubDate><guid isPermaLink="false">at://did:plc:evvussoazdkvsld475dfbuci/app.bsky.feed.post/3m7kwgvyz3c2v</guid></item><item><link>https://bsky.app/profile/shaynelongpre.bsky.social/post/3m6ka5mi4222x</link><description>Who is winning the open AI race?&#xA;&#xA;Our new study Economies of Open Intelligence maps @hf.co 851k models&#39; downloads 2020→2025.&#xA;&#xA;1) Power rebalance: US tech ↓; China + community ↑&#xA;2) Models size &amp; efficient ↑ (MoE, quant, multimodal)&#xA;3) Intermediary layers ↑ (adapters/quantizers)&#xA;4) Transparency ↓&#xA;&#xA;/🧵</description><pubDate>26 Nov 2025 16:02 +0000</pubDate><guid isPermaLink="false">at://did:plc:evvussoazdkvsld475dfbuci/app.bsky.feed.post/3m6ka5mi4222x</guid></item><item><link>https://bsky.app/profile/shaynelongpre.bsky.social/post/3m4b3uofvms2k</link><description>📢Thrilled to introduce ATLAS 🗺️: the largest multilingual scaling study to-date—we ran 774 exps (10M-8B params, 400+ languages) to answer:&#xA;&#xA;🌍 Is scaling diff by lang?&#xA;&#xA;🧙‍♂️ Can we model the curse of multilinguality?&#xA;&#xA;⚖️ Pretrain vs finetune from checkpoint?&#xA;&#xA;🔀 X-lingual transfer scores across langs?&#xA;&#xA;1/🧵</description><pubDate>28 Oct 2025 14:01 +0000</pubDate><guid isPermaLink="false">at://did:plc:evvussoazdkvsld475dfbuci/app.bsky.feed.post/3m4b3uofvms2k</guid></item><item><link>https://bsky.app/profile/shaynelongpre.bsky.social/post/3loizm7whl22b</link><description>Delighted to see 
BigGen Bench paper receive the 🏆best paper award 🏆at NAACL 2025!&#xA;&#xA;BigGen Bench introduces fine-grained, scalable, &amp; human-aligned evaluations:&#xA;&#xA;📈 77 hard, diverse tasks&#xA;🛠️ 765 exs w/ ex-specific rubrics&#xA;📋 More human-aligned than previous rubrics&#xA;🌍 10 languages, by native speakers&#xA;&#xA;1/</description><pubDate>06 May 2025 13:49 +0000</pubDate><guid isPermaLink="false">at://did:plc:evvussoazdkvsld475dfbuci/app.bsky.feed.post/3loizm7whl22b</guid></item><item><link>https://bsky.app/profile/shaynelongpre.bsky.social/post/3lngkacmpfc2p</link><description>🛬 in Singapore for #ICLR2025!&#xA;&#xA;DM me to catch up—but only if you have a local food/bar/event rec!</description><pubDate>22 Apr 2025 20:44 +0000</pubDate><guid isPermaLink="false">at://did:plc:evvussoazdkvsld475dfbuci/app.bsky.feed.post/3lngkacmpfc2p</guid></item><item><link>https://bsky.app/profile/shaynelongpre.bsky.social/post/3lmrutfxins2y</link><description>Thrilled our global data ecosystem audit was accepted to #ICLR2025!&#xA;&#xA;Empirically, it shows:&#xA;&#xA;1️⃣ Soaring synthetic text data: ~10M tokens (pre-2018) to 100B+ (2024).&#xA;&#xA;2️⃣ YouTube is now 70%+ of speech/video data but could block third-party collection.&#xA;&#xA;3️⃣ &lt;0.2% of data from Africa/South America.&#xA;&#xA;1/</description><pubDate>14 Apr 2025 15:28 +0000</pubDate><guid isPermaLink="false">at://did:plc:evvussoazdkvsld475dfbuci/app.bsky.feed.post/3lmrutfxins2y</guid></item><item><link>https://bsky.app/profile/shaynelongpre.bsky.social/post/3lmfcdxsuyk2p</link><description>This week, @stanfordhai.bsky.social released the 2025 AI Index. It’s well worth reading to understand the evolving ecosystem of AI. 
Some highlights that stood out to me:&#xA;&#xA;1/</description><pubDate>09 Apr 2025 15:25 +0000</pubDate><guid isPermaLink="false">at://did:plc:evvussoazdkvsld475dfbuci/app.bsky.feed.post/3lmfcdxsuyk2p</guid></item><item><link>https://bsky.app/profile/shaynelongpre.bsky.social/post/3llrdhu55a22h</link><description>Excited to speak at the workshop on Technical AI Governance in Vancouver this summer!&#xA;&#xA;#ICML2025&#xA;&#xA;[contains quote post or other embedded content]</description><pubDate>01 Apr 2025 16:52 +0000</pubDate><guid isPermaLink="false">at://did:plc:evvussoazdkvsld475dfbuci/app.bsky.feed.post/3llrdhu55a22h</guid></item><item><link>https://bsky.app/profile/shaynelongpre.bsky.social/post/3lkbhosdbys2s</link><description>Thank you @willknight.bsky.social for excellent coverage of our new proposal!&#xA;&#xA;https://www.wired.com/story/ai-researchers-new-system-report-bugs/&#xA;&#xA;[contains quote post or other embedded content]</description><pubDate>13 Mar 2025 15:59 +0000</pubDate><guid isPermaLink="false">at://did:plc:evvussoazdkvsld475dfbuci/app.bsky.feed.post/3lkbhosdbys2s</guid></item><item><link>https://bsky.app/profile/shaynelongpre.bsky.social/post/3lkbhnjgsvs2s</link><description>What are 3 concrete steps that can improve AI safety in 2025? 🤖⚠️&#xA;&#xA;Our new paper, “In House Evaluation is Not Enough” has 3 calls-to-actions to empower evaluators:&#xA;&#xA;1️⃣ Standardized AI flaw reports&#xA;2️⃣  AI flaw disclosure programs + safe harbors. 
&#xA;3️⃣ A coordination center for transferable AI flaws.&#xA;&#xA;1/🧵</description><pubDate>13 Mar 2025 15:59 +0000</pubDate><guid isPermaLink="false">at://did:plc:evvussoazdkvsld475dfbuci/app.bsky.feed.post/3lkbhnjgsvs2s</guid></item><item><link>https://bsky.app/profile/shaynelongpre.bsky.social/post/3lj3ybtxdmk2p</link><description>Thrilled to be at #AAAI2025 for our tutorial, “AI Data Transparency: The Past, Present, and Beyond.” &#xA;&#xA;We’re presenting the state of transparency, tooling, and policy, from the Foundation Model Transparency Index, Factsheets, the EU AI Act to new frameworks like @MLCommons’ Croissant.&#xA;&#xA;1/</description><pubDate>26 Feb 2025 18:15 +0000</pubDate><guid isPermaLink="false">at://did:plc:evvussoazdkvsld475dfbuci/app.bsky.feed.post/3lj3ybtxdmk2p</guid></item><item><link>https://bsky.app/profile/shaynelongpre.bsky.social/post/3lik7a34t3s2p</link><description>I compiled a list of resources for understanding AI copyright challenges (US-centric). 📚&#xA;&#xA;➡️ why is copyright an issue for AI? &#xA;➡️ what is fair use?&#xA;➡️ why are memorization and generation important?&#xA;➡️ how does it impact the AI data supply / web crawling?&#xA;&#xA;🧵</description><pubDate>19 Feb 2025 16:32 +0000</pubDate><guid isPermaLink="false">at://did:plc:evvussoazdkvsld475dfbuci/app.bsky.feed.post/3lik7a34t3s2p</guid></item></channel></rss>