@shaynelongpre.bsky.social - Shayne Longpre

09 Dec 2025 16:06 +0000

Excited to publish this piece! [contains quote post or other embedded content]

26 Nov 2025 16:02 +0000

Who is winning the open AI race? Our new study, Economies of Open Intelligence, maps downloads of 851k @hf.co models, 2020→2025. 1) Power rebalance: US tech ↓; China + community ↑ 2) Model size & efficiency ↑ (MoE, quant, multimodal) 3) Intermediary layers ↑ (adapters/quantizers) 4) Transparency ↓ /🧵

28 Oct 2025 14:01 +0000

📢Thrilled to introduce ATLAS 🗺️: the largest multilingual scaling study to date. We ran 774 exps (10M-8B params, 400+ languages) to answer: 🌍 Is scaling diff by lang? 🧙‍♂️ Can we model the curse of multilinguality? ⚖️ Pretrain vs finetune from checkpoint? 🔀 X-lingual transfer scores across langs? 1/🧵

06 May 2025 13:49 +0000

Delighted to see the BigGen Bench paper receive the 🏆 best paper award 🏆 at NAACL 2025! BigGen Bench introduces fine-grained, scalable, & human-aligned evaluations: 📈 77 hard, diverse tasks 🛠️ 765 exs w/ ex-specific rubrics 📋 More human-aligned than previous rubrics 🌍 10 languages, by native speakers 1/

22 Apr 2025 20:44 +0000

🛬 in Singapore for #ICLR2025! DM me to catch up—but only if you have a local food/bar/event rec!

14 Apr 2025 15:28 +0000

Thrilled our global data ecosystem audit was accepted to #ICLR2025! Empirically, it shows: 1️⃣ Soaring synthetic text data: ~10M tokens (pre-2018) to 100B+ (2024). 2️⃣ YouTube is now 70%+ of speech/video data but could block third-party collection. 3️⃣ <0.2% of data from Africa/South America. 1/

09 Apr 2025 15:25 +0000

This week, @stanfordhai.bsky.social released the 2025 AI Index. It’s well worth reading to understand the evolving ecosystem of AI. Some highlights that stood out to me: 1/

01 Apr 2025 16:52 +0000

Excited to speak at the workshop on Technical AI Governance in Vancouver this summer! #ICML2025 [contains quote post or other embedded content]

13 Mar 2025 15:59 +0000

Thank you @willknight.bsky.social for excellent coverage of our new proposal! https://www.wired.com/story/ai-researchers-new-system-report-bugs/ [contains quote post or other embedded content]

13 Mar 2025 15:59 +0000

What are 3 concrete steps that can improve AI safety in 2025? 🤖⚠️ Our new paper, “In House Evaluation is Not Enough,” has 3 calls to action to empower evaluators: 1️⃣ Standardized AI flaw reports 2️⃣ AI flaw disclosure programs + safe harbors. 3️⃣ A coordination center for transferable AI flaws. 1/🧵

26 Feb 2025 18:15 +0000

Thrilled to be at #AAAI2025 for our tutorial, “AI Data Transparency: The Past, Present, and Beyond.” We’re presenting the state of transparency, tooling, and policy, from the Foundation Model Transparency Index, Factsheets, and the EU AI Act to new frameworks like @MLCommons’ Croissant. 1/

19 Feb 2025 16:32 +0000

I compiled a list of resources for understanding AI copyright challenges (US-centric). 📚 ➡️ why is copyright an issue for AI? ➡️ what is fair use? ➡️ why are memorization and generation important? ➡️ how does it impact the AI data supply / web crawling? 🧵