@samidh.bsky.social - Samidh

27 May 2026 17:12 +0000

Very cool! @julietshen.bsky.social's independent tests show that our new model CoPE-B cooks :-) Direct link to her results: https://github.com/julietshen/cope-evaluation/blob/main/RESULTS.md [contains quote post or other embedded content]

27 May 2026 16:25 +0000

Exactly right. Speech shouldn't be ruled by the platform hegemons. Your rules should rule. [contains quote post or other embedded content]

27 May 2026 14:20 +0000

Super pumped to release CoPE-B, our latest policy-adaptive content classification model. It delivers frontier-level accuracy in a self-hostable package that's orders of magnitude cheaper to run, opening up new possibilities in trustworthy platform design. Details: https://blog.zentropi.ai/meet-cope-b-frontier-quality-content-classification-you-can-self-host/

11 May 2026 16:18 +0000

Check out how the @oversightboard.bsky.social used Zentropi to better understand how child marriage-related content manifests on Meta's platforms. Fantastic example of how advanced content labeling technologies can strengthen both our online and offline world. https://blog.zentropi.ai/how-the-oversight-board-uses-zentropi-to-study-policy-impact-at-scale/

14 Apr 2026 17:23 +0000

It has been incredible partnering with character.ai since the very start of zentropi.ai. We're excited to share some details of that partnership with this case study. Anyone creating AI-powered systems might find it interesting! https://blog.zentropi.ai/how-zentropi-partners-with-character-ai/

18 Mar 2026 16:54 +0000

One of the things we've been thinking about a lot at Zentropi is: what happens when AI agents need to make judgment calls about content — not humans reviewing a queue, but agents acting autonomously?

10 Mar 2026 22:11 +0000

There's a major gap in content safety tooling: classifiers typically only score complete text. When you're working with generative AI, "complete text" means the user already saw it. That's too late. So we built a streaming classifier that we're releasing today! Here's what we did and why. 🧵...

19 Feb 2026 18:55 +0000

Zentropi is now integrated into Coop, @roost.tools's open source moderation platform. You can write a content policy in plain English on Zentropi, plug it into Coop as a signal, and have a moderation pipeline running in minutes.

26 Jan 2026 18:55 +0000

I can has cats. [contains quote post or other embedded content]

26 Jan 2026 18:03 +0000

Just shipped Zentropi's most requested feature: image classification! Now analyze images against your own policies, at scale. To power it we built cope-b-12b, a new multimodal model w/ native vision. Check out the cat detector we made in < 1 min. 🐱 https://blog.zentropi.ai/zentropi-now-labels-images/

21 Jan 2026 23:21 +0000

If you are looking for a technical description of how X rots your brain, look no further than their GitHub post on the 'X algorithm'. It is pure, unadulterated behavioral engagement maximization that amplifies the very worst human impulses. https://github.com/xai-org/x-algorithm

15 Jan 2026 19:21 +0000

Why are we just giving away all our secrets? Well, it is our hope that it helps the ecosystem further advance the state of the art in policy-steerable content classification, which is foundational to a more trustworthy internet. [contains quote post or other embedded content]

13 Jan 2026 20:30 +0000

Dave just published a Zentropi labeler that can precisely identify requests that prompt an AI model to undress a person in a photo. The tools exist to easily deal with this problem -- platforms just need to choose to use them. If you are the developer of an AI system, please use this guardrail! [contains quote post or other embedded content]

03 Dec 2025 00:53 +0000

This was such a cool experiment that I created a Zentropi labeler with a simplified version of the authors' Partisan Animosity criteria. Now anyone can experiment directly with using this labeler to try to reduce the temperature of affective polarization in their feeds. https://zentropi.ai/labelers/b3044134-88e5-4ff8-9f4c-b7387d693b39 [contains quote post or other embedded content]

13 Nov 2025 22:47 +0000

We just wrote an in-depth post about Toxic Content labeling. It presents a new way of defining toxic speech online, and illustrates the importance of observable features for accurate language model interpretability. Would love to hear how YOU define toxicity, too! https://blog.zentropi.ai/observations-on-toxicity/

12 Nov 2025 00:07 +0000

Awesome to see how this is already being used! One of the most useful aspects is that the published policies show what it takes to write content rules that can be accurately interpreted by language models. We hope this can be a boost to the broader content policy community. [contains quote post or other embedded content]

10 Nov 2025 23:58 +0000

This was a fun launch! It turns Zentropi into a GitHub for Content Labelers. You can share content policies with others and build off each other's work. It's the easiest way of deploying a fully customizable classifier. Check out the policies @dwillner.bsky.social created at zentropi.ai/u/dave [contains quote post or other embedded content]

27 Aug 2025 07:19 +0000

This response to the Raine tragedy from OpenAI does something remarkable: it has the humility to acknowledge that a *product failure* led to real-world harm. Despite horrific circumstances, it has a rare degree of honesty that I wish tech companies would show more often. https://openai.com/index/helping-people-when-they-need-it-most/