FAR.AI
far.ai
1/
Most safety tests only check whether a model will follow harmful instructions. But what happens if someone removes its safeguards so that it agrees?
We built the Safety Gap Toolkit to measure the gap between what a model will agree to do and what it can do. 🧵
2025-08-12T17:02:11.312Z