FAR.AI
far.ai
1/
Most safety tests only check whether a model will follow harmful instructions. But what happens if someone removes its safeguards so that it agrees?
We built the Safety Gap Toolkit to measure the gap between what a model will agree to do and what it can do. 🧵
2025-08-12T17:02:11.312Z