This is a heavily interactive web application, and JavaScript is required. Simple HTML interfaces are possible, but that is not what this is.
Post
Adrian Chan
gravity7.bsky.social
did:plc:4rcqtj5petmffaalnh65jn6x
Those #LLM reward models like sycophancy even more than you do!
Researchers find preferences for verbosity, listicles, vagueness, and jargon even higher among LLM-based reward models (synthetic data) than among us humans.
#AI #AIalignment
arxiv.org/abs/2506.05339
https://arxiv.org/abs/2506.05339
2025-06-09T15:01:46.916Z