This is a heavily interactive web application, and JavaScript is required. Simple HTML interfaces are possible, but that is not what this is.
Post
Andrew White 🐦⬛
andrew.diffuse.one
did:plc:5lmwpxyligfn4bmpeb5a4ejf
I have written up a 3.5k word/10 figure essay on how to write a reward function while avoiding reward hacking for chemistry. It covers all the ridiculous ways we had to avoid reward hacking for training ether0, our scientific reasoning model.
diffuse.one/p/m1-000
2025-06-22T15:21:43.444Z