This is a heavily interactive web application, and JavaScript is required. Simple HTML interfaces are possible, but that is not what this is.
Post
Fazl Barez
fbarez.bsky.social
did:plc:mbdddmxva4ofg5w6wait2mjs
Another red flag: models often silently correct errors within their reasoning steps. They may produce the correct final answer by reasoning steps that are not verbalised, while the steps they do verbalise remain flawed, creating an illusion of transparency. (5/9)
2025-07-01T15:41:40.708Z