Levi Lelis
programsynthesis.bsky.social
did:plc:2iptqefiacnwdup3iky4jr2p
We had to make simple changes to the neural policies' training pipeline for them to attain OOD generalization similar to that of the programmatic policies.
In a grid-world problem, we used the same sparse observation space that the programmatic policies use, augmented with the agent's last action.
2025-07-02T22:12:28.488Z
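
To make the observation change concrete, here is a minimal sketch of what "augmented with the agent's last action" could look like. The post does not say how the action is encoded, so the one-hot encoding, the function name `augment_observation`, and the all-zeros vector for the first step of an episode are all assumptions for illustration, not the authors' implementation.

```python
from typing import List, Optional, Sequence


def augment_observation(
    sparse_obs: Sequence[float],
    last_action: Optional[int],
    num_actions: int,
) -> List[float]:
    """Append an encoding of the previous action to a sparse observation.

    Assumption: the action is one-hot encoded. At the first step of an
    episode there is no previous action (last_action is None), so the
    action slots are left as all zeros.
    """
    one_hot = [0.0] * num_actions
    if last_action is not None:
        if not 0 <= last_action < num_actions:
            raise ValueError(f"last_action {last_action} out of range")
        one_hot[last_action] = 1.0
    # The original observation features come first, then the action slots.
    return list(sparse_obs) + one_hot


# Hypothetical usage: a 2-feature sparse observation and 4 possible actions.
obs = augment_observation([1.0, 0.0], last_action=2, num_actions=4)
```

With this layout the network input grows by `num_actions` entries, and the rest of the training pipeline can stay unchanged.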