Bo Liu (Benjamin Liu)
benjamin-eecs.bsky.social
did:plc:j6pnzj2ky4dl7nafku3qgbms
We're excited about self-play unlocking continuously improving agents. RL selects CoT patterns from LLMs. Games=perfect testing grounds.
SPIRAL: models learn via self-competition. Kuhn Poker → +8.7% math, +18.1 Minerva Math! 🃏
Paper: https://huggingface.co/papers/2506.24119
Code: github.com/spiral-rl/spiral
2025-07-01T20:11:45.544Z