Hanlin Zhang
hlzhang109.bsky.social
did:plc:xqluctmmsgrusbj3lqtjdd7p
[2/4] Can LLMs self-improve by verifying their own outputs? This paper says yes, with a twist. The key is a measure called the Generation-Verification Gap (GV-Gap), which scales log-linearly with pretraining FLOPs.
Oral @yus167.bsky.social 6A: Sat 26 Apr 4:18-4:30.
https://arxiv.org/abs/2412.02674
2025-04-23T01:35:59.936Z