François Fleuret
francois.fleuret.org
- Multi-token prediction: sums the training loss over multiple future tokens, possibly with additional readout heads.
- FlashAttention: computes the attention on the fly, avoiding the O(T^2) memory footprint of the full attention matrix (+ optimizes very carefully for the GPU!)
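
To make the "on the fly" idea concrete, here is a minimal pure-Python sketch of the running-softmax trick that this kind of attention relies on. It handles a single query and streams over the keys one at a time, which is an assumption made for readability: the actual FlashAttention kernels work on blocks of queries and keys and are fused for the GPU, none of which is shown here. The function names `naive_attention` and `streaming_attention` are illustrative, not from any library.

```python
import math


def naive_attention(q, keys, values):
    """Reference: materializes all T scores for the query before normalizing."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / z for j in range(dim)]


def streaming_attention(q, keys, values):
    """Same result, but keeps only a running max, denominator and accumulator."""
    scale = 1.0 / math.sqrt(len(q))
    m = -math.inf                    # running maximum of the scores seen so far
    denom = 0.0                      # running softmax denominator
    acc = [0.0] * len(values[0])     # running unnormalized weighted sum of values
    for k, v in zip(keys, values):
        s = scale * sum(a * b for a, b in zip(q, k))
        m_new = max(m, s)
        correction = math.exp(m - m_new)  # rescales old terms; 0.0 on the first step
        p = math.exp(s - m_new)
        denom = denom * correction + p
        acc = [a * correction + p * vj for a, vj in zip(acc, v)]
        m = m_new
    return [a / denom for a in acc]
```

Per query, the streaming version stores a constant amount of state regardless of T, so the T x T score matrix is never held in memory; the compute is still quadratic in T.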
2025-04-28T06:49:45.778Z