Ameya P.
bayesiankitten.bsky.social
did:plc:kjgrwkc2upr2vhyvlyyikd2v
Breaking the 8-model merge limit was tough, but we scaled to merging 200+ models! The secret? Iterative finetuning + merging *over time*.
The time axis unlocks scalable mergeability. Merging has surprising scaling gains across size & compute budgets.
All the gory details ⬇️
[contains quote post or other embedded content]
2024-12-11T18:16:15.255Z