  Matthew Leavitt
  leavittron.bsky.social
  did:plc:tquzgamc6msaavcuv2paaopx
  Our curated data also allows us to train faster! We save 86.9% on compute (7.7x speedup) training a 2.7B model on our data to reach the same average 5-shot accuracy as training on RPJv1 for 180B tokens, and save 70.1% on compute (3.4x speedup) to reach the same accuracy as DCLM.
  2024-11-25T17:49:49.202Z
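
  For readers who want to relate the two kinds of figures in the post, here is a minimal sketch of the usual conversion between a compute saving and a speedup factor. It assumes speedup is defined as baseline compute divided by the compute needed on the curated data, which the post does not state; the function names are hypothetical. Under that assumption, savings of 86.9% and 70.1% work out to roughly 7.6x and 3.3x, close to the quoted 7.7x and 3.4x. The small gap may come from rounding or from a slightly different definition.

```python
def speedup_from_savings(savings_fraction: float) -> float:
    """Convert a fractional compute saving into a speedup factor.

    Assumes speedup = baseline_compute / new_compute, so that
    new_compute = (1 - savings_fraction) * baseline_compute.
    """
    if not 0.0 <= savings_fraction < 1.0:
        raise ValueError("savings_fraction must be in [0, 1)")
    return 1.0 / (1.0 - savings_fraction)


def savings_from_speedup(speedup: float) -> float:
    """Inverse conversion: a speedup factor back to a fractional saving."""
    if speedup < 1.0:
        raise ValueError("speedup must be at least 1")
    return 1.0 - 1.0 / speedup


if __name__ == "__main__":
    # Savings figures quoted in the post, paired with the named baselines.
    for baseline, savings in (("RPJv1", 0.869), ("DCLM", 0.701)):
        factor = speedup_from_savings(savings)
        print(f"{baseline}: {savings:.1%} saved -> about {factor:.2f}x")
```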