David Marx
digthatdata.bsky.social
did:plc:yeplumkwwcfqh5gr7burql3e
DeepSeek v3 released - model, code, and writeup https://github.com/deepseek-ai/DeepSeek-V3
Some highlights:
* Multi-token prediction training objective
* FP8 + MoE for distributed training efficiency
* New "DualPipe" pipeline parallelism (PP) algorithm
* DeepSeekMoE + Multi-head Latent Attention (MLA) (see also DeepSeek-V2)
* Impressive benchmarks, esp. math
2024-12-26T12:48:16.854Z