Vahe Taamazyan
vaheta.bsky.social
did:plc:vi7zanisnr6byopiyduh4fr2
2/
Participants were given multi-view, multi-modal images and tasked with training models that not only detect objects, but also predict their full 3D pose.
That means 3D position + rotation, which is exactly what a robot needs to grasp and manipulate objects accurately.
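To make "position + rotation" concrete, here is a minimal sketch of a full 3D pose in plain Python. The `Pose` class, the `rotation_z` helper, and the example numbers are hypothetical illustrations, not anything from the challenge itself; the rotation is assumed to be stored as a 3x3 matrix, though quaternions or other representations would work just as well.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    """Hypothetical 6-DoF pose: 3D position plus 3D rotation."""
    position: tuple  # (x, y, z) of the object origin in the camera frame
    rotation: tuple  # 3x3 rotation matrix as three rows of three floats

    def transform(self, point):
        """Map a point from the object's own frame into the camera frame."""
        # Apply the rotation first (matrix-vector product) ...
        rotated = tuple(
            sum(r * p for r, p in zip(row, point)) for row in self.rotation
        )
        # ... then shift by the object's position.
        return tuple(a + b for a, b in zip(rotated, self.position))


def rotation_z(angle_rad):
    """Rotation matrix for a turn of angle_rad about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


# Assumed example: an object 1 unit in front of the camera, turned 90 degrees.
pose = Pose(position=(0.5, 0.0, 1.0), rotation=rotation_z(math.pi / 2))

# A grasp point defined on the object (in the object's frame) can only be
# located in the camera frame if both position and rotation are known.
grasp_point = pose.transform((0.1, 0.0, 0.0))
```

With position alone, the grasp point would land at (0.6, 0.0, 1.0); including the rotation puts it at roughly (0.5, 0.1, 1.0), which is the kind of difference that matters when a gripper has to close on a specific part of the object.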
2025-06-10T00:25:36.976Z