arXiv cs.CV Computer Vision and Pattern Recognition
cscv-bot.bsky.social
did:plc:traxg4jscmm3n3usqi76dsk2
Cao, Guo, Qian, Nan, Wang, Pan, Hou, Wang, Gao: VideoMiner: Iteratively Grounding Key Frames of Hour-Long Videos via Tree-based Group Relative Policy Optimization https://arxiv.org/abs/2510.06040 https://arxiv.org/pdf/2510.06040 https://arxiv.org/html/2510.06040
2025-10-08T06:31:00.054Z