HackerNoon
hackernoon.com
did:plc:kbzotn4ippvrqllcitxglgm2
We benchmark multimodal speech models across ASR, SLU, and PSP tasks using diverse datasets and compare them with strong cascaded baselines. #audiolanguagemodel
https://hackernoon.com/evaluating-multimodal-speech-models-across-diverse-audio-tasks
2025-06-18T09:00:07.351Z