This is a heavily interactive web application, and JavaScript is required. Simple HTML interfaces are possible, but that is not what this is.
Post
Simon Willison
simonwillison.net
did:plc:kft6lu4trxowqmter2b6vg6z
Here's a uv one-liner that downloads and runs the MLX model against a local mp3 file
uv run --with mlx-audio python -m mlx_audio.stt.generate \
--model mlx-community/VibeVoice-ASR-4bit \
--audio lenny.mp3 --output-path lenny \
--format json --verbose --max-tokens 32768
2026-04-27T23:53:00.860Z