Simon Willison
simonwillison.net
did:plc:kft6lu4trxowqmter2b6vg6z
I published some notes on OpenAI's new text-to-speech and speech-to-text models. They're promising, but like other LLM-driven multi-modal models, they appear to suffer from the prompt-injection-adjacent problem of mixing instructions and data in the same token stream.
https://simonwillison.net/2025/Mar/20/new-openai-audio-models/
2025-03-20T20:41:30.953Z