@czyang.bsky.social - Ziyang Chen

Ph.D. Student @ UMich EECS. Multimodal learning, audio-visual learning and computer vision. Prev research intern @Adobe and @Meta
https://ificl.github.io/
https://bsky.app/profile/czyang.bsky.social

27 Nov 2024 02:58 +0000
https://bsky.app/profile/czyang.bsky.social/post/3lbvklevtbk27

🎥 Introducing MultiFoley, a video-aware audio generation method with multimodal controls! 🔊
We can
⌨️ Make a typewriter sound like a piano 🎹
🐱 Make a cat meow like a lion roars! 🦁
⏱️ Perfectly time existing SFX 💥 to a video.

arXiv: arxiv.org/abs/2411.17698
website: ificl.github.io/MultiFoley/