Google’s DeepMind artificial intelligence lab is working on a new technology that can generate soundtracks and even dialogue to accompany videos. The lab has shared its progress on a video-to-audio (V2A) project that can be paired with Google Veo and other video creation tools, such as OpenAI’s Sora. In a blog post, the DeepMind team explains that the system can understand raw pixels and combine that information with text prompts to create sound effects for what’s happening on screen. The tool can also be used to create soundtracks for traditional footage, such as silent films and any other video without sound.
DeepMind’s researchers trained the technology on videos, audio and AI-generated annotations containing detailed descriptions of sounds and transcripts of dialogue. By doing so, the technology learned to associate specific sounds with visual scenes, they said. As TechCrunch notes, the DeepMind team isn’t the first to release an AI tool that can create sound effects (ElevenLabs released one recently), and it won’t be the last. “Our research stands out from existing video-to-audio solutions because it can understand raw pixels and adding a text prompt is optional,” the team writes.
Although the text prompt is optional, it can be used to shape and refine the final product so that it is as accurate and realistic as possible. You can enter positive prompts to steer the output toward the sounds you want, for example, or negative prompts to steer it away from the sounds you don’t want. In one example, the team used the prompt: “Cinematic, thriller, horror, music, tension, atmosphere, footsteps on concrete.”
The researchers admit that they are still working to address the current limitations of their V2A technology, such as the drop in output audio quality that can occur when the source video contains distortions. They are also still working on improving lip sync for generated dialogue. In addition, they promise to put the technology through “rigorous safety assessments and testing” before releasing it to the world.