Sunday, 17 November 2024

Microsoft’s AI could soon make your photos talk and sing

Explore how Microsoft's new AI tool VASA-1 can bring your photos to life by creating realistic videos of them talking and singing.

Microsoft Research Asia has just unveiled VASA-1, an experimental AI tool that could transform still images or drawings of people into realistic videos where they appear to talk or sing. Using an existing audio file, the tool can animate a photo with facial expressions, head movements, and lip movements synced to the audio’s speech or song.

On the project’s webpage, you can find numerous examples that showcase how lifelike these animations can be. Although some lip and head movements might still look a bit mechanical and not perfectly in sync, the overall effect is convincing enough that it could easily be mistaken for real footage.

There is significant potential for misuse, particularly in the creation of deepfake videos, and Microsoft’s researchers are aware of it. Consequently, they have decided against releasing any public demos, APIs, or additional implementation details until they can ensure the tool will be used responsibly and in accordance with stringent regulations. However, they haven’t mentioned specific safeguards to prevent malicious actors from using it for harmful purposes such as creating deepfake pornography or misinformation campaigns.

Despite these concerns, the technology promises several beneficial applications. It could enhance educational equity and improve accessibility for individuals with communication challenges by giving them access to an avatar that can communicate on their behalf. Additionally, this tool could provide companionship and therapeutic support, especially in programmes that offer interactions with AI-powered characters.

VASA-1 was trained using the VoxCeleb2 dataset, which includes over 1 million utterances from 6,112 celebrities extracted from YouTube videos. Interestingly, it works not just on real faces but also on artistic ones. An amusing example, and one worth watching, is the animation of the Mona Lisa synced with an audio clip of Anne Hathaway’s viral rendition of Lil Wayne’s “Paparazzi.”
