Monday, 24 February 2025

Microsoft’s AI could soon make your photos talk and sing

Explore how Microsoft's new AI tool VASA-1 can bring your photos to life by creating realistic videos of them talking and singing.

Microsoft Research Asia has unveiled VASA-1, an experimental AI tool that can transform still images or drawings of people into realistic videos in which they appear to talk or sing. Given an existing audio file, the tool animates a photo with facial expressions, head movements, and lip movements synced to the audio’s speech or song.

On the project’s webpage, you can find numerous examples that showcase how lifelike these animations can be. Although some lip and head movements might still look a bit mechanical and not perfectly in sync, the overall effect is convincing enough that it could easily be mistaken for real footage.

There is significant potential for misuse, particularly in the creation of deepfake videos, and Microsoft’s researchers are aware of it. Consequently, they have decided against releasing any public demos, APIs, or additional implementation details until they can ensure the tool will be used responsibly and in accordance with stringent regulations. They have not described specific safeguards to prevent malicious actors from using it for harmful purposes such as deepfake pornography or misinformation campaigns.

Despite these concerns, the technology promises several beneficial applications. It could enhance educational equity and improve accessibility for individuals with communication challenges by giving them access to an avatar that can communicate on their behalf. Additionally, this tool could provide companionship and therapeutic support, especially in programmes that offer interactions with AI-powered characters.

VASA-1 was trained using the VoxCeleb2 dataset, which includes over 1 million utterances from 6,112 celebrities extracted from YouTube videos. Interestingly, it works not just on real faces but also on artistic ones. An amusing example is the animation of the Mona Lisa synced with an audio clip of Anne Hathaway’s viral Lil Wayne-style rap “Paparazzi,” which is worth a watch.
