ByteDance has developed an AI system that can generate realistic full-body videos from a single photograph. The new technology, called OmniHuman, can create videos of people speaking, singing, and moving naturally with matching gestures. As reported by Michael Nuñez, the system was trained on more than 18,700 hours of human video data. OmniHuman combines multiple input types, including text, audio, and body movements, which lets it learn from larger and more diverse datasets than earlier methods. The ByteDance researchers explained their breakthrough in a paper published on arXiv, noting that their “omni-conditions” training strategy significantly reduces wasted training data. While this technology could transform digital entertainment and communications, experts also warn about its potential misuse in creating deceptive synthetic media.
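To make the "multiple input types" idea concrete, here is a minimal, hypothetical sketch of mixed-condition training: a toy denoising model that accepts optional text, audio, and pose embeddings, so a clip missing one annotation can still contribute a training signal (which is one way a strategy like "omni-conditions" could reduce wasted data). All names, shapes, and the architecture below are illustrative assumptions, not ByteDance's actual OmniHuman implementation.

```python
# Illustrative sketch only: toy mixed-condition training where some condition
# signals (text, audio, pose) may be absent for a given clip. Hypothetical code,
# not the OmniHuman implementation described in the paper.
import torch
import torch.nn as nn

class ToyConditionedDenoiser(nn.Module):
    """Toy denoiser that accepts any subset of text/audio/pose embeddings."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.video_proj = nn.Linear(dim, dim)
        self.cond_proj = nn.ModuleDict({
            "text": nn.Linear(dim, dim),
            "audio": nn.Linear(dim, dim),
            "pose": nn.Linear(dim, dim),
        })
        self.out = nn.Linear(dim, dim)

    def forward(self, noisy_video, conditions):
        # Sum whichever condition embeddings are present; missing ones add nothing.
        h = self.video_proj(noisy_video)
        for name, emb in conditions.items():
            if emb is not None:
                h = h + self.cond_proj[name](emb)
        return self.out(h)

def training_step(model, clip, conditions, optimizer):
    """One toy denoising step: clips with partial annotations are still usable."""
    noise = torch.randn_like(clip)
    pred = model(clip + noise, conditions)
    loss = nn.functional.mse_loss(pred, noise)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    dim = 64
    model = ToyConditionedDenoiser(dim)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    clip = torch.randn(2, dim)  # stand-in for video latents
    # A clip with text and audio but no pose annotation still yields a gradient.
    conds = {"text": torch.randn(2, dim), "audio": torch.randn(2, dim), "pose": None}
    print("loss:", training_step(model, clip, conds, opt))
```

The design point of the sketch is simply that conditioning is additive and optional, so heterogeneous clips do not have to be discarded; the actual method and architecture are detailed in the researchers' arXiv paper.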