Microsoft has also demonstrated VASA-1 in several videos, including an animated rendition of the Mona Lisa rapping. Users can adjust attributes such as head movement and gaze direction. In offline mode, VASA-1 produces 512x512-pixel video at 45 frames per second; in online mode, it generates video at up to 40 frames per second. Microsoft has stated that it does not intend to commercialize VASA-1, citing concerns that the technology could be misused to create deepfake content.
Source: Microsoft via Tweakers