Vidu, China’s first Sora-level text-to-video, just launched!

  • Added on 27 April 2024
  • Shengshu Technology and Tsinghua University in China have introduced Vidu, a text-to-video model that can produce 16-second, 1080p resolution videos with just one click. The announcement was made at the 2024 Zhongguancun Forum in Beijing, where Vidu was presented as a formidable rival to OpenAI’s Sora.
    Vidu is built on a Universal Vision Transformer (U-ViT) architecture, which, according to the company, enables it to simulate the real world with multi-camera view generation. This architecture was reportedly developed by the Shengshu Technology team in September 2022, predating the diffusion transformer (DiT) architecture used by Sora.
    The company claims that Vidu can generate videos with complex scenes that adhere to real-world physics, including realistic lighting and shadows, and detailed facial expressions. It also says the model can create surreal content that does not exist in reality, with depth and complexity. Vidu’s multi-camera capabilities allow it to generate dynamic shots, transitioning between long shots, close-ups, and medium shots within a single scene. During its demo, the company attempted to recreate scenes similar to those shared by OpenAI during Sora’s release. While Vidu is an impressive achievement and a testament to China’s rapid progress in AI research, a side-by-side comparison with Sora shows that the videos generated by Vidu do not match Sora’s level of realism. The output, while impressive, falls short in visual fidelity.
    However, the temporal consistency achieved by Vidu is commendable, and the technology has room for further refinement over time. We will continue to follow Vidu and its progress as it becomes available.
    Thank you for watching!
    Keywords: #ai #texttovideo #artificialintelligence

Comments • 5

  • @AITalkingHead
    @AITalkingHead  1 month ago +3

    Vidu is the answer to Sora text-to-video and now China has it too! Although Vidu may not be as good as Sora yet, it sure is on the right track to become an alternative option. Will you try Vidu when it becomes available for public use? What use cases does Vidu or Sora have in our society? Please tell us what you think about text-to-video technology and its possible use cases in the comments. Thank you for sharing!

    • @GenerativeMind01
      @GenerativeMind01 1 month ago +1

      Whether it's Sora or Vidu, the big issue is that this new tech is energy-intensive, and that's why it's limited to only 16 seconds. It won't be mainstream anytime soon until they figure out how to reduce the cost of energy.

  • @peteryu515
    @peteryu515 1 month ago +1

    This is expensive tech and it probably takes a large number of AI servers to crank out 16 seconds of video. Now that they've launched it, it's time to enhance it. Hopefully, we all get to try it soon!

  • @GenerativeMind01
    @GenerativeMind01 1 month ago +1

    It may not be perfect but given time it will get there. The most important thing is we have an option now! Congrats!

    • @peteryu515
      @peteryu515 1 month ago +1

      You are absolutely right!