NVIDIA releases SANA-WM, a 2.6B open-source AI that turns text into 1-minute 720p videos.
- NVIDIA built SANA-WM with 2.6 billion parameters
- Generates one-minute 720p videos from text prompts
- Runs on a single RTX 4090 GPU
NVIDIA just open-sourced SANA-WM, a breakthrough AI model that turns text prompts into one-minute 720p videos. The model, built with 2.6 billion parameters, runs locally on a single NVIDIA RTX 4090 graphics card and doesn’t need cloud servers. Researchers and developers can download it for free from the NVIDIA SANA GitHub page. This isn’t the first text-to-video model, but it’s one of the few that runs without data-center hardware or an internet connection. Most alternatives still rely on supercomputers or paid APIs, making SANA-WM a rare affordable option.
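As a rough back-of-the-envelope check on the "runs on a single RTX 4090" claim (this calculation is not from the article, and fp16 weight storage is an assumption), the weights of a 2.6-billion-parameter model are actually quite compact:

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate model weight memory in GB (fp16 = 2 bytes per parameter)."""
    return num_params * bytes_per_param / 1024**3

# 2.6 billion parameters in fp16:
print(round(weight_memory_gb(2.6e9), 2))  # -> 4.84
```

Under that assumption, the weights alone occupy under 5 GB, leaving most of a 4090's 24 GB of VRAM for activations and working memory, which is consistent with the single-card claim.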
SANA-WM comes from NVIDIA’s research team, the same group behind the Sana image generator. The model builds on that previous work but adds motion generation, turning static images into short video clips with only a few seconds of processing per clip. Unlike older video models that take minutes to render even low-res footage, SANA-WM keeps clips at 720p resolution and stays smooth at 24-30 frames per second. For context, that’s roughly the quality of a YouTube video uploaded a decade ago.
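For a sense of scale, the clip length and frame rates above imply a large amount of raw pixel data per generated clip. A quick sketch (uncompressed RGB at 3 bytes per pixel is an illustrative assumption; the article says nothing about encoding):

```python
def clip_stats(seconds: int = 60, fps: int = 24,
               width: int = 1280, height: int = 720):
    """Frame count and uncompressed RGB size (GB) for one clip."""
    frames = seconds * fps
    raw_gb = frames * width * height * 3 / 1024**3  # 3 bytes per RGB pixel
    return frames, round(raw_gb, 2)

print(clip_stats())        # 24 fps -> (1440, 3.71)
print(clip_stats(fps=30))  # 30 fps -> (1800, 4.63)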
The team tested SANA-WM on a variety of prompts, from simple scenes like “a cat sitting on a windowsill” to more complex ones like “a futuristic city at sunset with flying cars.” In each case, the model produced a one-minute clip that matched the prompt reasonably well. The videos aren’t Pixar quality, but they’re good enough for social media, presentations, or quick prototype testing. The model also handles motion better than most open-source alternatives. Where other models stutter or blur objects in motion, SANA-WM stays relatively stable, though fine details like hair and fabric still look a bit plastic.
NVIDIA didn’t just release the model; they included a full training dataset and code to let others tweak it. That means if you’ve got the skills, you can fine-tune SANA-WM for your own projects. The team even published benchmarks showing it outperforms similar open-source models like Phenaki and Make-A-Video in some areas, especially when it comes to motion coherence. The catch? You’ll need a beefy GPU. While the model runs on a single RTX 4090, older cards or lower-end GPUs will struggle or fail entirely.
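One way to read the “beefy GPU” caveat is that a card must hold the weights plus working memory for inference. A crude sketch of that check (the 8 GB headroom figure is an illustrative assumption, not a published requirement for SANA-WM):

```python
def fits_on_gpu(vram_gb: float, num_params: float = 2.6e9,
                bytes_per_param: int = 2, headroom_gb: float = 8.0) -> bool:
    """Crude check: fp16 weights plus working-memory headroom vs. VRAM."""
    weights_gb = num_params * bytes_per_param / 1024**3
    return vram_gb >= weights_gb + headroom_gb

print(fits_on_gpu(24.0))  # RTX 4090, 24 GB -> True
print(fits_on_gpu(8.0))   # typical 8 GB card -> False
```

Under these assumed numbers, 24 GB cards clear the bar comfortably while 8 GB cards fall short, matching the article's claim that lower-end GPUs will struggle or fail.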
This release fits into NVIDIA’s push to make AI tools more accessible. The company has been open-sourcing models left and right lately, from 3D scene generators to real-time voice changers. SANA-WM continues that trend, but with a twist: it’s one of the few models that doesn’t require you to pay per generation or rent cloud time. For indie developers, researchers, or even hobbyists, that’s a big deal. It lowers the barrier to experimenting with video generation, which could lead to some interesting indie games, YouTube content, or even training datasets for other AI models.
The big question now is how this affects the video generation market. Right now, most high-quality video AI still lives in closed ecosystems like Runway or Pika Labs. Those services charge per clip and often limit quality to attract users. SANA-WM doesn’t have those limits—but it also doesn’t have their polish. For now, it’s a tool for people who value control and cost over perfection. If NVIDIA keeps refining it, though, it could start to rival those paid services in quality too.
What You Need to Know
- Source: Hacker News
- Published: May 16, 2026 at 12:06 UTC
- Category: Technology
- Topics: #hackernews · #programming · #tech
Read the Full Story
This is a curated summary. For the complete article, original data, quotes, and full analysis, see the original story at Hacker News.
All reporting rights belong to the respective author(s) at Hacker News. GlobalBR News summarizes publicly available content to help readers discover the most relevant global news.
Curated by GlobalBR News · May 16, 2026
🇧🇷 Summary (translated from Portuguese)
NVIDIA has just launched a revolution in the world of artificial intelligence: SANA-WM, the world’s first open-source video world model, capable of generating realistic videos of up to one minute at 720p from simple text prompts. With 2.6 billion parameters, the tool democratizes visual content creation, previously restricted to studios and large companies, letting anyone with an ordinary computer produce high-quality videos at no cost.
The launch comes at a crucial moment for Brazil and the Portuguese-language market, where demand for digital content is growing exponentially, especially with the rise of digital marketing, social media, and online education. Until now, similar solutions were out of reach for many content creators because of high costs and technical complexity. With SANA-WM freely available, Brazilian developers, artists, and small businesses now have a powerful tool to innovate in explainer videos, tutorials, advertising campaigns, and even artistic productions, boosting local creativity.
The next frontier is adapting the model for Portuguese, ensuring the technology is truly accessible to speakers of the language, and NVIDIA has already signaled that it plans to explore that direction soon.
🇪🇸 Summary (translated from Spanish)
NVIDIA has taken a revolutionary step in artificial intelligence with the release of SANA-WM, an open-source video world model that promises to transform audiovisual content generation. The tool, capable of producing one-minute clips at 720p resolution from simple text prompts, arrives at a time when demand for digital content is growing exponentially, especially in the Spanish-speaking world, where access to advanced technologies still faces economic and technical barriers.
SANA-WM’s relevance lies not only in its technical capability but also in its open, free distribution model, which democratizes access to high-quality video creation. For Spanish-speaking users, this means an unprecedented opportunity to develop creative, educational, or commercial projects without depending on expensive software or proprietary platforms. And because it is open source, it fosters collaborative innovation in a region where local talent can be constrained by limited economic resources, opening doors to new industries and forms of cultural expression.