NVIDIA’s AI Models at ICLR 2025 Are Redefining the Future of Technology
SINGAPORE, 25 April 2025 - NVIDIA has unveiled a suite of groundbreaking multimodal generative AI models at the International Conference on Learning Representations (ICLR) 2025, signaling a significant advancement in AI capabilities across various industries.
Among the notable innovations is Fugatto, an audio generative AI model capable of generating or transforming complex audio compositions from prompts that combine text and audio files.
This model enables the creation of intricate soundscapes, including music and voice, from minimal input.
In the realm of robotics, the HAMSTER model introduces a hierarchical design for vision-language-action models, enhancing robots’ ability to transfer knowledge from off-domain fine-tuning data to real-world tasks. This approach reduces the reliance on expensive, hardware-specific training data.
For video understanding, LongVILA presents a training pipeline that efficiently handles long video sequences, achieving state-of-the-art performance across multiple benchmarks.
This advancement addresses the computational challenges associated with training AI models on extended video content.
Additionally, NVIDIA’s Hymba model employs a hybrid architecture combining transformer and state space models, resulting in improved throughput and reduced memory usage without compromising performance.
“ICLR is one of the world’s most impactful AI conferences, where researchers introduce important technical innovations that move every industry forward,” said Bryan Catanzaro, Vice President of Applied Deep Learning Research at NVIDIA.
These developments underscore NVIDIA’s commitment to advancing AI research and its practical applications across diverse sectors.