From Dense to Dynamic: NVIDIA’s Innovations in Upcycling LLMs to Sparse MoE
In the new paper Upcycling Large Language Models into Mixture of Experts, an NVIDIA research team introduces a “virtual group” initialization technique that eases the conversion of trained dense models into fine-grained mixture-of-experts (MoE) architectures.
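To make the idea concrete, below is a minimal sketch of the general dense-to-MoE upcycling recipe that work in this area builds on: each expert starts as a copy of the trained dense feed-forward block, and a freshly initialized router learns to dispatch tokens, so the upcycled model initially reproduces the dense model’s output before the experts diversify during continued training. This is an illustrative simplification, not NVIDIA’s implementation; the paper’s virtual group initialization extends this idea to fine-grained experts, and the class names, function names, and hyperparameters here (DenseFFN, UpcycledMoE, num_experts, top_k) are hypothetical.

```python
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


class DenseFFN(nn.Module):
    """Standard transformer feed-forward block (the module being upcycled)."""

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.fc1 = nn.Linear(d_model, d_ff)
        self.fc2 = nn.Linear(d_ff, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc2(F.gelu(self.fc1(x)))


class UpcycledMoE(nn.Module):
    """MoE layer whose experts are initialized as copies of a dense FFN.

    The router is freshly (randomly) initialized; because the top-k routing
    weights are softmax-normalized and all experts start identical, the MoE
    output equals the dense output at initialization.
    """

    def __init__(self, dense: DenseFFN, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        d_model = dense.fc1.in_features
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts, bias=False)
        # Upcycling initialization: every expert is a clone of the dense FFN.
        self.experts = nn.ModuleList(
            copy.deepcopy(dense) for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model)
        logits = self.router(x)                        # (tokens, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # pick top-k experts per token
        weights = F.softmax(weights, dim=-1)            # normalize over chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out


# Sanity check: at initialization the upcycled MoE matches the dense model.
dense = DenseFFN(d_model=512, d_ff=2048)
moe = UpcycledMoE(dense, num_experts=8, top_k=2)
x = torch.randn(4, 512)
assert torch.allclose(moe(x), dense(x), atol=1e-5)
```

The sanity check at the end highlights the property a careful upcycling initialization aims to preserve: the converted model starts training from exactly the dense model’s behavior rather than from scratch.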