
Geometric Breakthroughs for Neural Network Stability

This post presents original research on taming numerical instability and improving learning in large neural networks by constraining weight matrices to submanifolds. It details manifold-based optimization and introduces Manifold Muon, a new optimizer that outperformed AdamW in initial small-scale experiments. The framework extends to "Modular Manifolds," enabling principled, layer-wise learning-rate budgeting and promising more robust, automated training.

2025-09-26 · Source: thinkingmachines.ai/blog/

Modular Manifolds

Training large neural networks requires keeping tensors well-behaved, both to prevent numerical instability and to improve learning. This post proposes constraining weight matrices to submanifolds so that optimization algorithms can be co-designed with the constraint. It covers manifold optimization, in particular restricting weight matrices to the Stiefel manifold (matrices with orthonormal columns), and presents the Manifold Muon optimizer, which outperformed AdamW in small-scale experiments. The idea extends to "Modular Manifolds," an abstraction that budgets learning rates across layers in a principled way based on each layer's Lipschitz sensitivity. The framework promises more robust, more automatic neural network training and opens several research directions.
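
To make the Stiefel constraint concrete, here is a minimal sketch of one constrained descent step: project the Euclidean gradient onto the tangent space at the current point, take a step, then retract back onto the manifold via a polar decomposition. This is a generic textbook Riemannian update written in NumPy, not the Manifold Muon update from the post; all function names are illustrative.

```python
# Sketch: one Riemannian gradient step on the Stiefel manifold
# (matrices W with W^T W = I). Standard tangent projection and
# polar retraction; NOT the Manifold Muon update from the post.
import numpy as np

def stiefel_tangent_project(W, G):
    """Project a Euclidean gradient G onto the tangent space at W.
    Tangent space of {W : W^T W = I}: subtract W times the
    symmetric part of W^T G."""
    sym = (W.T @ G + G.T @ W) / 2
    return G - W @ sym

def polar_retract(X):
    """Map an arbitrary matrix back onto the Stiefel manifold via the
    polar decomposition (the closest orthonormal-column factor)."""
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ Vt

def stiefel_step(W, G, lr):
    """One constrained descent step: move along the tangent, then retract."""
    T = stiefel_tangent_project(W, G)
    return polar_retract(W - lr * T)

# Usage: keep a 256x64 weight matrix orthonormal through an update.
rng = np.random.default_rng(0)
W = polar_retract(rng.standard_normal((256, 64)))
G = rng.standard_normal((256, 64))   # stand-in for a loss gradient
W = stiefel_step(W, G, lr=0.1)
assert np.allclose(W.T @ W, np.eye(64), atol=1e-8)
```

The polar retraction is one of several standard choices (QR-based retractions are another); it returns the orthonormal matrix nearest to the post-step iterate, so the constraint holds exactly after every update.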
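The layer-wise budgeting idea can be illustrated with a toy allocation rule. Suppose each layer has a known Lipschitz sensitivity s (roughly, how much the network output can move per unit change in that layer's weights); then a single global per-step movement budget can be split across layers instead of tuning every learning rate by hand. The equal-split rule below is an assumption made for illustration, not the post's actual modular-norm formula, and the sensitivity values are made up.

```python
# Sketch: split one global per-step movement budget across layers.
# Assumption: worst-case output movement from layer k is roughly
# s_k * lr_k, so giving each layer an equal share of the budget
# means lr_k = budget / (n * s_k). Illustrative only.
def budget_learning_rates(sensitivities, global_budget):
    """Allocate per-layer learning rates so that the summed
    worst-case movement (s_k * lr_k) equals the global budget."""
    n = len(sensitivities)
    return {name: global_budget / (n * s)
            for name, s in sensitivities.items()}

# Usage: more sensitive layers automatically get smaller rates.
lrs = budget_learning_rates({"embed": 1.0, "attn": 4.0, "mlp": 2.0},
                            global_budget=0.3)
print(lrs)  # {'embed': 0.1, 'attn': 0.025, 'mlp': 0.05}
```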