Blog Posts
Revisiting Neural Network Parameterizations for Optimal Performance
May 24, 2026
A modified standard parameterization admits hyperparameter transfer and outperforms μP — over both width and depth.
A Features' Perspective on Neural Scaling Laws
January 15, 2025
Notes from a talk I recently gave on feature superposition (a microscopic phenomenon) and neural scaling laws (a macroscopic phenomenon).