Blog Posts

Revisiting Neural Network Parameterizations for Optimal Performance

May 24, 2026

A modified standard parameterization admits hyperparameter transfer and outperforms μP across both width and depth.

A Features' Perspective on Neural Scaling Laws

January 15, 2025

Notes from a talk I recently gave on feature superposition (a microscopic phenomenon) and neural scaling laws (a macroscopic phenomenon).