I am an AI Researcher at Essential AI, working on the science of language model pre-training.
Previously, I was a Research Scientist at Caltech, where I focused on reinforcement learning for mathematics; before that, I was a physicist studying supersymmetric and topological quantum field theories.
AC-Solver: A Python library for tackling long-horizon, ultra-sparse-reward RL environments, designed to accompany our case study.
Sparse-Dictionary-Learning: An open-source implementation of Anthropic's "Towards Monosemanticity: Decomposing Language Models with Dictionary Learning."
Neural Scaling Laws: An implementation of "Scaling Laws for Neural Language Models," along with results from "An Empirical Model of Large-Batch Training."
Language Model Feature Browser: A visualizer for features learned by a 1-layer language model, with an accompanying GitHub repository.
PhD Thesis: Aspects of Supersymmetric and Topological Quantum Field Theories.
Nonabelian Twists of the D4 Theory. J. Distler, B. Ergun, A. Shehper (equal contribution with B. Ergun; authors listed alphabetically).
Symmetries of 2d TQFTs and Equivariant Verlinde Formulae for General Groups. S. Gukov, D. Pei, C. Reid, A. Shehper (lead contributor; authors listed alphabetically).
Distinguishing 4d N=2 SCFTs. J. Distler, B. Ergun, A. Shehper (equal contribution with B. Ergun; authors listed alphabetically).
Deformations of surface defect moduli spaces. A. Neitzke, A. Shehper (lead contributor; authors listed alphabetically).
Google Scholar / GitHub / Twitter / LinkedIn