I am an AI Researcher at Essential AI, working on the science of language model pre-training.
Previously, I was a Research Scientist at Caltech, where I worked on reinforcement learning for mathematics. Before that, I was a physicist, working on supersymmetric and topological quantum field theories.
AC-Solver:
A Python library for tackling long-horizon, ultra-sparse-reward RL environments, designed to accompany our case study.
Sparse-Dictionary-Learning:
An open-source implementation of Anthropic’s Towards Monosemanticity: Decomposing Language Models with Dictionary Learning.
Neural Scaling Laws:
An implementation of Scaling Laws for Neural Language Models, along with results from An Empirical Model of Large-Batch Training.
Language Model Feature Browser:
A visualizer for features learned by a 1-layer language model; code is available in the GitHub repository.
PhD Thesis: Aspects of Supersymmetric and Topological Quantum Field Theories.
Nonabelian Twists of the D4 Theory:
J. Distler, B. Ergun, A. Shehper (co-primary contributor, names in alphabetical order).
Symmetries of 2d TQFTs and Equivariant Verlinde Formulae for General Groups:
S. Gukov, D. Pei, C. Reid, A. Shehper (primary contributor, names in alphabetical order).
Distinguishing 4d N=2 SCFTs:
J. Distler, B. Ergun, A. Shehper (co-primary contributor, names in alphabetical order).
Deformations of surface defect moduli spaces:
A. Neitzke, A. Shehper (primary contributor, names in alphabetical order).
Google Scholar / GitHub / Twitter / LinkedIn