Ali Shehper




I am a physicist currently researching the mechanistic interpretability of AI systems and the use of AI systems to solve math problems. I am currently employed at Rutgers University.

My current research interests include:

My research is inspired by the scaling of AI models and their capabilities in recent years. See my implementation of the first scaling laws paper, which indicates that the optimal scaling of language model size is independent of the choice of training dataset (with the hyperparameters and the tokenizer held fixed). It is known from the Chinchilla scaling laws that this scaling law depends on various hyperparameters.

During my PhD at UT Austin, I studied theoretical aspects of quantum field theories and used them to discover new results in math. My PhD thesis is available here, and a list of my research papers is available here.

Please feel free to send a message if you would like to chat.

Google Scholar / GitHub / Twitter / LinkedIn