Hello, I’m Mayee!

I am a PhD student in Computer Science at Stanford University, advised by Prof. Christopher Ré and part of the Hazy Research Lab.

I’m interested in studying and improving the fundamentals of modern machine learning through data (often known as data-centric AI). On the model training side, I work on data mixing, synthetic data, data representations, and data labeling. On the inference side, I work on test-time algorithms that produce higher-quality model generations, such as ensembling and routing. Currently, I am thinking about how to develop and operationalize a more principled understanding of how models learn from data: what skills does data teach the model, and does it matter whether the data is synthetic or real?

Previously, I graduated summa cum laude from Princeton University with a concentration in Operations Research and Financial Engineering (ORFE) and a certificate in Applications of Computing. There, I worked with Prof. Elad Hazan and Prof. Miklos Racz.

Please get in touch with me via email if you would like to chat about research or collaboration!

Publications and Preprints


For a chronological list of my publications, please see my Google Scholar profile or CV.

Training Data

Test-time Improvements

Data Labeling

Data Representations

Science/Health Applications

Model Evaluation

Older