Hello, I’m Mayee!
I am a PhD student in Computer Science at Stanford University, advised by Prof. Christopher Ré and part of the Hazy Research Lab.
I’m interested in using a theoretical lens to improve modern machine learning techniques from the data side of things (what is often referred to as data-centric AI). Recently, I’ve been focusing on problems in data selection, data labeling, and data representations, especially in the setting where there are multiple input signals or objectives. Moving forward, I am particularly excited about developing a more principled understanding of how models learn from data.
Previously, I graduated summa cum laude from Princeton University in 2019 with a concentration in Operations Research and Financial Engineering (ORFE) and a certificate in Applications of Computing, where I worked with Prof. Elad Hazan and Prof. Miklos Racz.
Please get in touch with me via email if you would like to chat about research or collaboration!
Publications and Preprints
Skill-it! A data-driven skills framework for understanding and training language models.
Mayee F. Chen, Nicholas Roberts, Kush Bhatia, Jue Wang, Ce Zhang, Frederic Sala, Christopher Ré. Conference on Neural Information Processing Systems (NeurIPS), 2023. Spotlight.
paper | AllenAI talk
Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification.
Neel Guha*, Mayee F. Chen*, Kush Bhatia*, Azalia Mirhoseini, Frederic Sala, Christopher Ré. Conference on Neural Information Processing Systems (NeurIPS), 2023.
paper
A case for reframing automated medical image classification as segmentation.
Sarah Hooper, Mayee F. Chen, Khaled Kamal Saab, Kush Bhatia, Curtis Langlotz, Christopher Ré. Conference on Neural Information Processing Systems (NeurIPS), 2023.
Anomaly Detection with Multiple Reference Datasets
Mayee F. Chen, Benjamin Nachman, Frederic Sala. Journal of High Energy Physics (JHEP), 2023. Machine Learning and the Physical Sciences (ML4PS) Workshop at NeurIPS, 2022.
paper | code
Ask Me Anything: A simple strategy for prompting language models
Simran Arora*, Avanika Narayan*, Mayee F. Chen, Laurel J. Orr, Neel Guha, Kush Bhatia, Ines Chami, Frederic Sala, Christopher Ré. International Conference on Learning Representations (ICLR), 2023. Notable top 25%.
paper | code
Reducing Reliance on Spurious Features in Medical Image Classification with Spatial Specificity.
Khaled Saab, Sarah M. Hooper, Mayee F. Chen, Michael Zhang, Daniel Rubin, Christopher Ré. Machine Learning for Healthcare (MLHC), 2022.
paper | code
Shoring Up the Foundations: Fusing Model Embeddings and Weak Supervision
Mayee F. Chen*, Daniel Y. Fu*, Dyah Adila, Michael Zhang, Frederic Sala, Kayvon Fatahalian, and Christopher Ré. Uncertainty in Artificial Intelligence (UAI), 2022. Best Student Paper Runner-Up Award, Oral Presentation.
paper | code | slides | blog | Snorkel talk
Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning
Mayee F. Chen*, Daniel Y. Fu*, Avanika Narayan, Michael Zhang, Zhao Song, Kayvon Fatahalian, and Christopher Ré. International Conference on Machine Learning (ICML), 2022.
paper | code | blog
TABi: Type-Aware Bi-encoders for End-to-End Entity Retrieval
Megan E. Leszczynski, Daniel Y. Fu, Mayee F. Chen, and Christopher Ré. To Appear in the Findings of the Association for Computational Linguistics (ACL), 2022.
paper | code | blog
The Details Matter: Preventing Class Collapse in Supervised Contrastive Learning
Mayee F. Chen*, Daniel Y. Fu*, Michael Zhang, Kayvon Fatahalian, and Christopher Ré. AAAI Workshop on Artificial Intelligence with Biased or Scarce Data, 2022. Best Paper Award.
paper | code
Mandoline: Model Evaluation under Distribution Shift
Mayee F. Chen*, Karan Goel*, Nimit Sohoni*, Fait Poms, Kayvon Fatahalian, and Christopher Ré. International Conference on Machine Learning (ICML), 2021.
paper | code | slides | MedAI talk
Comparing the Value of Labeled and Unlabeled Data in Method-of-Moments Latent Variable Estimation.
Mayee F. Chen*, Benjamin Cohen-Wang*, Steve Mussmann, Frederic Sala, and Christopher Ré. Artificial Intelligence and Statistics (AISTATS), 2021.
paper | slides
Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods.
Daniel Y. Fu*, Mayee F. Chen*, Frederic Sala, Sarah M. Hooper, Kayvon Fatahalian, and Christopher Ré. International Conference on Machine Learning (ICML), 2020.
paper | code | video | blog
Older
An Adversarial Model of Network Disruption: Maximizing Disagreement and Polarization in Social Networks.
Mayee F. Chen and Miklos Z. Racz. IEEE Transactions on Network Science and Engineering (TNSE), 2021.
paper | code
Effect of Rotational Grazing on Plant and Animal Production.
Mayee F. Chen and Junping Shi. Journal of Mathematical Biosciences and Engineering, vol. 15, no. 2. 2018.
paper | slides
Efficient GCD Computation for Big Integers on Xeon Phi Coprocessor.
Jie Chen, William Watson, and Mayee F. Chen. IEEE Conference on Networking, Architecture, and Storage (NAS). 2014.
paper | slides