My research focuses on understanding, controlling, and enhancing AI models. I develop interpretability tools that automatically discover and explain the internal operations of machine-learning models, and I use the resulting insights to control model behavior, enhance performance, and prevent undesired outcomes.
Office: 45-733D
Office hours: I'm hosting pro bono office hours; here are more details.
[April 2024] We introduce MAIA, a Multimodal Automated Interpretability Agent that solves interpretability tasks by iteratively designing experiments on other AI systems.
[Feb 2024] Why does fine-tuning LLMs work so well? Our ICLR'24 paper reveals it's not about introducing new mechanisms but enhancing the existing ones!
[Jan 2020] I gave a talk about SinGAN at the Israeli Computer Vision Day.
[Nov 2019] SinGAN won the ICCV’19 Best Paper Award (Marr Prize)!
[Aug 2019] I participated in the Google Student Retreat in London for Women Techmakers Scholars (now called the Generation Google Scholarship) and met an amazing group of women from all over Europe, the Middle East, and Africa.
Inspired by Krishna Murthy and Wei-Chiu Ma, I dedicate 1-2 hours each week to providing guidance and mentorship to students from underrepresented groups, or to anyone who needs it. Specifically, if you're searching for a postdoc position, I'd be happy to share insights from my experience.
Please fill out this form to contact me.