Project AI: Do Feature-Additive Explanation Methods Agree?
Report written for the ‘Project AI’ course Michael Neely and I did on eXplainable AI. This course was done during our time in the MSc AI program at the Unive...
I'm a PhD candidate at the Vrije Universiteit Amsterdam (CLTL), supervised by Piek Vossen, Peter Bloem, and Ilia Markov.
My main research interests involve evaluating whether models have strong capabilities such as understanding, reasoning, and intentionality. I enjoy trying to make precise what exactly such capabilities can look like, drawing on philosophy to understand the different ways these terms are used. In practice, I tackle these questions as problems of (mechanistic) interpretability. Currently, my focus is on understanding how LLMs represent whether to treat a sentence as true or false, and how this interacts with (in-context) information through reasoning.
[Mirror of post on Medium.] For the past year or so I have been involved in a project systematically comparing feature-additive eXplainable AI (XAI) methods...
Our paper, which was presented at the ICML Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI.