My PhD research centers on modelling AI evaluation as a prediction problem.
More generally, I am interested in AI evaluation and everything related to it: testing, auditing, metrics, environment & benchmark design, capability measurement, etc.
With regard to more specific applications and domains, I strongly prefer sequential decision problems, RL, planning & control, and the like, particularly in relation to world-model learning, goal conditioning, and multi-task systems. Due to circumstances, I have mostly worked with large language models, but I am eager to move on.
Other concepts that spark my imagination include causality, embodied cognition, web agents, knowledge representation, grounding, philosophy of cognition, and artificial life.
Most things, really.