📨 wschell@vrain.upv.es · 📜 Google Scholar · 🧑‍💻 GitHub · 🟢 ORCiD · 🗺️ Valencian Research Institute for Artificial Intelligence (VRAIN)

Supervised by: 🧙🏼‍♂️ José Hernández-Orallo and 🧙🏼 Fernando Martínez-Plumed

Interests

My PhD research centres on modelling AI evaluation as a prediction problem.

In general, I am interested in AI evaluation and everything related to it: testing, auditing, metrics, environment & benchmark design, capability measurement, etc.

With regard to more specific applications and domains, I strongly prefer sequential decision problems, RL, planning & control and the like, specifically in relation to world-model learning, goal conditioning and multi-task systems. Due to circumstances, I have mostly worked with large language models, but I am eager to move on.

Other concepts that spark my imagination include causality, embodied cognition, web agents, knowledge representation, grounding, philosophy of cognition, artificial life.

Most things really.

Papers

2023 Animal-AI 3: What's New & Why You Should Care
Konstantinos Voudouris, Ibrahim Alhas, Wout Schellaert, Matthew Crosby, Joel Holmes, John Burden, Niharika Chaubey, Niall Donnelly, Matishalin Patel, Marta Halina, José Hernández-Orallo, Lucy G. Cheke
arXiv [paper]
2023 Rethink Reporting of Evaluation Results in AI
Ryan Burnell, Wout Schellaert, John Burden, Tomer D. Ullman, Fernando Martínez-Plumed, Joshua B. Tenenbaum, Danaja Rutar, Lucy G. Cheke, Jascha Sohl-Dickstein, Melanie Mitchell, Douwe Kiela, Murray Shanahan, Ellen M. Voorhees, Anthony G. Cohn, Joel Z. Leibo, José Hernández-Orallo
Science [paper, preprint]
2023 Your Prompt is My Command: On Assessing the Human-Centred Generality of Multimodal Models
Wout Schellaert, Fernando Martínez-Plumed, Karina Vold, John Burden, Pablo A. M. Casares, Bao Sheng Loe, Roi Reichart, Sean Ó hÉigeartaigh, Anna Korhonen, José Hernández-Orallo
JAIR: AI and Society [paper]
2022 Reject Before You Run: Granular Performance Prediction for Big Language Models with Small External Assessors
Lexin Zhou, Fernando Martínez-Plumed, José Hernández-Orallo, Cèsar Ferri, Wout Schellaert
Workshop on Evaluation Beyond Metrics at IJCAI 2022 [paper, workshop]
2022 Training on the Test Set: Mapping the System-Problem Space in AI
José Hernández-Orallo*, Wout Schellaert*, Fernando Martínez-Plumed* (*equal contribution)
Blue Sky Idea Award 🏆
AAAI 2022 [paper, award]

Other

2023
Co-organising the “Predictable AI” kick-off event in Valencia

A singular event consisting of invited talks, panels and short lightning talks. It discussed “Predictable AI Futures”, dealing with topics such as scaling laws, control, liability and future risks, as well as “Predictable AI Systems”, covering cognitive and robust evaluation, assessors, co-operative conditions, uncertainty estimation, and much more. (site)

Committee: José Hernández-Orallo, Ana Cidad and many others from ValGRAI in Valencia and the LCFI and CSER in Cambridge.

2022
Co-organising the “Evaluation Beyond Metrics” workshop at IJCAI 2022

A workshop with the goal of challenging the widespread approach of evaluating intelligent systems with aggregated metrics over a benchmark or distribution of tasks. (site)

Committee: Wout Schellaert, Joshua Tenenbaum, Lucy Cheke, Tomer Ullman, José Hernández-Orallo, Danaja Rutar, John Burden and Ryan Burnell.