July 13-15: Introduction to AI in Evaluation – IPDET Workshop

The IPDET Workshops in Bern (Switzerland) will take place from July 13 to 17, 2026. Linda Raftree (The MERL Tech Initiative) is co-facilitating a training with Paul Jasper to help evaluators who have so far had limited exposure to emerging AI, including Generative AI (GenAI), to understand potential implications and risks for their area of work. Learn more here.

Date: July 13 – 15, 2026
Level: Introductory
Recommended for: Commissioners, Evaluators, Practitioners

Course Description

This course is motivated by the growing use of AI in evaluations. Since the release of ChatGPT in late 2022, large language models (LLMs) have been changing the way in which we interact with text and human language – including when doing research and carrying out evaluations. As the capabilities of these models grow, the range of potential use cases keeps expanding, and more and more areas of our work are likely to be affected.

The overarching objective of this course is to help evaluators who have so far had limited exposure to emerging AI, including Generative AI (GenAI), to understand potential implications and risks for their area of work.

The course will be structured into three modules:

  1. Introduction to emerging AI in evaluation (4 hours): This module will cover basic questions such as: What is AI and what specifically is Generative AI (GenAI)? How does it work? Who is behind this technology? What potential benefits and practical/ethical risks are there when using emerging AI for evaluation? How and why does GenAI differ from other technologies used for evaluation, including other types of AI like machine learning, big data, and predictive analytics? In practical exercises, participants will be able to experience the use of GenAI for simple text-based tasks, with the objective of understanding its basic strengths and weaknesses.
  2. How can AI be used for evaluation? (8 hours): This module will delve deeper into the potential uses of AI for evaluation. Based on a typical evaluation workflow, we will explore where the different capabilities of emerging AI might support evaluators with specific tasks. We will identify where the benefits, as well as the practical and higher-level ethical risks and challenges of using emerging AI, might be greatest.

Practical exercises will focus on specific evaluation tasks that emerging AI could potentially help with. Participants will identify common tasks of this kind and the different GenAI tools that could support them. Through hands-on exercises we will try out the tools and compare results in order to analyse how they might enable efficiencies or quality improvements, and to reveal challenges such as bias and quality issues. We will discuss ways to mitigate the risks of GenAI use, as well as whether AI should be integrated at all, depending on an assessment of its real benefit in context and its potential for harm.

  3. What to look out for when assessing the use of AI (8 hours): In this final module, we will change our perspective to consider situations where AI is not a tool to be used for evaluation, but rather the object to be evaluated or assessed. We will look at two specific situations: First, what frameworks can be applied to the evaluation of AI-enabled programmes? Second, when commissioning evaluations, how can we assess a proposal that contains AI-enabled methods or approaches? We will introduce frameworks and toolkits that can be used in these situations. Practical exercises will focus on scenarios related to the above situations so that participants can experience what it means to assess the use of AI.

Workshop Objectives:

By the end of the course, participants will be able to:

  • Understand and critically assess the basics of AI: how it works, and what its strengths, weaknesses, and ethical challenges are.
  • Understand where – in an evaluation workflow – AI can be used safely and where risks of misuse and biased results are greatest.
  • Identify AI tools that can help evaluators conduct evaluations.
  • Understand basic frameworks that can be used to evaluate AI tools.
  • Access toolkits that can help assess the proposed use of AI in evaluations.

Recommended for:

  • Evaluators who have not previously been exposed to AI in their evaluation practice – or who have had limited exposure and who want to understand the use of AI better.
  • Evaluators tasked with evaluating AI-enabled programmes.
  • Evaluation commissioners who need to deal with AI in their work.

Level

Beginner level. Participants should be familiar with the basics of evaluation methodology, ideally mixed-methods and theory-based evaluations (though no specific methodological specialization is required). No previous exposure to, or understanding of, AI in general or AI in evaluations is required.

Prerequisites:

Instructors may ask participants to install or register for an AI tool of their choice before the workshop. Instructions will be provided pre-workshop.

Instructors:

Linda Raftree: Linda Raftree founded the MERL Tech Initiative (MTI) in 2014, building on two decades of work at the intersection of community development, gender, youth participatory media, rights-based approaches and digital development. She is a well-known expert on responsible data approaches, AI policy and governance, inclusive digital approaches, safe tech-enabled programme design, and tech-enabled monitoring, evaluation, research and learning (MERL). Linda specializes in organizational and sector strategy in times of political flux and technological change. She runs sector-wide convenings through MTI and the New York City Technology Salon, which she started in 2011. Linda is a Certified Information Privacy Professional (CIPP & CIPM). She also serves on the SAFE AI Pool of Experts and co-teaches AI and MERL at Columbia University’s School of International and Public Affairs (SIPA).

Paul Jasper: As Senior Technical Lead, Paul Jasper leads Oxford Policy Management’s (OPM) Quantitative Methods and Data Science team and is based in OPM’s UK office in London. He is an evaluator with over 15 years of experience, working on evaluations covering health, education, social protection, and a variety of other sectors and policy areas. His methodological focus is on quantitative and mixed-methods approaches, as well as the use of data science and innovative data approaches, including AI, in evaluations.