
We Wrote the Book on Evaluation Failures. Literally.

by Isaac D. Castillo, Director of Outcomes, Assessment, and Learning at Venture Philanthropy Partners.

Evaluators don’t make mistakes.

Or do they?

Well, actually, they do. In fact, I’ve got a number of fantastic failures under my belt that turned into important learning opportunities. So, when I was asked to share my experience at the MERL Tech DC 2018 session on failure, I jumped at the chance.

Part of the Problem

As someone of Mexican descent, I am keenly aware of the problems that can arise when culturally and linguistically inappropriate evaluation practices are used. However, as a young evaluator, I was often part of the problem.

Early in my evaluation career, I was tasked with collecting data to determine why teenage youth became involved in gangs. In addition to developing the interview guides, I was also responsible for leading all of the on-site interviews in cities with large Latinx populations. Since I am Latinx, I had a sufficient grasp of Spanish to prepare the interview guides and conduct the interviews. I felt confident that I would be sensitive to all of the cultural and linguistic challenges to ensure an effective data collection process. Unfortunately, I had forgotten an important tenet of effective culturally competent evaluation: cultures and languages are not monolithic. Differences in regional cultures or dialects can lead even experienced evaluators into embarrassment, scorn, or the worst outcome of all: inaccurate data.

Sentate, Por Favor

For example, when first interacting with the gang members, I introduced myself and, to start the interview, asked them to please sit down: “Siéntate, por favor.” What I did not know at the time was that a large portion of the gang members I was interviewing were born in El Salvador or were of Salvadoran descent, and the accurate way to say it in Salvadoran Spanish would have been, “Sentate, por favor.”

Does one word make that much difference? In most cases it did not matter, but it caused several gang members to openly question my Spanish from the outset, which created an uncomfortable beginning to interviews about potentially sensitive subjects.

Amigo or Chero?

I next asked the gang members to think of their “friends.” In most dialects of Spanish, using amigos to ask about friends is accurate and proper. However, some gang members prefer the street-slang term chero, especially in informal contexts.

Again, was this a huge mistake? No. But it led to enough quizzical looks and requests for clarification that I started to doubt whether I was getting completely honest or accurate answers from some of the respondents. Unfortunately, this error did not come to light until I had conducted nearly 30 interviews. I had not thought to test the wording of the questions in multiple Spanish-speaking communities across several states.

Would You Like a Concha?

Perhaps my most memorable mistake during this evaluation occurred after I had completed an interview with a gang leader outside of a bakery. After we were done, the gang leader called over the rest of his gang to meet me. As I was meeting everyone, I glanced inside the bakery and noticed a type of Mexican pastry that I enjoyed as a child. I asked the gang leader if he would like to go inside and join me for a concha, a round pastry that looks like a shell. Everyone (except me) began to laugh hysterically. The gang leader then let me in on the joke. He understood that I was asking about the pan dulce (sweet bread), but he informed me that in his dialect, concha was used as a vulgar reference to female genitalia. This taught me a valuable lesson about how even casual references or language choices can be interpreted in many different ways.

What did I learn from this?

While I can look back on these mistakes and laugh, I am also reminded of the important lessons learned that I carry with me to this day.

  • Translate with the local context in mind. When translating materials or preparing for field work, get a detailed sense of who you will be collecting data from, including which cultures and subgroups people represent and whether there are specific topics or words that should be avoided.
  • Pre-test with the local population in mind. When developing data collection tools (in any language, even one you are fluent in), take the time to pre-test the wording of the tools with people from the communities you will actually be interviewing.
  • Be okay with your inevitable mistakes. Recognize that no matter how much preparation you do, you will make mistakes in your data collection related to culture and language. Remember that it is how you respond in those situations that matters most.

As far as failures like this go, it turns out I’m in good company. My story is one of 22 candid, real-life examples from seasoned evaluators that are included in Kylie Hutchinson’s new book, Evaluation Failures: 22 Tales of Mistakes Made and Lessons Learned. Entertaining and informative, I guarantee it will give you plenty of opportunities to reflect and learn.

Big data or big hype: a MERL Tech debate

by Shawna Hoffman, Specialist, Measurement, Evaluation and Organizational Performance at the Rockefeller Foundation.

Both the volume of data available at our fingertips and the speed with which it can be accessed and processed have increased exponentially over the past decade. The potential applications of this to support monitoring and evaluation (M&E) of complex development programs have generated great excitement. But is all the enthusiasm warranted? Will big data integrate with evaluation — or is this all just hype?

A recent debate that I chaired at MERL Tech London explored these very questions. Alongside two skilled debaters (who also happen to be seasoned evaluators!) – Michael Bamberger and Rick Davies – we sought to unpack whether integration of big data and evaluation is beneficial – or even possible.

Before we began, we used Mentimeter to see where the audience stood on the topic.

Once the votes were in, we started.

Both Michael and Rick have fairly balanced and pragmatic viewpoints; however, for the sake of a good debate, and to help unearth the nuances and complexity surrounding the topic, they embraced the challenge of representing divergent and polarized perspectives – with Michael arguing in favor of integration, and Rick arguing against.

“Evaluation is in a state of crisis,” Michael argued, “but help is on the way.” Arguments in favor of the integration of big data and evaluation centered on a few key ideas:

  • There are strong use cases for integration. Data science tools and techniques can complement conventional evaluation methodology, providing cheap, quick, complexity-sensitive, longitudinal, and easily analyzable data.
  • Integration is possible. Incentives for cross-collaboration are strong, and barriers to working together are coming down. Traditionally these fields have been siloed, and their relationship has been characterized by a mutual lack of understanding (or even questioning of the other’s motivations or professional rigor). However, data scientists are increasingly recognizing the benefits of mixed methods, and evaluators are seeing the potential of big data to increase the number and types of evaluations that can be conducted within real-world budget, time and data constraints. There are some compelling examples (explored in this UN Global Pulse Report) of where integration has been successful.
  • Integration is the right thing to do.  New approaches that leverage the strengths of data science and evaluation are potentially powerful instruments for giving voice to vulnerable groups and promoting participatory development and social justice.   Without big data, evaluation could miss opportunities to reach the most rural and remote people.  Without evaluation (which emphasizes transparency of arguments and evidence), big data algorithms can be opaque “black boxes.”

While this may paint a hopeful picture, Rick cautioned the audience to temper its enthusiasm. He warned of the risk of domination of evaluation by data science discourse, and surfaced some significant practical, technical, and ethical considerations that would make integration challenging.

First, big data are often non-representative, and the algorithms underpinning them are non-transparent. Second, “the mechanistic approaches offered by data science are antithetical to the very notion of evaluation being about people’s values and necessarily involving their participation and consent,” he argued. It is – and will always be – critical to pay attention to the human element that evaluation brings to bear. Finally, big data are helpful for pattern recognition, but the ability to identify a pattern should not be confused with true explanation or understanding (correlation ≠ causation). Overall, there are many problems that integration would not solve, and some that it could create or exacerbate.

The debate confirmed that this question is complex, nuanced, and multi-faceted. It reminded us that there is cause for enthusiasm and optimism, as well as for a healthy dose of skepticism. What was made very clear is that the future should leverage the respective strengths of these two fields in order to maximize good and minimize potential risks.

In the end, the side in favor of integration of big data and evaluation won the debate by a considerable margin.

The future of integration looks promising, but it’ll be interesting to see how this conversation unfolds as the number of examples of integration continues to grow.

Interested in learning more and exploring this further? Stay tuned for a follow-up post from Michael and Rick. You can also attend MERL Tech DC in September 2018 if you’d like to join in the discussions in person!

The future of development evaluation in the age of big data

By Michael Bamberger, Independent Evaluation Consultant. Michael has been involved in development evaluation for 50 years and recently wrote the report “Integrating Big Data into the Monitoring and Evaluation of Development Programs” for UN Global Pulse.

We are living in an increasingly quantified world.

There are multiple sources of data that can be generated and analyzed in real time. They can be synthesized to capture complex interactions among data streams and to identify previously unsuspected linkages among seemingly unrelated factors [such as the purchase of diapers and increased sales of beer]. We can now quantify and monitor ourselves, our houses (even the contents of our refrigerator!), our communities, our cities, our purchases and preferences, our ecosystem, and multiple dimensions of the state of the world.

These rich sources of data are becoming increasingly accessible to individuals, researchers and businesses through huge numbers of mobile phone and tablet apps and user-friendly data analysis programs.

The influence of digital technology on international development is growing.

Many of these apps and other big data/data analytics tools are now being adopted by international development agencies. Due to their relatively low cost, ease of application, and accessibility in remote rural areas, these approaches are proving particularly attractive to non-profit organizations, and the majority of NGOs probably now use some kind of mobile phone app.

Apps are widely used for early warning systems, emergency relief, dissemination of information (to farmers, mothers, fishermen and other groups with limited access to markets), identifying and collecting feedback from marginal and vulnerable groups, and permitting rapid analysis of poverty. Data analytics are also used to create integrated databases that synthesize all of the information on topics as diverse as national water resources, human trafficking, updates on conflict zones, climate change and many other development topics.

Table 1: Widely used big data/data analytics applications in international development. Each application below is listed with the big data/data analytics tools typically used for it.

Early warning systems for natural and man-made disasters
  • Analysis of Twitter, Facebook and other social media
  • Analysis of radio call-in programs
  • Satellite images and remote sensors
  • Electronic transaction records [ATM, on-line purchases]
Emergency relief
  • GPS mapping and tracking
  • Crowd-sourcing
  • Satellite images
Dissemination of information to small farmers, mothers, fishermen and other traders
  • Mobile phones
  • Internet
Feedback from marginal and vulnerable groups and on sensitive topics
  • Crowd-sourcing
  • Secure hand-held devices [e.g. UNICEF’s “U-Report” device]
Rapid analysis of poverty and identification of low-income groups
  • Analysis of phone records
  • Social media analysis
  • Satellite images [e.g. using thatched roofs as a proxy indicator of low-income households]
  • Electronic transaction records
Creation of an integrated database synthesizing the multiple sources of data on a development topic
  • National water resources
  • Human trafficking
  • Agricultural conditions in a particular region


Evaluation is lagging behind.

Surprisingly, program evaluation is the area that is lagging behind in the adoption of big data/analytics. The few available studies report that a high proportion of evaluators are not very familiar with big data/analytics, and significantly fewer report having used big data in their professional evaluation work. Furthermore, while many international development agencies have created data development centers within the past few years, many of these centers are staffed by data scientists, often with limited familiarity with conventional evaluation methods, and institutional links to agency evaluation offices are weak.

A recent study on the current status of the integration of big data into the monitoring and evaluation of development programs identified a number of reasons for the slow adoption of big data/analytics by evaluation offices:

  • Weak institutional links between data development centers and evaluation offices
  • Differences of methodology and the approach to data generation and analysis
  • Issues concerning data quality
  • Concerns by evaluators about the commercial, political and ethical nature of how big data is generated, controlled and used.

(Linda Raftree talks about a number of other reasons why parts of the development sector may be slow to adopt big data.)

Key questions for the future of evaluation in international development…

The above gives rise to two sets of questions concerning the future role of evaluation in international development:

  • The future direction of development evaluation. Given the rapid expansion of big data in international development, it is likely there will be a move towards integrated program information systems that generate, analyze and synthesize data for program selection, design, management, monitoring, evaluation and dissemination. A possible scenario is that program evaluation will no longer be considered a specialized function that is the responsibility of a separate evaluation office; rather, it will become one of the outputs generated from the program database. If this happens, evaluation may be designed and implemented not by evaluation specialists using conventional evaluation methods (experimental and quasi-experimental designs, theory-based evaluation) but by data analysts using methods such as predictive analytics and machine learning.
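To make this scenario concrete, here is a minimal, purely hypothetical sketch (in Python, using pandas and scikit-learn) of what “evaluation as an output of the program database” could look like: a predictive model trained on routine monitoring records rather than a purpose-built evaluation design. All field names and values are invented for illustration.

```python
# Hypothetical illustration of the "predictive analytics" scenario;
# field names and values are invented, not drawn from any real program database.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Pretend these rows are pulled straight from a program's management
# information system (routine monitoring data).
records = pd.DataFrame({
    "sessions_attended": [12, 3, 8, 15, 1, 9, 14, 5, 11, 2],
    "baseline_score":    [40, 55, 62, 35, 70, 50, 45, 60, 48, 66],
    "received_stipend":  [1, 0, 1, 1, 0, 0, 1, 0, 1, 0],
    "outcome_achieved":  [1, 0, 1, 1, 0, 1, 1, 0, 1, 0],
})

X = records.drop(columns="outcome_achieved")
y = records["outcome_achieved"]

# A data analyst might report cross-validated predictive accuracy and
# feature importances rather than a counterfactual estimate of impact.
model = RandomForestClassifier(n_estimators=100, random_state=0)
print("Cross-validated accuracy:", cross_val_score(model, X, y, cv=2).mean())

model.fit(X, y)
for name, importance in zip(X.columns, model.feature_importances_):
    print(f"{name}: {importance:.2f}")
```

Whether such predictive outputs could stand in for the counterfactual reasoning of conventional evaluation designs is, of course, exactly the question at stake.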

Key Question: Is this scenario credible? If so how widespread will it become and over what time horizon? Is it likely that evaluation will become one of the outputs of an integrated management information system? And if so is it likely that many of the evaluation functions will be taken over by big data analysts?

  • The changing role of development evaluators and the evaluation office. We argued that currently many, perhaps most, development evaluators are not very familiar with big data/analytics, and even fewer apply these approaches. There are both professional reasons (how evaluators and data scientists are trained) and organizational reasons (the limited formal links between evaluation offices and data centers in many organizations) that explain the limited adoption of big data approaches by evaluators. So, assuming the above scenario proves to be at least partially true, what will be required for evaluators to become sufficiently conversant with these new approaches to contribute to how big-data-focused evaluations are designed and implemented? According to Pete York at Communityscience.com, the big challenge and opportunity for evaluators is to ensure that the scientific method becomes an essential part of the data analytics toolkit. Recent studies by the Global Environment Facility (GEF) illustrate some of the ways that big data from sources such as satellite images and remote sensors can be used to strengthen conventional quasi-experimental evaluation designs. In a number of evaluations, these data sources were combined with propensity score matching to select matched samples for pretest-posttest comparison group designs evaluating the effectiveness of programs to protect forest cover or mangrove reserves.
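As a purely illustrative sketch, and not the GEF’s actual analysis pipeline, the matching step described above might look something like this in Python, with simulated covariates standing in for satellite-derived measures such as baseline forest cover or distance to roads:

```python
# Hypothetical sketch of propensity score matching to build a comparison group;
# covariates are simulated stand-ins for satellite-derived measures.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Simulated villages: "treated" means lying inside a protected area.
n = 500
X = rng.normal(size=(n, 3))  # e.g. baseline forest cover, slope, distance to roads
treated = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))  # treatment correlated with covariates

# 1. Estimate propensity scores P(treated | covariates).
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# 2. Match each treated unit to the untreated unit with the closest propensity score.
treated_idx = np.flatnonzero(treated == 1)
control_idx = np.flatnonzero(treated == 0)
nn = NearestNeighbors(n_neighbors=1).fit(ps[control_idx].reshape(-1, 1))
_, neighbor = nn.kneighbors(ps[treated_idx].reshape(-1, 1))
matched_controls = control_idx[neighbor.ravel()]

# 3. Outcomes (e.g. change in forest cover measured from later satellite images)
#    would then be compared between treated units and their matched controls.
print(f"{len(treated_idx)} treated units matched to "
      f"{len(np.unique(matched_controls))} unique comparison units")
```

In practice, covariate balance and common support would need to be checked before trusting any pretest-posttest comparison built on such a match.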

Key Question: Assuming there will be a significant change in how the evaluation function is organized and managed, what will be required to bridge the gap between evaluators and data analysts? How likely is it that the evaluators will be able to assume this new role and how likely is it that organizations will make the necessary adjustments to facilitate these transformations?

What do you think? How will these scenarios play out?

Note: Stay tuned for Michael’s next post focusing on how to build bridges between evaluators and big data analysts.

Below are some useful references if you’d like to read more on this topic:

Anderson, C. (2008). “The end of theory: The data deluge makes the scientific method obsolete.” Wired Magazine, 23 June 2008. The original article in the debate over whether big data analytics requires a theoretical framework.

Bamberger, M., Raftree, L. and Olazabal, V. (2016). The role of new information and communication technologies in equity-focused evaluation: opportunities and challenges. Evaluation, 22(2), 228–244. A discussion of the ethical issues and challenges of new information technology.

Bamberger, M. (2017). Integrating big data into the monitoring and evaluation of development programs. UN Global Pulse with support from the Rockefeller Foundation. A review of progress in incorporating new information technology into development programs, and of the opportunities and challenges of building bridges between evaluators and big data specialists.

Meier, P. (2015). Digital Humanitarians: How big data is changing the face of humanitarian response. CRC Press. A review, with detailed case studies, of how digital technology is being used by NGOs and civil society.

O’Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. Crown Books. How widely used digital algorithms negatively affect poor and marginalized sectors of society.

Petersson, G.K. and Breul, J.D. (Eds.) (2017). Cyber society, big data and evaluation. Comparative Policy Evaluation, Volume 24. Transaction Publishers. The evolving role of evaluation in cyber society.

Wolf, G. The quantified self [TED Talk]. A quick overview of the multiple self-monitoring measurements that you can collect on yourself.

World Bank (2016). World Development Report 2016: Digital Dividends. An overview of how the expansion of digital technology is affecting all areas of our lives.