The rapid growth of Artificial Intelligence—computers behaving like humans, and performing tasks which people usually carry out—promises to transform everything from car travel to personal finance. But how will it affect the equally vital field of M&E? As evaluators, most of us hate paper-based data collection—and we know that automation can help us process data more efficiently. At the same time, we’re afraid to remove the human element from monitoring and evaluation: What if the machines screw up?
Over the past year, Souktel has worked on three areas of AI-related M&E, to determine where new technology can best support project appraisals. Here are our key takeaways on what works, what doesn’t, and what might be possible down the road.
Natural Language Processing
For anyone who’s sifted through thousands of Excel entries, natural language processing sounds like a silver bullet: This application of AI interprets text responses rapidly, often matching them against existing data sets to find trends. No need for humans to review each entry by hand! But currently, it has two main limitations: First, natural language processing works best for sentences with simple syntax. Throw in more complex phrases, or longer text strings, and the power of AI to grasp open-ended responses goes downhill. Second, natural language processing only works for a limited number of (mostly European) languages—at least for now. English and Spanish AI applications? Yes. Chichewa or Pashto M&E bots? Not yet. Given these constraints, we’ve found that AI apps are strongest at interpreting basic misspelled answer text during mobile data collection campaigns (in languages like English or French). They’re less good at categorizing open-ended responses by qualitative category (positive, negative, neutral). Yet despite these limitations, AI can still help evaluators save time.
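As a rough illustration of the misspelled-answer case described above, here is a minimal sketch using fuzzy string matching from Python's standard library. It is not the system Souktel built; the answer list, the function name `normalize_answer`, and the 0.75 similarity cutoff are all assumptions chosen for the example.

```python
from difflib import get_close_matches

# Hypothetical answer options for a single survey question.
VALID_ANSWERS = ["maize", "cassava", "rice", "sorghum"]

def normalize_answer(raw, choices=VALID_ANSWERS, cutoff=0.75):
    """Return the closest valid answer, or None if nothing is similar enough."""
    cleaned = raw.strip().lower()
    # get_close_matches ranks choices by similarity and drops any below cutoff.
    matches = get_close_matches(cleaned, choices, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(normalize_answer("  Maiz "))   # a near miss is mapped to a valid option
print(normalize_answer("banana"))    # an unrelated word is left unmatched
```

Answers that come back as `None` would still need a human reviewer, which matches the point above: this kind of matching handles simple spelling variation, not open-ended meaning.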
AI does a decent job of telling objects apart; we’ve leveraged this to build mobile applications which track supply delivery more quickly & cheaply. If a field staff member submits a photo of syringes and a photo of bandages from their mobile, we don’t need a human to check “syringes” and “bandages” off a list of delivered items. The AI-based app will do that automatically—saving huge amounts of time and expense, especially during crisis events. Still, there are limitations here too: While AI apps can distinguish between a needle and a BandAid, they can’t yet tell us whether the needle is broken, or whether the BandAid is the exact same one we shipped. These constraints need to be considered carefully when using AI for inventory monitoring.
Comparative Facial Recognition
This may be the most exciting—and controversial—application of AI. The potential is huge: “Qualitative evaluation” takes on a whole new meaning when facial expressions can be captured by cameras on mobile devices. On a more basic level, we’ve been focusing on solutions for better attendance tracking: AI is fairly good at determining whether the people in a photo at Time A are the same people in a photo at Time B. Snap a group pic at the end of each community meeting or training, and you can track longitudinal participation automatically. Take a photo of a larger crowd, and you can rapidly estimate the number of attendees at an event.
However, AI applications in this field have been notoriously bad at recognizing diversity—possibly because they draw on databases of existing images, and most of those images contain…white men. New MIT research has suggested that “since a majority of the photos used to train [AI applications] contain few minorities, [they] often have trouble picking out those minority faces”. For the communities where many of us work (and come from), that’s a major problem.
Do’s and Don’ts
So, how should M&E experts navigate this imperfect world? Our work has yielded a few “quick wins”—areas where Artificial Intelligence can definitely make our lives easier: Tagging and sorting quantitative data (or basic open-ended text), simple differentiation between images and objects, and broad-based identification of people and groups. These applications, by themselves, can be game-changers for our work as evaluators—despite their drawbacks. And as AI keeps evolving, its relevance to M&E will likely grow as well. We may never reach the era of robot focus group facilitators—but if robo-assistants help us process our focus group data more quickly, we won’t be complaining.
by Alvaro Cobo-Santillan, Catholic Relief Services (CRS); Jeff Lundberg, CRS; Paul Perrin, University of Notre Dame; and Gillian Kerr, LogicalOutcomes Canada.
In the year 2017, with all of us holding a mini-computer at all hours of the day and night, it’s probably not too hard to imagine that “A teenager in Africa today has access to more information than the President of United States had 15 years ago”. So it also stands to reason that the ability to appropriately and ethically grapple with the use of that immense amount of information has grown proportionately.
What do we mean when we say that the world of development—particularly evaluation—data is murky? A major factor in this sentiment is the ambiguous polarity between research and evaluation data.
“Research seeks to prove; evaluation seeks to improve.” – CDC
“Research studies involving human subjects require IRB review. Evaluative studies and activities do not.”
This has led to debates as to the actual relationship between research and evaluation. Some see them as related but separate activities, others see evaluation as a subset of research, and still others might posit that research is a specific case of evaluation.
But regardless, though motivations of the two may differ, research and evaluation look the same due to their stakeholders, participants, and methods.
If that statement is true, then we must hold both to similar protections!
What are some ways to make the waters less murky?
Deeper commitment to informed consent
Reasoned use of identifiers
Need to know vs. nice to know
Data security and privacy protocols
Data use agreements and protocols for outside parties
Revisit NGO primary and secondary data IRB requirements
Alright then, what can we practically do within our individual agencies to move the needle on data protection?
In short, governance. Responsible data is absolutely a crosscutting responsibility, but can be primarily championed through close partnerships between the M&E and IT departments.
Think about ways to increase usage of digital M&E – this can ease the implementation of R&D
Can existing agency processes and resources be leveraged?
Plan and expect to implement gradual behavior change and capacity building as a pre-requisite for a sustainable implementation of responsible data protections
Think in an iterative approach. Gradually introduce guidelines, tools and training materials
Plan for business and technical support structures to support protections
Is anyone doing any of the practical things you’ve mentioned?
Yes! Gillian Kerr from LogicalOutcomes spoke about highlights from an M&E system her company is launching to provide examples of the type of privacy and security protections they are doing in practice.
As a basis for the mindset behind their work, she presented a fascinating and simple comparison of high-risk vs. low-risk personal information: the combination of year of birth, gender, and 3-digit zip code is unique for 0.04% of US residents, but if we instead include a 5-digit zip code, over 50% of US residents could be uniquely identified. Yikes.
In that vein, they do not collect names or identification, record only year of birth (not month or day), and seek to keep sensitive data to a minimum by classifying data elements by level of risk to the client (e.g. city of residence: low; glucose level: medium; HIV status: high).
In addition, they ask for permission not only in the original agency permission form, but also in each survey. Their technical system maintains two instances: one containing individual-level personal information, with tight permissions even for administrators, and another containing aggregated data with small cell sizes. Other security measures, such as multi-factor authentication and encryption, and critical governance measures, such as regular audits, are also in place.
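To make the zip code comparison concrete, here is a small sketch of how one might measure the share of records that are unique on a given combination of fields. The records are made up for illustration, and the function name `share_unique` and the field names are assumptions, not part of the LogicalOutcomes system.

```python
from collections import Counter

def share_unique(records, keys):
    """Fraction of records whose combination of `keys` appears exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)

# Made-up records for illustration only.
records = [
    {"birth_year": 1980, "gender": "F", "zip5": "20001"},
    {"birth_year": 1980, "gender": "F", "zip5": "20002"},
    {"birth_year": 1975, "gender": "M", "zip5": "20001"},
    {"birth_year": 1975, "gender": "M", "zip5": "20005"},
]
for r in records:
    r["zip3"] = r["zip5"][:3]  # coarsen the zip code to its first three digits

fine = share_unique(records, ["birth_year", "gender", "zip5"])
coarse = share_unique(records, ["birth_year", "gender", "zip3"])
print(fine, coarse)
```

In this toy data every record is unique on the 5-digit combination and none is unique once the zip code is coarsened. Running the same check on a real dataset before collection or sharing is one way to decide which fields are "nice to know" rather than "need to know".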
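The presentation did not spell out how small cells are handled in the aggregated instance. One common disclosure-control practice, assumed here purely for illustration, is to suppress any count below a minimum cell size before it is shared. The threshold of 5, the field name, and the function name are all hypothetical.

```python
from collections import Counter

def aggregate_with_suppression(records, key, min_cell=5):
    """Count records per value of `key`, hiding counts below `min_cell`."""
    counts = Counter(r[key] for r in records)
    # None marks a suppressed cell; the true count stays in the restricted instance.
    return {value: (n if n >= min_cell else None) for value, n in counts.items()}

# Made-up client records: six in one city, two in another.
clients = [{"city": "Toronto"}] * 6 + [{"city": "Guelph"}] * 2
summary = aggregate_with_suppression(clients, "city")
print(summary)
```

The design choice is that the aggregated instance never needs to hold a number small enough to point to an individual.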
It goes without saying that we collectively have ethical responsibilities to protect personal information about vulnerable people – here are final takeaways:
If you can’t protect sensitive information, don’t collect it.
If you can’t keep up with current security practices, outsource your M&E systems to someone who can.
Your technology roadmap should aspire to give control of personal information to the people who provide it (a substantial undertaking).
In the meantime, be more transparent about how data is being stored and shared.
by Maliha Khan, a development practitioner in the fields of design, measurement, evaluation and learning, who led the Maturity Model sessions at MERL Tech DC, and Linda Raftree, independent consultant and lead organizer of MERL Tech.
MERL Tech is a platform for discussion, learning and collaboration around the intersection of digital technology and Monitoring, Evaluation, Research, and Learning (MERL) in the humanitarian and international development fields. The MERL Tech network is multidisciplinary and includes researchers, evaluators, development practitioners, aid workers, technology developers, data analysts and data scientists, funders, and other key stakeholders.
One key goal of the MERL Tech conference and platform is to bring people from diverse backgrounds and practices together to learn from each other and to coalesce MERL Tech into a more cohesive field in its own right — a field that draws from the experiences and expertise of these various disciplines. MERL Tech tends to bring together six broad communities:
traditional M&E practitioners, who are interested in technology as a tool to help them do their work faster and better;
development practitioners, who are running ICT4D programs and beginning to pay more attention to the digital data produced by these tools and platforms;
business development and strategy leads in organizations who want to focus more on impact and keep their organizations up to speed with the field;
tech people who are interested in the application of newly developed digital tools, platforms and services to the field of development, but may lack knowledge of the context and nuance of that application;
data people, who are focused on data analytics, big data, and predictive analytics, but similarly may lack a full grasp of the intricacies of the development field;
donors and funders who are interested in technology, impact measurement, and innovation.
Since our first series of Technology Salons on ICT and M&E in 2012 and the first MERL Tech conference in 2014, the aim has been to create stronger bridges between these diverse groups and encourage the formation of a new field with an identity of its own. In other words, the aim is to move people beyond identifying as, say, an “evaluator who sometimes uses technology,” and towards identifying as a member of the MERL Tech space (or field or discipline), with a clearer understanding of how these various elements work together and play off one another, and of how they influence (and are influenced by) the shifts and changes happening in the wider ecosystem of international development.
By building and strengthening these divergent interests and disciplines into a field of their own, we hope that the community of practitioners can begin to better understand their own internal competencies and what they, as a unified field, can offer to international development. This is a challenging prospect, as beyond their shared use of technology to gather, analyze, and store data and an interest in better understanding how, when, why, where, (etc.) these tools work for MERL and for development/humanitarian programming, there aren’t many similarities between participants.
At the MERL Tech London and MERL Tech DC conferences in 2017, we made a concerted effort to get to the next level in the process of creating a field. In London in February, participants created a timeline of technology and MERL and identified key areas that the MERL Tech community could work on strengthening (such as data privacy and security frameworks and more technological tools for qualitative MERL efforts). At MERL Tech DC, we began trying to understand what a ‘maturity model’ for MERL Tech might look like.
What do we mean by a ‘maturity model’?
Broadly, maturity models seek to qualitatively assess people/culture, processes/structures, and objects/technology to craft a predictive path that an organization, field, or discipline can take in its development and improvement.
Initially, we considered constructing a “straw” maturity model for MERL Tech and presenting it at the conference. The idea was that our straw model’s potential flaws would spark debate and discussion among participants. In the end, however, we decided against this approach because (a) we were worried that our straw model would unduly influence people’s opinions, and (b) we were not very confident in our own ability to construct a good maturity model.
Instead, we opted to facilitate a creative space over three sessions to encourage discussion on what a maturity model might look like, and what it might contain. Our vision for these sessions was to get participants to brainstorm in mixed groups containing different types of people; we didn’t want small subsets of participants to create models independently without the input of others.
In the first session, “Developing a MERL Tech Maturity Model”, we invited participants to consider what a maturity model might look like. Could we begin to imagine a graphic model that would enable self-evaluation and allow informed choices about how to best develop competencies, change and adjust processes and align structures in organizations to optimize using technology for MERL or indeed other parts of the development field?
In the second session, “Where do you sit on the Maturity Model?” we asked participants to use the ideas that emerged from our brainstorm in the first session to consider their own organizations and work, and compare them against potential maturity models. We encouraged participants to assess themselves on a scale from green (young sapling) through yellow (somewhere in the middle) to red (mature MERL Tech ninja!) and to strike up a conversation with other people in the breaks on why they chose that color.
In our third session, “Something old, something new”, we consolidated and synthesized the various concepts participants had engaged with throughout the conference. Everyone was encouraged to reflect on their own learning, lessons for their work, and what new ideas or techniques they may have picked up on and might use in the future.
The Maturity Models
As can be expected, when over 300 people take markers and crayons to paper, many a creative model emerges. We asked the participants to gallery walk the models over the next day during the breaks and vote on their favorites.
We won’t go into detail on what all 24 of the models showed, but some common themes emerged from the ones that got the most votes: almost all of the maturity models included dimensions (elements, components) and stages, and a depiction of potential progression from early stages to later stages across each dimension. They all also showed who the key stakeholders or players were, and some had details on what might be expected of them at different stages of maturity.
Two of the models (MERLvana and the Data Appreciation Maturity Model, or DAMM) depicted the notion that reaching maturity was never really possible and the process was an almost infinite loop. As the presenters of MERLvana explained, “it’s impossible to reach the ideal state, but one must keep striving for it, in ever closer and tighter loops with fewer and fewer gains!”
“MERL-tropolis” had clearly defined categories (universal understanding, learning culture and awareness, common principles, and programmatic strategy) and the structures/buildings needed for those (staff, funding, tools, standard operating procedures, skills).
The most popular was “The Data Turnpike”, which showed the route from the start of “Implementation with no data” to the finish line of “Technology, capacity and interest in data and adaptive management”, with all the pitfalls along the way (misuse, not timely, low ethics, etc.) marked to warn of the dangers.
As organizers of the session, we found the exercises both interesting and enlightening, and we hope they helped participants to begin thinking about their own MERL Tech practice in a more structured way. Participant feedback on the session fell at polar extremes. Some people loved the exercise and felt that it allowed them to step back and think about how they and their organization were approaching MERL Tech and how they could move forward more systematically with building greater capacities and higher quality work. Some told us that they left with clear ideas on how they would work within their organizations to improve and enhance their MERL Tech practice, and that they had a better understanding of how to go about that. A few did not like that we had asked them to “sit around drawing pictures” and some others felt that the exercise was unclear and that we should have provided a model instead of asking people to create one. [Note: This is an ongoing challenge when bringing together so many types of participants from such diverse backgrounds and varied ways of thinking and approaching things!]
We’re curious if others have worked with “maturity models” and if they’ve been applied in this way or to the area of MERL Tech before. What do you think about the models we’ve shared? What is missing? How can we continue to think about this field and strengthen our theory and practice? What should we do at MERL Tech London in March 2018 and beyond to continue these conversations?
by Amanda Makulec, Data Visualization Lead, Excella Consulting and Barb Knittel, Research, Monitoring & Evaluation Advisor, John Snow Inc. Amanda and Barb led “How the Simpsons Make Data Use Happen” at MERL Tech DC.
Workshopping ways to make data use happen.
Human centered design isn’t a new concept. We’ve heard engineers, from aerospace to software, quietly snicker as they’ve seen the enthusiasm for design thinking explode within the social good space in recent years. “To start with the end user in mind? Of course! How else would you create a product someone wants to use?”
However, in our work designing complex health information systems, dashboards, and other tools and strategies to improve data use, the idea of starting with the end user does feel relatively new.
Thinking back to graduate school nearly ten years ago, dashboard design classes focused on the functional skills, like how to use a pivot table in Excel, not on the complex processes of gathering user requirements to design something that could not only delight the end user, but be co-designed with them.
As part of designing for data use and data visualization design workshops, we’ve collaborated with design firms to find new ways to crack the nut of developing products and processes that help decisionmakers use information. Using design thinking tools like ranking exercises, journey maps, and personas has helped users identify and find innovative ways to address critical barriers to data use.
If you’re thinking about integrating design thinking approaches into data-centered projects, here are our five key considerations to take into account before you begin:
Design thinking is a mindset, not a workshop agenda. When you’re setting out to incorporate design thinking into your work, consider what that means throughout the project lifecycle. From continuous engagement and touchpoints with your data users to
Engage the right people – you need a diverse range of perspectives and experiences to uncover problems and co-create solutions. This means thinking of the usual stakeholders using the data at hand, but also engaging those adjacent to the data. In health information systems, this could be the clinicians reporting on the registers, the mid-level managers at the district health office, and even the printer responsible for distributing paper registers.
Plan for the long haul. Don’t limit your planning and projections of time, resources, and end user engagement to initial workshops. Coming out of your initial design workshops, you’ll likely have prototypes that require continued attention to functionally build and implement.
Focus on identifying and understanding the problem you’ll be solving. You’ll never be able to solve every problem and overcome every data use barrier in one workshop (or even in one project). Work with your users to develop a specific focus and thoroughly understand the barriers and challenges from their perspectives so you can tackle the most pressing issues (or choose deliberately to work on longer term solutions to the largest impediments).
The journey matters as much as the destination. One of the greatest ah-ha moments coming out of these workshops has been from participants who see opportunities to change how they facilitate meetings or manage teams by adopting some of the activities and facilitation approaches in their own work. Adoption of the prototypes shouldn’t be your only metric of success.
The Designing for Data Use workshops were funded by (1) USAID and implemented by the MEASURE Evaluation project and (2) the Global Fund through the Data Use Innovations Fund. Matchboxology was the design partner for both sets of workshops, and John Snow Inc. was the technical partner for the Data Use Innovations sessions. Learn more about the process and learning from the MEASURE Evaluation workshops in Applying User Centered Design to Data Use Challenges: What we Learned and see our slides from our MERL Tech session “The Simpsons, Design, and Data Use” to learn more.
It didn’t surprise me when I learned that — when Ministry of Finance officials conduct trainings on the Aid Management Platform for Village Chiefs, CSOs and citizens throughout the districts of Malawi — officials are almost immediately asked:
“What were the results of these projects? What were the outcomes?”
It didn’t just matter what development organizations said they would do; it also mattered what they actually did.
We’ve heard the same question echoed by a number of agriculture practitioners interviewed as part of the Initiative for Open Ag Funding. When asked what information they need to make better decisions about where and how to implement their own projects, many replied:
“We want to know — if [others] were successful — what did they do? If they weren’t successful, what shouldn’t we do?”
This interest in understanding what went right (or wrong) came not from wanting to point fingers, but from a genuine desire to learn. In considering how to publish and share data, the importance of (and interest in!) learning cannot be overstated.
At MERL Tech DC earlier this month, we decided to explore the International Aid Transparency Initiative (IATI) format, currently being used by organizations and governments globally for publishing aid and results data. For this hands-on exercise, we printed different types of projects from the D-Portal website, including any evaluation documents included in the publication. We then asked participants to answer the following questions about each project:
What were the successes of the project?
What could be replicated?
What are the pitfalls to be avoided?
Where did it fail?
Taryn Davis leading participants through using IATI results data at MERLTech DC
We then discussed whether participants were (or were not) able to answer these questions with the data provided. Here is the Good, the Bad, and the Ugly of what participants shared:
Many were impressed that this data — particularly the evaluation documents — were even shared and made public, not hidden behind closed doors.
For those analyzing evaluation documents, the narrative was helpful for answering our four questions, versus having just the indicators without any context.
One attendee noted that this data would be helpful in planning project designs for business development purposes.
There were challenges with data quality. For example, some data were missing units, making it difficult to tell whether the number “50” was a percent, a dollar amount, or another unit.
Some found the organizations’ evaluation formats easier to understand than what was displayed on D-Portal. Others were given evaluations with a more complex format, making it difficult to identify key takeaways. Overall, readability varied, and format matters. Sometimes fewer columns is more (readable). There is a fine line between not enough information (missing units) and a fire hose of information (gigantic documents).
Since the attachments included more content in narrative format, they were more helpful in answering our four questions than just the indicators that were entered in the IATI standard.
There were no visualizations for a quick takeaway on project success. A visual aid would help users understand “successes” and “failures” more quickly, without having to spend as much time digging and comparing; they could then spend more time looking at specific cases and focusing on the narrative.
Some data was missing time periods, making it hard to know how relevant it would be for those interested in using the data.
Data was often disorganized, and included spelling mistakes.
Reading data “felt like reading the SAT”: challenging to comprehend.
The data and documents weren’t typically forthcoming about challenges and lessons learned.
Participants weren’t able to discern any real, tangible learning that could be practically applied to other projects.
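Some of the data quality problems participants raised, such as missing units and missing time periods, are easy to check for before publishing. The sketch below shows one way to do that. The field names and sample records are hypothetical and do not follow the actual IATI schema; a real check would need to be mapped onto whatever format an organization publishes in.

```python
# Hypothetical field names; not taken from the IATI standard.
REQUIRED_FIELDS = ("value", "unit", "period_start", "period_end")

def find_missing_fields(results):
    """Return {indicator title: [missing field names]} for incomplete records."""
    problems = {}
    for record in results:
        missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
        if missing:
            problems[record.get("title", "<untitled>")] = missing
    return problems

# Made-up results records for illustration.
results = [
    {"title": "Farmers trained", "value": 50, "unit": "",
     "period_start": "2016-01-01", "period_end": "2016-12-31"},
    {"title": "Yield increase", "value": 12, "unit": "percent",
     "period_start": None, "period_end": None},
    {"title": "Wells built", "value": 8, "unit": "wells",
     "period_start": "2016-01-01", "period_end": "2016-12-31"},
]
report = find_missing_fields(results)
print(report)
```

A check like this only catches gaps in structure. It cannot tell whether a project's lessons were actually shared.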
Fortunately, the “Bad” elements can be relatively easily addressed. We’ve spent time reviewing results data for organizations published in IATI, providing feedback to improve data quality, and to make the data cleaner and easier to understand.
However, the “ugly” elements are really key for organizations that want to share their results data. To move beyond a “transparency gold star,” and achieve shared learning and better development, organizations need to ask themselves:
“Are we publishing the right information, and are we publishing it in a usable format?”
As we noted earlier, it’s not just the indicators that data users are interested in, but how projects achieved (or didn’t achieve) those targets. Users want to engage in the “L” in Monitoring, Evaluation, Research, and Learning (MERL). For organizations, this might be as simple as reporting “Citizens weren’t interested in adding quinoa to their diet so they didn’t sell as much as expected,” or “The Village Chief was well respected and supported the project, which really helped citizens gain trust and attend our trainings.”
This learning is important both for organizations internally, enabling them to understand and learn from the data, and for the wider development community. In hindsight, what do you wish you had known about implementing an irrigation project in rural Tanzania before you started? That’s what we should be sharing.
In order to do this, we must update our data publishing formats (and mindsets) so that we can answer questions like, “How did this project succeed? What can be replicated? What are the pitfalls to avoid? Where did it fail?” Answering these kinds of questions, and enabling actual learning, should be a key goal for all projects and programs, and it should not feel like an SAT exam every time we do so.
We are living in an increasingly quantified world.
There are multiple sources of data that can be generated and analyzed in real-time. They can be synthesized to capture complex interactions among data streams and to identify previously unsuspected linkages among seemingly unrelated factors [such as the purchase of diapers and increased sales of beer]. We can now quantify and monitor ourselves, our houses (even the contents of our refrigerator!), our communities, our cities, our purchases and preferences, our ecosystem, and multiple dimensions of the state of the world.
These rich sources of data are becoming increasingly accessible to individuals, researchers and businesses through huge numbers of mobile phone and tablet apps and user-friendly data analysis programs.
The influence of digital technology on international development is growing.
Many of these apps and other big data/data analytics tools are now being adopted by international development agencies. Due to their relatively low cost, ease of application, and accessibility in remote rural areas, the approaches are proving particularly attractive to non-profit organizations, and the majority of NGOs probably now use some kind of mobile phone app.
Apps are widely used for early warning systems, emergency relief, dissemination of information (to farmers, mothers, fishermen and other groups with limited access to markets), identifying and collecting feedback from marginal and vulnerable groups, and permitting rapid analysis of poverty. Data analytics are also used to create integrated databases that synthesize all of the information on topics as diverse as national water resources, human trafficking, updates on conflict zones, climate change and many other development topics.
Table 1: Widely used big data/data analytics applications in international development
Big data/data analytics tools
Early warning systems for natural and man-made disasters
Analysis of Twitter, Facebook and other social media
Analysis of radio call-in programs
Satellite images and remote sensors
Electronic transaction records [ATM, on-line purchases]
GPS mapping and tracking
Dissemination of information to small farmers, mothers, fishermen and other traders
Feedback from marginal and vulnerable groups and on sensitive topics
Rapid analysis of poverty and identification of low-income groups
Analysis of phone records
Social media analysis
Satellite images [e.g. using thatched roofs as a proxy indicator of low-income households]
Electronic transaction records
Creation of an integrated database synthesizing the multiple sources of data on a development topic
National water resources
Agricultural conditions in a particular region
Evaluation is lagging behind.
Surprisingly, program evaluation is the area that is lagging behind in terms of the adoption of big data/analytics. The few available studies report that a high proportion of evaluators are not very familiar with big data/analytics and significantly fewer report having used big data in their professional evaluation work. Furthermore, while many international development agencies have created data development centers within the past few years, many of these are staffed by data scientists (many with limited familiarity with conventional evaluation methods) and there are weak institutional links to agency evaluation offices.
A recent study on the current status of the integration of big data into the monitoring and evaluation of development programs identified a number of reasons for the slow adoption of big data/analytics by evaluation offices:
Weak institutional links between data development centers and evaluation offices
Differences of methodology and the approach to data generation and analysis
Issues concerning data quality
Concerns by evaluators about the commercial, political and ethical nature of how big data is generated, controlled and used.
Key questions for the future of evaluation in international development…
The above gives rise to two sets of questions concerning the future role of evaluation in international development:
The future direction of development evaluation. Given the rapid expansion of big data in international development, it is likely there will be a move towards integrated program information systems. These will begin to generate, analyze and synthesize data for program selection, design, management, monitoring, evaluation and dissemination. A possible scenario is that program evaluation will no longer be considered a specialized function that is the responsibility of a separate evaluation office, rather it will become one of the outputs generated from the program data base. If this happens, evaluation may be designed and implemented not by evaluation specialists using conventional evaluation methods (experimental and quasi-experimental designs, theory-based evaluation) but by data analysts using methods such as predictive analytics and machine learning.
Key Question: Is this scenario credible? If so, how widespread will it become and over what time horizon? Is it likely that evaluation will become one of the outputs of an integrated management information system? And if so, is it likely that many of the evaluation functions will be taken over by big data analysts?
The changing role of development evaluators and the evaluation office. We argued that currently many or perhaps most development evaluators are not very familiar with big data/analytics, and even fewer apply these approaches. There are both professional reasons (how evaluators and data scientists are trained) and organizational reasons (the limited formal links between evaluation offices and data centers in many organizations) that explain the limited adoption of big data approaches by evaluators. So, assuming the above scenario proves to be at least partially true, what will be required for evaluators to become sufficiently conversant with these new approaches to be able to contribute to how big data-focused evaluation approaches are designed and implemented? According to Pete York at Communityscience.com, the big challenge and opportunity for evaluators is to ensure that the scientific method becomes an essential part of the data analytics toolkit. Recent studies by the Global Environment Facility (GEF) illustrate some of the ways that big data from sources such as satellite images and remote sensors can be used to strengthen conventional quasi-experimental evaluation designs. In a number of evaluations, these data sources were combined with propensity score matching to select matched samples for pretest-posttest comparison group designs to evaluate the effectiveness of programs to protect forest cover or reserves for mangrove swamps.
Key Question: Assuming there will be a significant change in how the evaluation function is organized and managed, what will be required to bridge the gap between evaluators and data analysts? How likely is it that the evaluators will be able to assume this new role and how likely is it that organizations will make the necessary adjustments to facilitate these transformations?
What do you think? How will these scenarios play out?
Note: Stay tuned for Michael’s next post focusing on how to build bridges between evaluators and big data analysts.
Below are some useful references if you’d like to read more on this topic:
Bamberger, M., Raftree, L. and Olazabal, V. (2016) The role of new information and communication technologies in equity-focused evaluation: opportunities and challenges. Evaluation. Vol 22(2) 228–244. A discussion of the ethical issues and challenges with new information technology.
Meier, P. (2015) Digital Humanitarians: How big data is changing the face of humanitarian response. CRC Press. A review, with detailed case studies, of how digital technology is being used by NGOs and civil society.
O’Neil, C. (2016) Weapons of math destruction: How big data increases inequality and threatens democracy. Crown Books. How widely-used digital algorithms negatively affect the poor and marginalized sectors of society.
Petersson, G.K. and Breul, J.D. (editors) (2017) Cyber society, big data and evaluation. Comparative policy evaluation. Volume 24. Transaction Publications. The evolving role of evaluation in cyber society.
by Linda Raftree, Independent Consultant and MERL Tech Organizer
It can be overwhelming to get your head around all the different kinds of data and the various approaches to collecting or finding data for development and humanitarian monitoring, evaluation, research and learning (MERL).
Though there are many ways of categorizing data, lately I find myself conceptually organizing data streams into four general buckets when thinking about MERL in the aid and development space:
‘Traditional’ data. How we’ve been doing things for(pretty much)ever. Researchers, evaluators and/or enumerators are in relative control of the process. They design a specific questionnaire or a data gathering process and go out and collect qualitative or quantitative data; they send out a survey and request feedback; they do focus group discussions or interviews; or they collect data on paper and eventually digitize the data for analysis and decision-making. Increasingly, we’re using digital tools for all of these processes, but they are still quite traditional approaches (and there is nothing wrong with traditional!).
‘Found’ data. The Internet, digital data and open data have made it much easier to find, share, and re-use datasets collected by others, whether this is internally in our own organizations, with partners or just in general. These tend to be datasets collected in traditional ways, such as government or agency data sets. In cases where the datasets are digitized and have proper descriptions, clear provenance, consent has been obtained for use/re-use, and care has been taken to de-identify them, they can eliminate the need to collect the same data over again. Data hubs are springing up that aim to collect and organize these data sets to make them easier to find and use.
‘Seamless’ data. Development and humanitarian agencies are increasingly using digital applications and platforms in their work, whether bespoke or commercially available ones. Data generated by users of these platforms can provide insights that help answer specific questions about their behaviors, and the data is not limited to quantitative data. This data is normally used to improve applications and platform experiences, interfaces, content, etc. but it can also provide clues into a host of other online and offline behaviors, including knowledge, attitudes, and practices. One cautionary note is that because this data is collected seamlessly, users of these tools and platforms may not realize that they are generating data or understand the degree to which their behaviors are being tracked and used for MERL purposes (even if they’ve checked “I agree” to the terms and conditions). This has big implications for privacy that organizations should think about, especially as new regulations are being developed such as the EU’s General Data Protection Regulation (GDPR). The commercial sector is great at this type of data analysis, but the development sector is only just starting to get more sophisticated at it.
‘Big’ data. In addition to data generated ‘seamlessly’ by platforms and applications, there are also ‘big data’ and data that exists on the Internet that can be ‘harvested’ if one only knows how. The term ‘Big data’ describes the application of analytical techniques to search, aggregate, and cross-reference large data sets in order to develop intelligence and insights. (See this post for a good overview of big data and some of the associated challenges and concerns). Data harvesting is a term used for the process of finding and turning ‘unstructured’ content (message boards, a webpage, a PDF file, Tweets, videos, comments), into ‘semi-structured’ data so that it can then be analyzed. (Estimates are that 90 percent of the data on the Internet exists as unstructured content). Currently, big data seems to be more apt for predictive modeling than for looking backward at how well a program performed or what impact it had. Development and humanitarian organizations (self included) are only just starting to better understand concepts around big data and how it might be used for MERL. (This is a useful primer).
Thinking about these four buckets of data can help MERL practitioners to identify data sources and how they might complement one another in a MERL plan. Categorizing them as such can also help to map out how the different kinds of data will be responsibly collected/found/harvested, stored, shared, used, and maintained/ retained/ destroyed. Each type of data also has certain implications in terms of privacy, consent and use/re-use and how it is stored and protected. Planning for the use of different data sources and types can also help organizations choose the data management systems needed and identify the resources, capacities and skill sets required (or needing to be acquired) for modern MERL.
Organizations and evaluators are increasingly comfortable using mobile phones and/or tablets to do traditional data gathering, but they often are not using ‘found’ datasets. This may be because these datasets are not very ‘find-able,’ because organizations are not creating them, because re-using data is not a common practice for them, because the data are of questionable quality/integrity, because there are no descriptors, or for a variety of other reasons.
The use of ‘seamless’ data is something that development and humanitarian agencies might want to get better at. Even though large swaths of the populations that we work with are not yet online, this is changing. And if we are using digital tools and applications in our work, we shouldn’t let that data go to waste if it can help us improve our services or better understand the impact and value of the programs we are implementing. (At the very least, we had better understand what seamless data the tools, applications and platforms we’re using are collecting so that we can manage data privacy and security of our users and ensure they are not being violated by third parties!)
Big data is also new to the development sector, and there may be good reason it is not yet widely used. Many of the populations we are working with are not producing much data, though this is also changing as digital financial services and mobile phone use have become almost universal and the use of smart phones is on the rise. Normally organizations require new knowledge, skills, partnerships and tools to access and use existing big data sets or to do any data harvesting. Some say that big data along with ‘seamless’ data will one day replace our current form of MERL. As artificial intelligence and machine learning advance, who knows… (and it’s not only MERL practitioners who will be out of a job, but that’s a conversation for another time!)
Not every organization needs to be using all four of these kinds of data, but we should at least be aware that they are out there and consider whether they are of use to our MERL efforts, depending on what our programs look like, who we are working with, and what kind of MERL we are tasked with.
I’m curious how other people conceptualize their buckets of data, and where I’ve missed something or defined these buckets erroneously…. Thoughts?
By Hur Hassnain, Monitoring, Evaluation, Accountability and Learning Adviser, War Child UK
At the 2017 MERL Tech London conference, my team and I gave a presentation that addressed the possibilities for and limitations of evaluating complex situations using simple Excel-based tools. The question we explored was: can Excel help us manipulate data to create predictive models and suggest promising avenues to project success? Our basic answer was “not yet,” at least not to its full extent. However, there are people working with accessible software like Excel to make analysis simpler for evaluators with less technical expertise.
In our presentation, Rick Davies, Mark Skipper and I showcased EvalC3, an Excel-based evaluation tool that enables users to easily identify sets of attributes in a project dataset and to then compare and evaluate the relevance of these attributes to achieving the desired outcome. In other words, it helps answer the question ‘what combination of factors helped bring about the results we observed?’ In the presentation, after we explained what EvalC3 is and gave a live demonstration of how it works, we spoke about our experience using it to analyze real data from a UNICEF-funded War Child UK project in Afghanistan, a project that helps children who have been deported back to Afghanistan from Iran.
Our team first learned of EvalC3 when, upon returning from a trip to our Afghanistan country programme, we discussed how our M&E team in Afghanistan uses Excel for storing and analysing data but is not able to use the software to explore or evaluate complex causal configurations. We reached out to Rick with this issue, and he introduced us to EvalC3. It sounded like the solution to our problem, and our M&E officer in Afghanistan decided to test it by using it to dig deeper into an Excel database he’d created to store data on one thousand children who were registered when they were deported to Afghanistan.
Rick, Hosain Hashmi (our M&E Officer in Afghanistan) and I formed a working group on Skype to test drive EvalC3. First, we needed to clean the data. To do this, we asked our social workers to contact the children and their caretakers to collect important missing data. Missing data is a common problem when collecting data in fragile and conflict affected contexts like those where War Child works. Fortunately, we found that EvalC3 algorithms can work with some missing data, with the tradeoff being slightly less accurate measures of model performance. Compare this to other algorithms (like Quine-McCluskey used in QCA) which do not work at all if the data is missing for some variables. We also had to reduce the number of dimensions we used. If we did not, there could be millions of combinations that could be possible outcome predictors, and an algorithm could not search all of these possibilities in a reasonable span of time. This exercise spoke to M. A. Munson’s theory that “model building only consumes 14% of the time spent on a typical [data mining] project; the remaining time is spent on the pre and post processing steps”.
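To get a feel for why the number of dimensions had to be reduced, here is a rough back-of-the-envelope sketch in Python. It assumes (our simplification, not a description of EvalC3's internals) that every attribute is binary and that a candidate model can require each attribute to be present, require it to be absent, or ignore it. Under that assumption, n attributes give 3^n - 1 possible non-empty combinations.

```python
def count_configurations(n_attributes: int) -> int:
    """Number of candidate attribute combinations for n binary attributes.

    Assumption for illustration: each attribute is either required present,
    required absent, or left out of the model, and the empty model
    (everything left out) is not counted.
    """
    if n_attributes < 0:
        raise ValueError("n_attributes must be non-negative")
    return 3 ** n_attributes - 1


# The search space grows very quickly as attributes are added.
for n in (5, 10, 13, 20):
    print(f"{n} attributes -> {count_configurations(n):,} combinations")
```

With these assumptions, 13 attributes already produce more than 1.5 million combinations, and 20 attributes produce several billion, which is why trimming the attribute list matters before any search begins.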
With a few weeks of work on the available dataset of children deported from Iran, we found that the children who are most likely to go back to Iran for economic purposes are mainly the children who:
Are living with friends (instead of with relatives/caretakers)
Had not been doing farming work when they were in Iran
Had not completed 3 months of vocational training
Are from adult-headed households (instead of from child-headed households).
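As a rough illustration of what it means to test a combination of attributes like this against a dataset, the Python sketch below scores one rule against a handful of cases by counting true and false positives and negatives. This is one common way to score a predictive rule and is not taken from EvalC3 itself. The field names are hypothetical, and the four rows are invented purely to make the example run; they are not War Child data.

```python
from typing import Dict, List

Case = Dict[str, bool]


def confusion_counts(cases: List[Case], rule: Dict[str, bool], outcome: str) -> Dict[str, int]:
    """Count TP/FP/FN/TN for a rule (attribute -> required value) against an outcome."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for case in cases:
        # The rule "predicts" the outcome when every required attribute value matches.
        predicted = all(case[attr] == required for attr, required in rule.items())
        actual = case[outcome]
        if predicted and actual:
            counts["TP"] += 1
        elif predicted:
            counts["FP"] += 1
        elif actual:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def accuracy(counts: Dict[str, int]) -> float:
    """Share of cases where the rule's prediction matched the observed outcome."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total if total else 0.0


# Invented toy rows with hypothetical field names (NOT real project data).
toy_cases = [
    {"lives_with_friends": True, "did_farm_work": False, "returned": True},
    {"lives_with_friends": True, "did_farm_work": True, "returned": False},
    {"lives_with_friends": False, "did_farm_work": False, "returned": True},
    {"lives_with_friends": True, "did_farm_work": False, "returned": False},
]
toy_rule = {"lives_with_friends": True, "did_farm_work": False}

toy_counts = confusion_counts(toy_cases, toy_rule, "returned")
print(toy_counts, accuracy(toy_counts))
```

Comparing these counts across different candidate rules is what lets an analyst say that one combination of attributes predicts the outcome better than another.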
As the project is still ongoing, we will continue to investigate the cases covered by the model described here in order to better understand the causal mechanisms at work.
This experience of using EvalC3 encouraged War Child to refine the data it routinely collects with a view to developing a better understanding of where War Child interventions help or don’t help. The in-depth data-mining process and analysis conducted by the national M&E Officer and programmes team resulted in improved understanding of the results we can achieve by analyzing quality data. EvalC3 is a user-friendly evaluation tool that is useful not only in improving current programmes but also in designing new, evidence-based programmes.
By Claire Benard, formerly of Crisis UK and now with National Council for Voluntary Organizations (NCVO).
Most people who work with data in MERL will have heard of R. Some people will have been properly introduced to it, but only a few will invest the necessary time in learning how to use it. Being a relatively late convert, I wanted to share my experience of moving from a traditional data analysis software package to a language based one, so I did a Lightning Talk at MERL Tech London. (You can watch the video below.)
First things first, what is R?
Aside from being the 18th letter of the alphabet, R is also a language and environment for statistical computing and graphics.
But wait, you say… why should I use it?
This is what the five-minute video below is about, but in short, here are a few reasons:
There is nothing your current software package does that R doesn’t do.
R is free.
Using a programming language makes the analysis easy to reproduce, whether it’s because you need to produce similar analysis year on year or because you have a team of analysts who need to collaborate and understand each other’s work.
R is an open source technology. People from all backgrounds contribute to it and make new tools available for free regularly. This is your insurance to stay at the cutting edge of what is being developed.
Well, then, how do I get started? you wonder…
If you’re more MERL than Tech, learning a new programming language can be daunting. There is a time and money cost to it and it’s hard to know where to start if you’re on your own.
In the video, I give a few tips. It’s also worth checking out free/cheap training online (for example here or here), looking out for a user group near you, and getting advice from blogs, forums and newsletters.
Post by Julia Connors of Voltaicsystems. Email Julia with questions: firstname.lastname@example.org
What is solar for M&E?
Solar technology can be extremely useful for M&E projects in areas with minimal or inconsistent access to power. Portable solar chargers can eliminate power constraints and keep phones and tablets reliably charged up in the field.
In this post we’ll discuss:
How to decide if solar is right for your project
How to properly size a solar charging system to meet your needs
Do you really need solar?
In many cases solar is not necessary and will simply add complexity and costs to your project. If your team can return every day to a central location with access to power, then the battery power of the tablet is sufficient in most scenarios. If not, we recommend implementing standard power saving tips to reduce power consumption while out collecting data.
POWER SAVING TIPS
If you do have daily access to the grid but find that users need to recharge at least once while out or need to spend more than one day without power, then add an external battery pack. This cost-effective option allows your team to have extra power without carrying a full solar charging system. To size a battery for your needs, skip down to ‘Step 3’ below.
If you don’t have reliable access to grid power, the next section will help you determine which size solar charging system is best for you.
Sizing your solar charger system
The key to making solar successful in your project is finding the best system for your needs. If a system is underpowered then your team can still run out of power when they’re collecting data. On the other hand, if your system is too powerful it will be heavier and more expensive than needed. We recommend the following three steps for sizing your solar needs:
Estimate your daily power consumption
Determine your minimum solar panel size
Determine your minimum battery size
Step 1: Estimate your daily power consumption
Once you have chosen the device you will be using in the field, it’s easy to determine your daily power consumption. First you’ll need to figure out the size of your device’s battery (in Watt hours). This can often be found by looking on the back of the battery itself or doing a quick Google search to find your device’s technical specifications.
Next, you’ll need to determine your battery usage per day. For example, if you use half of your device’s battery on a typical day of data collection, then your usage is 50%. If you need to recharge twice in one day, then your usage is 200%.
Once you have those numbers, use the formula below to find your daily power consumption:
Size of Device’s Battery (Wh) x Battery Usage (per day) = Daily Power Consumption (Wh/day)
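If you'd rather not do this by hand, the Step 1 formula can be sketched in a few lines of Python. The 20 Wh battery below is a made-up example value, not a recommendation, and the function name is our own.

```python
def daily_power_consumption_wh(battery_size_wh: float, usage_per_day: float) -> float:
    """Daily consumption (Wh/day) = battery size (Wh) x battery usage per day.

    usage_per_day is a fraction: 0.5 means half the battery is used on a
    typical day, 2.0 means the device needs two full charges per day.
    """
    if battery_size_wh <= 0:
        raise ValueError("battery_size_wh must be positive")
    if usage_per_day < 0:
        raise ValueError("usage_per_day cannot be negative")
    return battery_size_wh * usage_per_day


# Hypothetical device with a 20 Wh battery.
half_battery_day = daily_power_consumption_wh(20.0, 0.5)   # 50% usage
two_recharge_day = daily_power_consumption_wh(20.0, 2.0)   # 200% usage
print(half_battery_day, two_recharge_day)  # 10.0 40.0
```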
Step 2: Determine your minimum solar panel size
The larger your device, the bigger the solar panel (measured in Watts) you’ll need. This is because larger solar panels can generate more power from the sun than smaller panels. To determine the best solar panel size for your needs, use our formula below:
Daily Power Consumption (from Step 1) / Expected Hours of Good Sun* x 2 (Standard Power Loss Variable) = Solar Panel Minimum (Watts)
*We typically use 5 hours as a baseline for good sun and then adjust up or down depending on the conditions. High temperatures, clouds, or shading will reduce the power produced by the panel.
Since solar conditions change frequently throughout the day, we recommend choosing a panel that is 2-4 times the minimum size required.
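Here is a small Python sketch of the Step 2 calculation, including the 2-4x recommendation above. The defaults (5 hours of good sun, a power loss factor of 2) come from this post. The function names and the 10 Wh/day input are illustrative assumptions.

```python
from typing import Tuple


def minimum_panel_watts(daily_consumption_wh: float,
                        good_sun_hours: float = 5.0,
                        power_loss_factor: float = 2.0) -> float:
    """Minimum panel size (W) = daily consumption / hours of good sun x loss factor."""
    if good_sun_hours <= 0:
        raise ValueError("good_sun_hours must be positive")
    return daily_consumption_wh / good_sun_hours * power_loss_factor


def recommended_panel_range_watts(minimum_watts: float) -> Tuple[float, float]:
    """Recommended panel size range: 2 to 4 times the minimum."""
    return (2 * minimum_watts, 4 * minimum_watts)


# Hypothetical device that uses 10 Wh per day, with the 5-hour baseline.
panel_minimum = minimum_panel_watts(10.0)
panel_range = recommended_panel_range_watts(panel_minimum)
print(panel_minimum, panel_range)  # 4.0 (8.0, 16.0)
```

Lowering `good_sun_hours` for cloudy or shaded sites raises the minimum panel size, which matches the adjustment described in the footnote.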
Step 3: Determine minimum battery size
External batteries offer extra power storage so that your device will be charged when you need it. The battery acts as a perfect backup on cloudy and rainy days so it’s important to choose the right size for your device.
It can vary, but typically about 30% of power is lost in the transfer from the external battery to your device. Therefore, to determine the battery capacity needed for one day of use, we’ll use our power consumption data from Step 1 and divide by 0.7 (100% – 30% power loss).
Daily Power Consumption (Wh/day) / 0.7 = Battery Capacity Needed for 1 Day of Use (Wh)
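The Step 3 calculation can be sketched the same way in Python. The 0.7 default reflects the roughly 30% transfer loss mentioned above. The function name and the 10 Wh/day input are our own illustrative assumptions.

```python
def battery_capacity_wh(daily_consumption_wh: float,
                        transfer_efficiency: float = 0.7) -> float:
    """External battery capacity (Wh) needed for one day of use.

    transfer_efficiency is the share of stored energy that actually reaches
    the device (0.7 means about 30% is lost in the transfer).
    """
    if not 0 < transfer_efficiency <= 1:
        raise ValueError("transfer_efficiency must be between 0 and 1")
    return daily_consumption_wh / transfer_efficiency


# Hypothetical device that uses 10 Wh per day.
capacity_needed = battery_capacity_wh(10.0)
print(round(capacity_needed, 1))  # 14.3
```

Because real transfer losses vary, it may be worth re-running the calculation with a lower efficiency to see how much headroom a given battery leaves.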
Picking the right system for your project
Now that you’ve done the math, you’re one step closer to choosing a solar charging system for your project. Since solar chargers come in many different forms, the last step to determining your perfect system is to think about how your team will be using the solar chargers in their work. It’s important to factor in storage for device/cables and how the user will be carrying the system.
Most users aren’t that technical, so having a pack that stores the battery and the device can simplify their experience (rather than handing over a battery and a panel that they need to figure out how to organize during their day). By simply finding the right style and size, you’ll experience higher usage rates and make your team’s solar-powered data collection go more smoothly.