Our first webinar in the series Emerging Data Landscapes in M&E, on "Geospatial, location and big data: Where have we been and where can we go?", was held on 28 July. We had a lively discussion on the use of these innovative technologies in the world of evaluation.
First, Estelle Raimondo, Senior Evaluation Officer at the World Bank Independent Evaluation Group, framed the discussion with her introduction, "Evaluation and emerging data: what are we learning from early applications?" She noted how COVID-19 has been an accelerator of change, pushing the evaluation community to explore new, innovative technologies to overcome today's challenges, and set the stage for the ethical, conceptual, and methodological considerations we now face.
Next came the first case study, "Integrating geospatial methods into evaluations: opportunities and lessons," from Anupam Anand, Evaluation Officer at the Global Environment Facility Independent Evaluation Office, and Hur Hassnain, Senior Evaluation Advisor, European Commission DEVCO/ESS. After providing an overview of the advantages of using satellite and remote sensing data, particularly in fragile and conflict zones, the presenters gave examples of its use in Syria and Sierra Leone.
The second case study, "Observing from space when you cannot observe from the field," was presented by Joachim Vandercasteelen, Young Professional at the World Bank Independent Evaluation Group. This example focused on using geospatial data to evaluate a biodiversity conservation project in Madagascar, as traveling to the field was not feasible. The presentation gave an overview of how to use such technology for both quantitative and qualitative assessments, as well as the downsides to consider.
The full recording of the webinar, including the PowerPoint presentations and the Questions & Answers session at the end, is available on the EES YouTube page.
Over the next month, we will release a dedicated blog post for each of the presentations, in which the speakers will answer the questions participants raised during the webinar that were not addressed in the Q&A and provide links to further reading on the subject. These will be publicly available on the EES Blog.
The year 2020 is a compelling time to look back and pull together lessons from five years of convening hundreds of monitoring, evaluation, research, and learning and technology practitioners who have joined us as part of the MERL Tech community. The world is in the midst of the global COVID-19 pandemic, and there is an urgent need to know what is happening, where, and to what extent. Data is a critical piece of the COVID-19 response — it can mean the difference between life and death. And technology use is growing due to stay-at-home orders and a push for “remote monitoring” and data collection from a distance.
At the same time, we’re witnessing (and I hope, also joining in with) a global call for justice — perhaps a tipping point — in the wake of decades of racist and colonialist systems that operate at the level of nations, institutions, organizations, the global aid and development systems, and the tech sector. There is no denying that these power dynamics and systems have shaped the MERL space as a whole, and the MERL Tech space as well.
Moments of crisis tend to test a field, and we live in extreme times. The coming decade will demand a nimble, adaptive, fair, and just use of data for managing complexity and for gaining longer-term understanding of change and impact. Perhaps most importantly, in 2020 and beyond, we need meaningful involvement of stakeholders at every level and openness to a re-shaping of our sector and its relationships and power dynamics.
It is in this time of upheaval and change that we are releasing a set of four papers that aim to take stock of the field from 2014-2019 as a launchpad for shaping the future of MERL Tech. In September 2018, the papers' authors began reviewing the past five years of MERL Tech events to identify lessons, trends, and issues in this rapidly changing field. They also reviewed the literature base in an effort to determine what we know, what we still need to understand about technology in MERL, and where the gaps in the formal literature lie. This is no longer a nascent field, yet it is one that is hard to keep up with, given that it is fast paced and constantly shifting with the advent of new technologies. We have learned many lessons over the past five years, but complex political, technical, and ethical questions remain.
The State of the Field series includes four papers:
What We Know About Traditional MERL Tech: Insights from a Scoping Review: Zach Tilton, Michael Harnar, and Michele Behr, Western Michigan University; Soham Banerji and Manon McGuigan, independent consultants; and Paul Perrin, Gretchen Bruening, John Gordley and Hannah Foster, University of Notre Dame; Linda Raftree, independent consultant and MERL Tech Conference organizer.
Through these papers, we aim to describe the State of the Field up to 2019 and to offer a baseline point in time from which the wider MERL Tech community can take action to make the next phase of MERL Tech development effective, responsible, ethical, just, and equitable. We share these papers as conversation pieces and hope they will generate more discussion in the MERL Tech space about where to go from here.
We’d like to start or collaborate on a second round of research to delve into areas that were under-researched or less developed. Your thoughts are most welcome on topics that need more research, and if you are conducting research about MERL Tech, please get in touch and we’re happy to share here on MERL Tech News or to chat about how we could work together!
Big data is a big topic in other sectors but its application within monitoring and evaluation (M&E) is limited, with most reports focusing more on its potential rather than actual use. Our paper, “Big Data to Data Science: Moving from ‘What’ to ‘How’ in the MERL Tech Space” probes trends in the use of big data between 2014 and 2019 by a community of early adopters working in monitoring, evaluation, research, and learning (MERL) in the development and humanitarian sectors. We focus on how MERL practitioners actually use big data and what encourages or deters adoption.
First, we collated administrative and publicly available MERL Tech conference data from the 281 sessions accepted for presentation between 2015 and 2019. Of these, we identified 54 sessions that mentioned big data and compared trends between sessions that did and did not mention this topic. In any given year from 2015 to 2019, 16 percent to 26 percent of sessions at MERL Tech conferences were related to the topic of big data. (Conferences were held in Washington DC, London, and Johannesburg).
Our quantitative analysis was complemented by 11 qualitative key informant interviews. We selected interviewees representing diverse viewpoints (implementers, donors, MERL specialists) and a range of subject matter expertise and backgrounds. During interviews, we explored why an interviewee chose to use big data, the benefits and challenges of using big data, reflections on the use of big data in the wider MERL tech community, and opportunities for the future.
Our findings indicate that MERL practitioners are in a fragmented, experimental phase, with use and application of big data varying widely, accompanied by shifting terminologies. One interviewee noted that “big data is sort of an outmoded buzzword” with practitioners now using terms such as ‘artificial intelligence’ and ‘machine learning.’ Our analysis attempted to expand the umbrella of terminologies under which big data and related technologies might fall. Key informant interviews and conference session analysis identified four main types of technologies used to collect big data: satellites, remote sensors, mobile technology, and M&E platforms, as well as a number of other tools and methods. Additionally, our analysis surfaced six main types of tools used to analyze big data: artificial intelligence and machine learning, geospatial analysis, data mining, data visualization, data analysis software packages, and social network analysis.
Barriers to adoption
We also took an in-depth look at barriers to and enablers of use of big data within MERL, as well as benefits and drawbacks. Our analysis found that perceived benefits of big data included enhanced analytical possibilities, increased efficiency, scale, data quality, accuracy, and cost-effectiveness. Big data is contributing to improved targeting and better value for money. It is also enabling remote monitoring in areas that are difficult to access for reasons such as distance, poor infrastructure, or conflict.
Concerns about bias, privacy, and the potential for big data to magnify existing inequalities arose frequently. MERL practitioners cited a number of drawbacks and limitations that make them cautious about using big data. These include lack of trust in the data (including mistrust from members of local communities); misalignment of objectives, capacity, and resources when partnering with big data firms and the corporate sector; and ethical concerns related to privacy, bias, and magnification of inequalities. Barriers to adoption include insufficient resources, absence of relevant use cases, lack of skills for big data, difficulty in determining return on investment, and challenges in pinpointing the tangible value of using big data in MERL.
Our paper includes a series of short case studies of big data applications in MERL. Our research surfaced a need for more systematic and broader sharing of big data use cases and case studies in the development sector.
The field of big data is rapidly evolving, so we expect that the field has already shifted since our research began in 2018. We recommend several steps for advancing big data and data science in the MERL space, including:
Considering. MERL Tech practitioners should examine relevant learning questions before deciding whether big data is the best tool for the MERL job at hand or whether another source or method could answer them just as well.
Piloting. Pilot testing of various big data approaches is needed in order to assess their utility and the value they add. Pilot testing should be collaborative; for example, an organization with strong roots at the field level might work with an agency that has technical expertise in relevant areas.
Documenting. The current body of documentation is insufficient to highlight relevant use cases and identify frameworks for determining return on investment in big data for MERL work. The community should do more to document efforts, experiences, successes, and failures in academic and gray literature.
Sharing. There is a hum of activity around big data in the vibrant MERL Tech community. We encourage the MERL Tech community to engage in fora such as communities of practice, salons, events, and other convenings, and to seek out less typical avenues for sharing information and learning, so as to avoid knowledge silos.
Learning. The MERL Tech space is not static; indeed, the terminology and applications of big data have shifted rapidly in the past 5 years and will continue to change over time. The MERL Tech community should participate in new training related to big data, continuing to apply critical thinking to new applications.
Guiding. Big data practitioners are crossing exciting frontiers as they apply new methods to research and learning questions. These new opportunities bring significant responsibility. MERL Tech programs serve people who are often vulnerable — but whose rights and dignity deserve respect. As we move forward with using big data, we must carefully consider, implement, and share guidance for responsible use of these new applications, always honoring the people at the heart of our interventions.
Guest post, Lauren Weiss, European Evaluation Society
As you may be aware, the European Evaluation Society’s biennial conference has been postponed to September 2021, due to the COVID-19 pandemic.
In the meantime, EES is continuing to work for you, and we are excited to announce the launch of two new initiatives.
First, our new podcast series, EvalEdge, is now available! It focuses on the role of evaluation in shaping how new and emerging technologies can be adapted in international development and in larger society. It explores the latest technological developments, from big data and geospatial analysis to blockchain and the Internet of Things (IoT).
Our first episode features MERL Tech’s co-founder Linda Raftree, who discusses innovative examples of using big data, the ethical considerations to be aware of, and much more! Check it out here!
Building on this momentum, EES is also launching a webinar series titled “Emerging Data Landscapes in M&E.” In partnership with Dev CAFÉ, MERL Tech, and the World Bank IEG, this series is devoted to discussing the use of innovative technologies in the world of evaluation.
This interactive and free webinar will provide concrete examples of using geospatial and location data to improve our M&E practices. It will also discuss the barriers to using such technologies and brainstorm on ways to overcome them, by inviting feedback and questions from the online audience.
It will include speakers from the World Bank IEG, the European Commission’s DEVCO/ESS, and the Global Environment Facility. You can find more information on our website.
Big data comes with big responsibilities, where both the funder and recipient organization have ethical and data security obligations.
Big data allows organizations to count and bring visibility to marginalized populations and to improve decision-making. However, concerns about data privacy, security, and integrity pose challenges for data collection and data preservation. What does informed consent look like in data collection? What are the potential risks we bring to populations? What are the risks of compliance?
The session highlighted three takeaways organizations should consider when approaching data security.
1) Language Barriers between Evaluators and Data Scientists
Both Roytman and Woods agreed that the divide between evaluators and data scientists stems from a lack of knowledge of each other's field. How do you ask a question when you don't know that you need to?
In Woods' experience, the monitoring and evaluation team and the IT team each have a role in data security, but they work independently. The rapidly evolving field of M&E leaves little time to stay attuned to data security needs. Additionally, an organization's limited resources can impede the IT team from supporting programmatic data security.
A potential solution ChildFund has considered is investing in an IT person with a focus on MERL who has experience and knowledge in the international or humanitarian sphere. However, many organizations fall short when it comes to financing data security. In addition, identifying an individual with these skills can be challenging.
2) Data Collection
Data breaches expose confidential information, which puts vulnerable populations at risk of exploitative use of their data and potential harm. As we gather data, we must ask what informed consent looks like. Are we communicating to beneficiaries the risks of releasing their personal information?
In Woods' experience, ChildFund approaches data security through a child-safeguarding lens across stakeholders and program participants, where all are responsible for data security. Its child safeguarding policy covers data security protocols and privacy; however, Woods noted that dissemination and implementation across countries remains a lingering question. Many in-country civil society organizations lack the capacity, knowledge, and resources to implement data security protocols, especially if they work in a country context that has no laws, regulations, or frameworks related to data security and privacy. Currently, ChildFund is advocating for refresher trainings so that everyone involved in its global partnerships stays up to date on organizational policy changes.
3) Data Preservation
The issue of data breaches is a privacy concern when organizations' data includes individuals' sensitive information, putting beneficiaries at risk of exploitation by bad actors. Roytman explained that specific actors, risks, and threats affect specific kinds of data, though humanitarian aid organizations are not always a primary target. Nonetheless, this shouldn't distract organizations from potential risks; rather, it should open discussion about how to identify and mitigate them.
Protecting sensitive data requires a proper security system, something that not all platforms provide, especially if they are free. Ultimately, security is a financial investment that requires time in order to avoid and mitigate risks and potential threats. To build support and investment in security, ChildFund is working with Dharma to pilot a small program demonstrating the use of big data analytics with a built-in data security system.
Roytman suggested approaching ethical concerns by applying the CIA triad: Confidentiality, Integrity, and Availability. There will always be tradeoffs, he said. If we don't properly invest in data security and mitigate potential risks, there will be additional challenges to data collection. If we don't understand data security, how can we ensure informed consent?
Many organizations find themselves doing more harm than good due to lack of funding. Big data can be an inexpensive approach to collecting large quantities of data, but if it leads to harm, there is a problem. This is a complex issue to resolve; however, as Roytman concluded, the opposite of complexity is not simplicity, but rather transparency.
MERL Tech DC kicked off with a pre-conference workshop on September 5th that focused on what the Blockchain is and how it could influence MEL.
The workshop was broken into four parts: 1) blockchain 101, 2) how the blockchain is influencing and could influence MEL, 3) case studies to demonstrate early lessons learned, and 4) outstanding issues and emerging themes.
This blog focuses and builds on the fourth area. At the end, we provide additional resources that will be helpful to all interested in exploring how the blockchain could disrupt and impact international development at large.
Workshop Takeaways and Afterthoughts
For our purposes here, we have distilled some of the key takeaways from the workshop. This section includes a series of questions that we will respond to and link to various related reference materials.
Who are the main blockchain providers and what are they offering?
Any time a new “innovation” is introduced into the international development space, potential users lack knowledge about what the innovation is, the value it can add, and the costs of implementing it. This lack of knowledge opens the door for “snake oil salesmen” who engage in predatory attempts to sell their services to users who don’t have the knowledge to make informed decisions.
We've seen this phenomenon play out with blockchain. Take, for example, the numerous Initial Coin Offerings (ICOs) that defrauded their investors, or the many instances of service providers offering low-quality blockchain education trainings and/or project solutions.
Education is the best defense against being taken advantage of by snake oil salesmen. If you're looking for general education about blockchain, we've included a collection of helpful tools in the table below. If your group is working to determine whether a blockchain solution is right for the problem at hand, the USAID Blockchain Primer offers easy-to-use decision trees that can help you. Beyond these, Mercy Corps has just published Block by Block, which outlines the attributes of various distributed ledgers along some very helpful lines that are useful when considering which distributed ledger technology to use.
Words of warning aside, there are agencies that provide genuine blockchain solutions. For a full list of providers please visit www.blockchainomics.tech, an information database run by The Development CAFE on all things blockchain.
Bottom Line: Beware the snake oil salesmen preaching the benefits of blockchain but silent on the feasibility of their solution. Unless the service provider is just as focused on your problem as you are, be wary that they may be trying to pitch a solution (viable or not) rather than solve the problem. Before approaching companies or service providers, always identify your problem and see whether blockchain is indeed a viable solution.
How does governance of the blockchain influence its sustainability?
In the past, we’ve seen technology-led social impact solutions make initial gains that diminished over time until there is no sustained impact. Current evidence shows that many solutions of this sort fail because they are not designed to solve a specific problem in a relevant ecosystem. This insight has given rise to the Digital Development Principles and the Ethical Considerations that should be taken into account for blockchain solutions.
Bottom Line: Impact is achieved and sustained by the people who use a tool. Thus, blockchain, as a tool, does not sustain impacts on its own. People do so by applying knowledge about the principles and ethics needed for impact. Understanding this, our next step is to generate more customized principles and ethical considerations for blockchain solutions through case studies and other desperately needed research.
How do the blockchain, big data, and Artificial Intelligence influence each other?
The blockchain is a new type of distributed ledger system that could have massive social implications. Big Data refers to the exponential increase in data we experience through the Internet of Things (IoT) and other data sources (Smart Infrastructure, etc.). Artificial Intelligence (AI) assists in identifying and analyzing this new data at exponentially faster rates than is currently the case.
The blockchain is a distributed ledger: in essence, a database of transactions. Like any other database, it is a repository, and it is contributing to the growth of Big Data. AI can be used to automate the process of data entry into the blockchain. This is how the three are connected.
The blockchain is considered a leading contender as the ledger of choice for big data because: 1) due to its distributed nature it can handle much larger amounts of data in a more secure fashion than is currently possible with cloud computing, and 2) it is possible to automate the way big data is uploaded to the blockchain. AI tools are easily integrated into blockchain functions to run searches and analyze data, and this opens up the capacity to collect, analyze and report findings on big data in a transparent and secure manner more efficiently than ever before.
Bit by Bit is a very readable and innovative overview of how to conduct social science research in the digital age of big data, artificial intelligence and the blockchain. It gives the reader a quality introduction into some of the dominant themes and issues to consider when attempting to evaluate either a technology lead solution or use technology to conduct social research.
Given its immutability, how can an adaptive management system work with the blockchain?
This is a critical point. The blockchain is an immutable record: it is almost impossible (meaning it has never been done, and there are no simulations in which current technology is able to take control of a properly designed blockchain) to hijack, hack, or alter. Thus the blockchain provides the security needed to mitigate corruption and facilitate audits.
This immutability does not preclude an adaptive management approach, however. Adaptive management requires small, iterative course corrections informed by quality data about what is and is not working. This record of data and course corrections is extremely valuable to replication efforts because it subverts the main barrier to replication: lack of data on what does and does not work. Hence, in this case, the immutability of the blockchain adds value to adaptive management. This is more a question of good adaptive management practices than of whether the blockchain is a viable tool for these purposes.
It is important to note that you can append information on blocks (not amend), so there will always be a record of previous mistakes (auditability), but the most recent layer of truth is what’s being viewed/queried/verified, etc. Hence, immutability is not a hurdle but a help.
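The append-not-amend pattern can be sketched in a few lines. This is a toy illustration, not any particular platform's API; all names and record fields below are invented for the example.

```python
import hashlib
import json


def block_hash(block):
    """Deterministic hash of a block's contents."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()


class AppendOnlyLedger:
    """Toy append-only ledger: records are never edited, only superseded."""

    def __init__(self):
        self.chain = []

    def append(self, record, supersedes=None):
        block = {
            "index": len(self.chain),
            "record": record,
            "supersedes": supersedes,  # index of the block being corrected, if any
            "prev_hash": block_hash(self.chain[-1]) if self.chain else "0" * 64,
        }
        self.chain.append(block)
        return block["index"]

    def latest_view(self):
        """The most recent 'layer of truth': superseded blocks are hidden, never erased."""
        superseded = {b["supersedes"] for b in self.chain if b["supersedes"] is not None}
        return [b["record"] for b in self.chain if b["index"] not in superseded]


ledger = AppendOnlyLedger()
first = ledger.append({"indicator": "wells built", "value": 10})
# A mistake is corrected by appending a new block, not editing the old one:
ledger.append({"indicator": "wells built", "value": 12}, supersedes=first)
```

Queries against `latest_view()` return only the corrected value, while the full chain preserves the original entry and its hash linkage as an audit trail, which is the sense in which immutability helps rather than hinders course correction.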
What are the first steps an organization should take when deciding on whether to adopt a blockchain solution?
Each problem that an organization faces is unique, but the following simple steps can help one make a decision:
Identify your problem (using tools such as Developmental Evaluation or Principles of Digital Development)
Understand the blockchain technology, concepts, functionality, requirements and cost
See if your problem can be solved by blockchain rather than a centralized database
Consider the advantages and disadvantages
Identify the right provider and work with them in developing the blockchain
Consider ethical principles and privacy concerns as well as other social inequalities
Deploy in pilot phases and evaluate the results using an agile approach
What can be done to protect PII and other sensitive information on a blockchain?
Blockchain uses cryptography to store its data, so PII and other information cannot be viewed by anyone other than those who have access to the 'keys.' While developing a blockchain, it's important to ensure that what goes in is protected and that access to it is regulated. Another critical step is promoting literacy on the use of blockchain and its features among stakeholders.
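One common pattern for protecting PII on an immutable ledger is to store only a keyed digest on-chain and keep the raw identity data (and the key) off-chain. The sketch below illustrates the idea with a keyed hash; the record fields are hypothetical, and a real deployment would involve proper key management.

```python
import hashlib
import hmac
import os

# Secret salt (key) held off-chain by the data controller; anyone who obtains
# the ledger alone cannot reverse the digest back to a person's identity.
SALT = os.urandom(32)


def pseudonymize(pii, salt):
    """Keyed hash of PII: the ledger stores this digest, never the raw value."""
    return hmac.new(salt, pii.encode(), hashlib.sha256).hexdigest()


# Only the digest goes on-chain; the raw PII stays in a protected system.
record_on_chain = {
    "participant": pseudonymize("Jane Doe, 1985-02-14", SALT),
    "service": "cash transfer",
}
```

The same person always maps to the same digest, so records remain linkable for monitoring purposes, but the digest reveals nothing about the individual without the off-chain key.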
References Correlated to Takeaways
This table organizes current reference materials as related to the main questions we discussed in the workshop. (The question is in the left-hand column, and the reference material, with a brief explanation and hyperlink, is in the right-hand column.)
Resources and Considerations
Who are the main blockchain platforms? Who are the providers and what are they offering?
IBM, ConsenSys, Microsoft, AWS, Cognizant, R3, and others are offering products and enterprise solutions.
Block by Block is a valuable comparison tool for assessing various platforms.
How does governance of the blockchain influence its sustainability?
See Beeck Center’s Blockchain Ethical Design Framework. Decentralization (how many nodes), equity amongst nodes, rules, transparency are all factors in long-term sustainability. Likewise the Principles for Digital Development have a lot of evidence behind them for their contributions to sustainability.
How do the blockchain, big data and Artificial Intelligence influence each other?
They can be combined in various ways to strengthen a particular service or product. There is no blanket approach, just as there is no blanket solution to any social impact problem. The key is to know the root cause of the problem at hand and how each tool, used separately and in conjunction, can address these root causes.
Given its immutability, how can an adaptive management system work with the blockchain?
Ask how mistakes are corrected when creating a customized solution or purchasing a product. Usually there will be a way to do so through an easy-to-use user interface.
What are the first steps an organization should take when deciding whether to adopt a blockchain solution?
Participate in demos, and test some of the solutions for your own purposes or use cases. Use the USAID Blockchain Primer and reach out to trusted experts to provide advice. Given that the blockchain is primarily open source code, once you have decided that a blockchain is a viable solution for your problem, GitHub is full of open source code that you can modify for your own purposes.
by Zach Tilton, a Peacebuilding Evaluation Consultant and a Doctoral Research Associate at the Interdisciplinary PhD in Evaluation program at Western Michigan University.
In 2013 Dan Ariely quipped, "Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…." In 2015 the metaphor was imported to the international development sector by Ben Ramalingam; in 2016 it became a MERL Tech DC lightning talk, and it has been ringing in our ears ever since. So, what about 2018? Well, unlike US national trends in teenage sex, there are some signals that big, or at least 'bigger,' data is continuing to make its way not only into the realm of digital development, but also into evaluation. I recently attended the 2018 MERL Tech DC pre-conference workshop Big Data and Evaluation, where participants were introduced to real ways practitioners are putting this trope to bed (sorry, not sorry). In this blog post I share some key conversations from the workshop, framed against the ethics of using this new technology, but to do that let me first provide some background.
I entered the workshop on my heels. Given the recent spate of security breaches and revelations about micro-targeting, 'Big Data' has been somewhat of a boogie-man for me and others. I have taken some pains to limit my digital data-footprint, have written passionately about big data and surveillance capitalism, and have long been skeptical of big data applications for serving marginalized populations in digital development and peacebuilding. As I found my seat before the workshop started, I thought, "Is it appropriate or ethical to use big data for development evaluation?" My mind caught hold of a 2008 Evaluation Café debate between evaluation giants Michael Scriven and Tom Cook on causal inference in evaluation and the ethics of Randomized Control Trials. After hearing Scriven's concerns about the ethics of withholding interventions from control groups, Cook asks, "But what about the ethics of not doing randomized experiments?" He continues, "What about the ethics of having causal information that is in fact based on weaker evidence and is wrong? When this happens, you carry on for years and years with practices that don't work, whose warrant lies in studies that are logically weaker than experiments provide."
While I sided with Scriven for most of that debate, this question haunted me. It reminded me of an explanation of structural violence by peace researcher Johan Galtung, who writes, "If a person died from tuberculosis in the eighteenth century it would be hard to conceive of this as violence since it might have been quite unavoidable, but if he dies from it today, despite all the medical resources in the world, then violence is present according to our definition." Galtung's intellectual work on violence deals with the difference between the potential and the actual realization, and with what increases that difference. There are real issues with data responsibility, algorithmic biases, and automated discrimination that need to be addressed. But if there are actually existing technologies and resources not being used to address social and material inequities in the world today, is this unethical, even violent? "What about the ethics of not using big data?" I asked myself back. The following are highlights of the actually existing resources for using big data in the evaluation of social amelioration.
Actually Existing Data
During the workshop, Kerry Bruce from Social Impact shared her personal mantra with participants: "We need to do a better job of secondary data analysis before we collect any more primary data." She challenged us to consider how to make use of the secondary data available to our organizations. She gave examples of potential big data sources such as satellite images, remote sensors, GPS location data, social media, internet searches, call-in radio programs, biometrics, administrative data, and integrated data platforms that merge many secondary data files such as public records and social service agency and client files. The key here is that there is a ton of actually existing data, much of it collected passively, digitally, and longitudinally. She noted real limitations to accessing existing secondary data, including donor reluctance to fund such work, limited training in appropriate methodologies within research teams, and differences in data availability between contexts. Still, to underscore the potential of secondary data, she shared a case study in which she led a team that used large amounts of secondary indirect data to identify ecosystems of modern-day slavery at a significantly lower cost than collecting the data first-hand. The outputs of this work will help pinpoint interventions and guide further research into the factors that may help predict and prescribe what works well for stopping people from becoming victims of slavery.
Actually Existing Tech (and Math)
Peter York from BCT Partners provided a primer on big data and data science, including the reality check that most of the work is the unsexy "ETL": the extraction, transformation, and loading of data. He contextualized the potential of the so-called big data revolution by reminding participants that the V's of big data (Velocity, Volume, and Variety) are made possible by the technological and social infrastructure of increasingly networked populations, whose digital connections enable the monitoring, capturing, and tracking of ever more aspects of our lives in an unprecedented way. He shared, "A lot of what we've done in research were hacks because we couldn't reach entire populations." With advances in the tech stacks and infrastructure that connect people and their internet-connected devices with each other and the cloud, the utility of inferential statistics and experimental design lessens when entire populations of users are producing observational behavior data. When this occurs, evaluators can apply machine learning to discover the naturally occurring experiments in big data sets, what Peter terms "Data-driven Quasi-Experimental Design." This is exactly what Peter does when he builds causal models to predict and prescribe better programs for child welfare and juvenile justice, automating outcome evaluation and taking cues from precision medicine.
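To make the unglamorous ETL step concrete, here is a minimal sketch (with entirely hypothetical field names and records, not code from the workshop) that extracts rows from a raw administrative export, transforms them by dropping incomplete records and normalizing values, and loads the result into an analysis-ready SQLite table:

```python
import csv
import io
import sqlite3

# Extract: a stand-in for a raw export pulled from a file, API, or database
# dump. The case IDs and fields are purely illustrative.
raw = io.StringIO(
    "case_id,enrolled,outcome\n"
    "A1,2019-03-04,placed\n"
    "A2,2019-05-12,pending\n"
    "A3,,placed\n"
)

# Transform: parse rows, discard records missing an enrollment date,
# and normalize the outcome labels.
rows = [
    (r["case_id"], r["enrolled"], r["outcome"].upper())
    for r in csv.DictReader(raw)
    if r["enrolled"]
]

# Load: write the cleaned rows into a SQLite table ready for analysis.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cases (case_id TEXT, enrolled TEXT, outcome TEXT)")
con.executemany("INSERT INTO cases VALUES (?, ?, ?)", rows)
count = con.execute("SELECT COUNT(*) FROM cases").fetchone()[0]
print(count)  # only the complete records survive the transform
```

In real projects the extract step is usually the messiest part; the point of the sketch is simply that most "data science" effort goes into plumbing like this before any modeling begins.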
One example of a naturally occurring experiment was the 1854 Broad Street cholera outbreak, in which physician John Snow used a dot map to identify a pattern that revealed the source of the outbreak: the Broad Street water pump. By finding patterns in the data, John Snow was able to lay the groundwork for rejecting the false Miasma Theory and replacing it with a prototypical Germ Theory. And although he was already skeptical of miasma theory, by using the data to inform his theory-building he was also practicing a form of prototypical Grounded Theory. Grounded theory is simply building theory inductively, after data collection and analysis, not before, resulting in theory that is grounded in data. Peter explained, "Machine learning is Grounded Theory on steroids. Once we've built the theory, found the pattern by machine learning, we can go back and let the machine learning test the theory." In effect, machine learning is like having a million John Snows to pore over data and find the naturally occurring experiments, or patterns, in the maps of reality that are big data.
A key aspect of the value of applying machine learning to big data is that patterns present themselves more readily in datasets that are 'wide' as opposed to 'tall.' Peter continued, "If you are used to datasets you are thinking in rows. However, traditional statistical models break down with more features, or more columns." So Peter, and evaluators like him who are applying data science to their evaluative practice, are evolving from traditional frequentist to Bayesian statistical approaches. While there is more to the distinction, the latter uses prior knowledge, or degrees of belief, to determine the probability of success, where the former does not. This distinction is significant for evaluators who want to move beyond predictive correlation to prescriptive evaluation. Peter expounded, "Prescriptive analytics is figuring out what will best work for each case or situation." For example, with prediction, we can state that a foster child with certain attributes has a 70% chance of not finding a home. Using the same data points with prescriptive analytics, we can find 30 children who are similar to that foster child and find out what they did to find a permanent home. In a way, using only predictive analytics can cause us to surrender, while including prescriptive analytics can cause us to endeavor.
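The prescriptive step Peter describes, finding similar cases with good outcomes and asking what they did differently, can be sketched in a few lines. This is a toy nearest-neighbor illustration with hypothetical feature vectors (age, number of placements, months in care) and invented case IDs, not Peter's actual model:

```python
import math

# Hypothetical historical cases: feature vector -> observed outcome.
# None of this is drawn from a real child-welfare dataset.
cases = {
    "c01": ((9, 3, 24), "permanent_home"),
    "c02": ((10, 4, 30), "no_home"),
    "c03": ((9, 4, 28), "permanent_home"),
    "c04": ((15, 7, 60), "no_home"),
    "c05": ((8, 2, 18), "permanent_home"),
}

def similar_cases(target, k):
    """Rank historical cases by Euclidean distance to the target case."""
    ranked = sorted(cases.items(), key=lambda kv: math.dist(kv[1][0], target))
    return ranked[:k]

# Predictive analytics might say this child is unlikely to find a home.
# Prescriptive analytics looks at the most similar children who DID find
# permanent homes, to ask what was done differently in those cases.
target = (9, 4, 26)
neighbors = similar_cases(target, k=3)
successes = [cid for cid, (_, outcome) in neighbors if outcome == "permanent_home"]
print(successes)  # the similar cases worth studying for what worked
```

A production system would use far richer features and a proper similarity model, but the logic is the same: the prediction flags the risk, and the neighborhood of successful similar cases suggests the intervention.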
The last category of existing resources for applying big data to evaluation was mostly captured by the comments of independent evaluation consultant Michael Bamberger. He spoke of the latent capacity that exists in evaluation professionals and teams, but noted that we are not taking full advantage of big data: "Big data is being used by development agencies, but less by evaluators in these agencies. Evaluators don't use big data, so there is a big gap."
He outlined two scenarios for the future of evaluation in this new wave of data analytics: a state of divergence, where evaluators are replaced by big data analysts, and a state of convergence, where evaluators develop literacy in the principles of big data for their evaluative practice. One complication in this hypothetical, as Peter York noted, is that many data scientists are not interested in causation. To move toward the future of convergence, Michael shared how big data can enhance the evaluation cycle from appraisal and planning through monitoring, reporting, and evaluating sustainability. He went on to share a series of caveats, including issues with extractive versus inclusive uses of big data, the fallacy of large numbers, data quality control, and differing perspectives on theory, each of which could warrant its own blog post for development evaluation.
While I deepened my basic understanding of data analytics, including the tools and techniques, benefits and challenges, and guidelines for big data and evaluation, my biggest takeaway is reconsidering big data for social good through the ethical dilemma of not using existing data, tech, and capacity to improve development programs. We might even prescribe specific interventions by identifying their probable efficacy through predictive models before they are deployed.
As we all know, big data and data science are becoming increasingly important in all aspects of our lives. There is a similarly rapid growth in the applications of big data in the design and implementation of development programs. Examples range from the use of satellite images and remote sensors in emergency relief and the identification of poverty hotspots, to the use of mobile phones to track migration and estimate changes in income (by tracking airtime purchases), social media analysis to track sentiment and predict increases in ethnic tension, and the use of smartphones and Internet of Things (IoT) devices to monitor health through biometric indicators.
Despite the rapidly increasing role of big data in development programs, there is speculation that evaluators have been slower to adopt big data than have colleagues working in other areas of development programs. Some of the evidence for the slow take-up of big data by evaluators is summarized in “The future of development evaluation in the age of big data”. However, there is currently very limited empirical evidence to test these concerns.
To try to fill this gap, my colleagues Rick Davies and Linda Raftree and I would like to invite those of you who are interested in big data and/or the future of evaluation to complete the attached survey. This survey, which takes about 10 minutes to complete, asks evaluators to report on the data collection and data analysis techniques they use in the evaluations they design, manage or analyze, while at the same time asking data scientists how familiar they are with evaluation tools and techniques.
The survey was originally designed to obtain feedback from participants in the MERL Tech conferences on “Exploring the Role of Technology in Monitoring, Evaluation, Research and Learning in Development” that are held annually in London and Washington, DC, but we would now like to broaden the focus to include a wider range of evaluators and data scientists.
One of the ways in which the findings will be used is to help build bridges between evaluators and data scientists by designing integrated training programs for both professions that introduce the tools and techniques of both conventional evaluation practice and data science, and show how they can be combined to strengthen both evaluations and data science research. “Building bridges between evaluators and big data analysts” summarizes some of the elements of a strategy to bring the two fields closer together.
The findings of the survey will be shared through this and other sites, and we hope this will stimulate a follow-up discussion. Thank you for your cooperation and we hope that the survey and the follow-up discussions will provide you with new ways of thinking about the present and potential role of big data and data science in program evaluation.
This year at MERL Tech DC, in addition to the regular conference on September 6th and 7th, we’re offering two full-day, in-depth workshops on September 5th. Join us for a deeper look into the possibilities and pitfalls of Blockchain for MERL and Big Data for Evaluation!
What can Blockchain offer MERL? with Shailee Adinolfi, Michael Cooper, and Val Gandhi, co-hosted by Chemonics International, 1717 H St. NW, Washington, DC 20016.
Tired of the blockchain hype, but still curious how it will impact MERL? Join us for a full-day workshop with development practitioners who have implemented blockchain solutions with social impact goals in various countries. Gain knowledge of the technical promises and drawbacks of blockchain technology as it stands today, and brainstorm how it may be able to solve some of the challenges in MERL in the future. Learn about ethical design principles for blockchain and how to engage with blockchain service providers to ensure that your ideas and programs are realistic and avoid harm. See the agenda here.
Big Data and Evaluation with Michael Bamberger, Kerry Bruce and Peter York, co-hosted by the Independent Evaluation Group at the World Bank – “I” Building, Room: I-1-200, 1850 I St NW, Washington, DC 20006
Join us for a one-day, in-depth workshop on big data and evaluation where you’ll get an introduction to Big Data for Evaluators. We’ll provide an overview of applications of big data in international development evaluation, discuss ways that evaluators are (or could be) using big data and big data analytics in their work. You’ll also learn about the various tools of data science and potential applications, as well as run through specific cases where evaluators have employed big data as one of their methods. We will also address the important question as to why many evaluators have been slower and more reluctant to incorporate big data into their work than have their colleagues in research, program planning, management and other areas such as emergency relief programs. Lastly, we’ll discuss the ethics of using big data in our work. See the agenda here!
At MERL Tech London, 2018, we invited Michael Bamberger and Rick Davies to debate the question of whether the enthusiasm for Big Data in Evaluation is warranted. At their session, through a formal debate (skillfully managed by Shawna Hoffman from The Rockefeller Foundation) they discussed whether Big Data and Evaluation would eventually converge, whether one would dominate the other, how can and should they relate to each other, and what risks and opportunities there are in this relationship.
Following the debate, Michael and Rick wanted to continue the discussion — this time exploring the issues in a more conversational mode on the MERL Tech Blog, because in practice both of them see more than one side to the issue.
So, what do Rick and Michael think — will big data integrate with evaluation — or is it all just hype?
Rick: In the MERL Tech debate I put a lot of emphasis on the possibility that evaluation, as a field, would be overwhelmed by big data / data science rhetoric. But since then I have been thinking about a countervailing development, which is that evaluative thinking is pushing back against unthinking enthusiasm for the use of data science algorithms. I emphasise “evaluative thinking” rather than “evaluators” as a category of people, because a lot of this pushback is coming from people who would not identify themselves as evaluators. There are different strands to this evaluative response.
One is a social justice perspective, reflected in recent books such as "Weapons of Math Destruction," "Automating Inequality," and "Algorithms of Oppression," which emphasise the human cost of poorly designed and/or poorly supervised use of algorithms that apply large amounts of data to welfare and justice administration. Another strand is more like a form of exploratory philosophy, focused on how it might be possible to define "fairness" when designing and evaluating algorithms that have consequences for human welfare [see 1, 2, 3, 4]. Another strand is perhaps more technical in focus, but still has a value concern: the literature on algorithmic transparency. Without transparency it is difficult to assess fairness [see 5, 6]. Neural networks are often seen as a particular challenge. Associated with this are discussions about "the right to explanation" and what this means in practice [1].
In parallel there is also some infiltration of data science thinking into mainstream evaluation practice. DFID is funding the latest call of the World Bank's Strategic Impact Evaluation Fund (SIEF) for "nimble evaluations". These are described as rapid and low cost, likely to take the form of an RCT, but one focused on improving implementation rather than assessing overall impact. This type of RCT is directly equivalent to the A/B testing used by the internet giants to improve the way their platforms engage with their users. Hopefully these nimble approaches will bring a more immediate benefit to people's lives than RCTs that have tried to assess the impact of a whole project and then inform the design of subsequent projects.
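The analogy between nimble RCTs and A/B testing can be made concrete: both ultimately compare an outcome rate between a treatment and a control group. The sketch below, using entirely hypothetical take-up counts, runs the simple two-proportion z comparison that underlies both:

```python
import math

# Hypothetical A/B-style comparison: did a tweaked implementation improve
# programme take-up? Counts are illustrative only.
n_treat, success_treat = 500, 290   # treatment arm
n_ctrl, success_ctrl = 500, 250    # control arm

p_treat = success_treat / n_treat
p_ctrl = success_ctrl / n_ctrl

# Pooled proportion and standard error under the null of no difference.
p_pool = (success_treat + success_ctrl) / (n_treat + n_ctrl)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_treat + 1 / n_ctrl))

# z > ~1.96 suggests the implementation tweak made a real difference.
z = (p_treat - p_ctrl) / se
print(round(z, 2))
```

Whether it is called a nimble evaluation or an A/B test, the machinery is the same; what differs is the institutional context and what happens with the answer.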
Another recent development is the World Bank's Data Science competition, where participants are challenged to develop predictive models of household poverty status based on World Bank household survey data. The intention is that these should provide a cheaper means of identifying poor households than relying solely on what can be very expensive and time-consuming nationwide household surveys. At present the focus of the supporting website is very technical. As far as I can see, there is no discussion of how the winning prediction model will be used or how any risks of adverse effects might be monitored and managed. Yet, as I suggested at MERL Tech London, most algorithms used for prediction modelling will have errors. The propensity to generate false positives and false negatives is machine learning's equivalent of original sin. It is to be expected, so it should be planned for. Plans should include systematic monitoring of errors and a public policy for correction, redress and compensation.
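The error monitoring Rick calls for is not complicated in principle: compare the model's poverty labels against later-verified ground truth and routinely report the false-positive and false-negative rates. A minimal sketch, with hypothetical labels:

```python
# 1 = model flags a household as poor; verified later against ground truth.
# All values here are invented for illustration.
predicted = [1, 0, 1, 1, 0, 0, 1, 0]
actual = [1, 0, 0, 1, 1, 0, 1, 0]

# Confusion-matrix counts.
tp = sum(p == 1 and a == 1 for p, a in zip(predicted, actual))
fp = sum(p == 1 and a == 0 for p, a in zip(predicted, actual))
fn = sum(p == 0 and a == 1 for p, a in zip(predicted, actual))
tn = sum(p == 0 and a == 0 for p, a in zip(predicted, actual))

# False negatives are poor households the programme misses; false positives
# misdirect scarce resources. Both rates should feed a published policy
# for correction, redress and compensation.
false_positive_rate = fp / (fp + tn)
false_negative_rate = fn / (fn + tp)
print(false_positive_rate, false_negative_rate)
```

Tracking these two rates over time, and by region or subgroup, is the kind of systematic error monitoring that a deployed poverty-targeting model would need.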
Michael: These are both important points, and it is interesting to think what conclusions we can draw for the question before us. Concerning the important issue of algorithmic transparency (AT), Rick points out that a number of widely discussed books and articles have pointed out the risk that the lack of AT poses for democracy and particularly for poor and vulnerable groups. Virginia Eubanks, one of the authors cited by Rick, talks about the "digital poorhouse" and how unregulated algorithms can help perpetuate an underclass. However, I think we should examine more carefully how evaluators are contributing to this discussion. My impression, based on very limited evidence, is that evaluators are not at the center — or even perhaps the periphery — of this discussion. Much of the concern about these issues is being generated by journalists, public administration specialists or legal specialists. I argued in an earlier MERL Tech post that many evaluators are not very familiar with big data and data analytics and are often not very involved in these debates. This is a hypothesis that we hope readers can help us to test.
Rick’s second point, about the infiltration of data science into evaluation is obviously very central to our discussion. I would agree that the World Bank is one of the leaders in the promotion of data science, and the example of “nimble evaluation” may be a good example of convergence between data science and evaluation. However, there are other examples where the Bank is on the cutting edge of promoting new information technology, but where potential opportunities to integrate technology and evaluation do not seem to have been taken up. An example would be the Bank’s very interesting Big Data Innovation Challenge, which produced many exciting new applications of big data to development (e.g. climate smart agriculture, promoting financial inclusion, securing property rights through geospatial data, and mapping poverty through satellites). The use of data science to strengthen evaluation of the effectiveness of these interventions, however, was not mentioned as one of the objectives or outputs of this very exciting program.
It would also be interesting to explore to what extent the World Bank Data Science competition that Rick mentions resulted in the convergence of data science and evaluation, or whether it was simply testing new applications of data science.
Finally, I would like to mention two interesting chapters in Cybersociety, Big Data and Evaluation edited by Petersson and Breul (2017, Transaction Publications). One chapter (by Hojlund et al) reports on a survey which found that only 50% of professional evaluators claimed to be familiar with the basic concepts of big data, and only about 10% reported having used big data in an evaluation. In another chapter, Forss and Noren reviewed a sample of Terms of Reference (TOR) for evaluations conducted by different development agencies, where they found that none of the 25 TOR specifically required the evaluators to incorporate big data into their evaluation design.
It is difficult to find hard evidence on the extent to which evaluators are familiar with, sympathetic to, or using big data in their evaluations, but the examples mentioned above show that there are important open questions about the progress made towards the convergence of evaluation and big data.
We invite readers to share their experiences, both on how the two professions are starting to converge and on the challenges that slow down, or even constrain, the process of convergence.