FHI 360 Academy Hall, 8th Floor 1825 Connecticut Avenue NW Washington, DC 20009
We gathered at the first MERL Tech Conference in 2014 to discuss how technology was enabling the field of monitoring, evaluation, research and learning (MERL). Since then, rapid advances in technology and data have altered how most MERL practitioners conceive of and carry out their work. New media and ICTs have permeated the field to the point where most of us can’t imagine conducting MERL without the aid of digital devices and digital data.
The rosy picture of the digital data revolution and an expanded capacity for decision-making based on digital data and ICTs has been clouded, however, with legitimate questions about how new technologies, devices, and platforms — and the data they generate — can lead to unintended negative consequences or be used to harm individuals, groups and societies.
Join us in Washington, DC, on September 5-6 for this year’s MERL Tech Conference where we’ll be taking stock of changes in the space since 2014; showcasing promising technologies, ideas and case studies; sharing learning and challenges; debating ideas and approaches; and sketching out a vision for an ideal MERL future and the steps we need to take to get there.
Tech and traditional MERL: How is digital technology enabling us to do what we’ve always done, but better (consultation, design, community engagement, data collection and analysis, databases, feedback, knowledge management)? What case studies can be shared to help the wider sector learn and grow? What kinks do we still need to work out? What evidence base exists that can support us to identify good practices? What lessons have we learned? How can we share these lessons and/or skills with the wider community?
Data, data, and more data: How are new forms and sources of data allowing MERL practitioners to enhance their work? How are MERL practitioners using online platforms, big data, digitized administrative data, artificial intelligence, machine learning, sensors, drones? What does that mean for the ways that we conduct MERL and for who conducts MERL? What concerns are there about how these new forms and sources of data are being used and how can we address them? What evidence shows that these new forms and sources of data are improving MERL (or not improving MERL)? What good practices can inform how we use new forms and sources of data? What skills can be strengthened and shared with the wider MERL community to achieve more with data?
Emerging tools and approaches: What can we do now that we’ve never done before? What new tools and approaches are enabling MERL practitioners to go the extra mile? Is there a use case for blockchain? What about facial recognition and sentiment analysis in MERL? What are the capabilities of these tools and approaches? What early cases or evidence is there to indicate their promise? What ideas are taking shape that should be tried and tested in the sector? What skills can be shared to enable others to explore these tools and approaches? What are the ethical implications of some of these emerging technological capabilities?
The Future of MERL: Where should we be going and what should the future of MERL look like? What does the state of the sector, of digital data, of technology, and of the world in which we live mean for an ideal future for the MERL sector? Where do we need to build stronger bridges for improved MERL? How should we partner and with whom? Where should investments be taking place to enhance MERL practices, skills and capacities? How will we continue to improve local ownership, diversity, inclusion and ethics in technology-enabled MERL? What wider changes need to happen in the sector to enable responsible, effective, inclusive and modern MERL?
Cross-cutting themes include diversity, inclusion, ethics and responsible data, and bridge-building across disciplines.
You’ll join some of the brightest minds working on MERL across a wide range of disciplines – evaluators, development and humanitarian MERL practitioners, small and large non-profit organizations, government and foundations, data scientists and analysts, consulting firms and contractors, technology developers, and data ethicists – for 2 days of in-depth sharing and exploration of what’s been happening across this multidisciplinary field and where we should be heading.
There is no real evidence base about what does and does not work when applying blockchain technology to interventions seeking social impacts. Most current blockchain interventions are driven by developers (programmers) and visionary entrepreneurs. There is little thinking in current blockchain interventions around designing for “social” impact (there is an overabundant trust in technology to achieve the outcomes and little focus on the humans interacting with the technology) or around integrating relevant evidence from behavioral economics, behavior change design, human-centered design, etc.
To build the needed evidence base, Monitoring, Evaluation, Research and Learning (MERL) practitioners will have to get to know not only the broad strokes of blockchain technology but also the specifics of token design and tokenomics (the political economics of tokenized ecosystems). Token design could become the focal point for MERL on blockchain interventions since:
The vast majority of blockchain interventions, if not all, will involve some type of desired behavior change
The token provides the link between the ledger (which is the blockchain) and the social ecosystem created by the token in which the behavior change is meant to happen
Hence the token is the “nudge” meant to leverage behavior change in the social ecosystem while governing the transactions on the blockchain ledger.
(While this blog will focus on these points, it will not go into a full discussion of what tokens are and how they create ecosystems. But there are some very good resources out there that do this, which you can review at your leisure and to the degree that works for you. The Complexity Institute has published a book exploring the various attributes of complexity and main themes involved with tokenomics, while Outlier Ventures has published what I consider to be the best guidance on token design. The Outlier Ventures guidance contains many of the tools MERL practitioners will be familiar with (problem analysis, stakeholder mapping, etc.) and should be consulted.)
Hence it could be that by understanding token design and its requirements and mapping it against our current MERL thinking, tools and practices, we can develop new thinking and tools that could be the beginning point in building our much-needed evidence base.
What is a “blockchain intervention”?
As MERL practitioners we roughly define an “intervention” as a group of inputs and activities meant to leverage outcomes within a given eco-system. “Interventions” are what we are usually mandated to assess, evaluate and
When thinking about MERL and blockchain, it is useful to think of two categories of “blockchain interventions”.
1) Integrating the blockchain into MERL data collection, entry, management, analysis or dissemination practices and
2) MERL strategies for interventions using the blockchain in some way, shape or form.
Here we will focus on #2 and, in so doing, demonstrate that while the blockchain is an innovative, potentially disruptive technology, evaluating its applications on social outcomes is still an issue of assessing behavior change against dimensions of intervention design.
Designing for Behavior Change
We generally design interventions (programs, projects, activities) to “nudge” a certain type of behavior (stated as outcomes in a theory of change) amongst a certain population (beneficiaries, stakeholders, etc.). We often attempt to integrate mechanisms of change into our intervention design, but often do not for a variety of reasons (lack of understanding, lack of resources, lack of political will, etc.). This lack of due diligence in design is partly responsible for the lack of evidence around what works and what does not work in our current universe of interventions.
Enter blockchain technology, which we, as MERL practitioners, will be responsible for assessing in the foreseeable future. Hence, we will need to determine how interventions using the blockchain attempt to nudge behavior, what behaviors they seek to nudge, amongst whom, when, and how well the design of the intervention accomplishes these functions. In order to do that we will need to better understand how blockchains use tokens to nudge behavior.
The Centrality of the Token
We have all used tokens before. Stores issue coupons that can only be used at those stores, we get receipts for groceries as soon as we pay, and arcades make you buy tokens instead of just using quarters. The coupons and arcade tokens can be considered utility tokens, meaning that they can only be used in a specific “ecosystem”, which in this case is a store and an arcade respectively. The grocery store receipt is a token because it demonstrates ownership: if you are stopped on the way out of the store and you show your receipt, you are demonstrating that you now have rights of ownership over the foodstuffs in your bag.
Whether you realize it or not at the time, these tokens are trying to nudge your behavior. The store gives you the coupon because the more time you spend in their store trying to redeem coupons, the greater the likelihood you will spend additional money there. The grocery store wants you to pay for all your groceries while the arcade wants you to buy more tokens than you end up
If needed, we could design MERL strategies to assess how well these different tokens nudged the desired behaviors. We would do this, in part, by thinking about how each token is designed relative to the behavior it wants (i.e. the value, frequency and duration of coupons, etc.).
Thinking about these ecosystems and their respective tokens will help us understand the interdependence between 1) the blockchain as a ledger that records transactions, 2) the token that captures the governance structures for how transactions are stored on the blockchain ledger as well as the incentive models for 3) the mechanisms of change in the social eco-system created by the token.
Figure #1: The inter-relationship between the blockchain (ledger), token and social eco-system
Token Design as Intervention Design
Just as we assess theories of change and their mechanisms against intervention design, we will assess blockchain-based interventions against their token design in much the same way. This is because blockchain tokens capture all the design dimensions of an intervention, namely the problem to be solved, stakeholders and how they influence the problem (and thus the solution), stakeholder attributes (as mapped out in something like a stakeholder analysis), the beneficiary population, assumptions/risks, etc.
Outlier Ventures has adapted what they call a Token Utility Canvas as a milestone in their token design process. The canvas can be correlated to the various dimensions of an evaluability assessment tool (I am using the evaluability assessment tool as a demonstration of the necessary dimensions of an intervention’s design, meaning that the evaluability assessment tool assesses the health of all the components of an intervention design). The Token Utility Canvas is a useful milestone in the token design process that captures many of the problem diagnostic, stakeholder assessment and other due diligence tools that are familiar to MERL practitioners who have seen them used in intervention design. Hence token design could be largely thought of as intervention design and evaluated as such.
Comparing Token Design with Dimensions of Program Design (as represented in an
This table is not meant to be exhaustive, and not all of the fields will be explained here, but in general it could be a useful starting point in developing our own thinking and tools for this emerging space.
The Token as a Tool for Behavior Change
Coming up with a taxonomy of blockchain interventions and relevant tokens is a necessary task, but all blockchains that need to nudge behavior will have to have a token.
Consider supply chain management. Blockchains are increasingly being used as the ledger system for supply chain management. Supply chains are typically comprised of numerous actors packaging, shipping, receiving, applying quality control protocols to various goods, all with their own ledgers of the relevant goods as they snake their way through the supply chain. This leads to ample opportunities for fraud, theft and high costs associated with reconciling the different ledgers of the different actors at different points in the supply chain. Using the blockchain as the common ledger system, many of these costs are diminished as a single ledger is used with trusted data, hence transactions (shipping, receiving, repackaging, etc.) can happen more seamlessly and reconciliation costs drop.
However, even in “simple” applications such as this there are behavior change implications. We still want the supply chain actors to perform their functions in a manner that adds value to the supply chain ecosystem as a whole, rewarding them for good behavior within the ecosystem and punishing them for bad.
What if those shippers trying to pass on a faulty product had already deposited a certain value of currency in an escrow account (housed in a smart contract on the blockchain)? That would mean that if they are found to be attempting a prohibited behavior (passing on faulty products), they automatically surrender a certain amount from the escrow account in the blockchain smart contract. How much should be deposited in the escrow account? What is the ratio between the degree of punishment and the undesired action? These are behavior questions around a mechanism of change that are dimensions of current intervention designs and will be increasingly relevant in token design.
The point of this is to demonstrate that even “benign” applications of the blockchain, like supply chain management, have behavior change implications and thus require good due diligence in token design.
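To make the escrow questions above more concrete, here is a minimal sketch of the penalty logic in plain Python. It is not smart contract code and does not reflect any particular blockchain platform. The class name, the function name, the 1,000-unit deposit and the 25% penalty ratio are all hypothetical, chosen only to show where the design decisions (deposit size, ratio of punishment to undesired action) would sit.

```python
from dataclasses import dataclass


@dataclass
class EscrowAccount:
    """Toy stand-in for a shipper's deposit held in a smart contract (hypothetical)."""
    shipper: str
    balance: float

    def apply_penalty(self, penalty_ratio: float) -> float:
        """Forfeit a share of the current balance and return the amount forfeited."""
        if not 0.0 <= penalty_ratio <= 1.0:
            raise ValueError("penalty_ratio must be between 0 and 1")
        forfeited = self.balance * penalty_ratio
        self.balance -= forfeited
        return forfeited


def settle_shipment(account: EscrowAccount, product_is_faulty: bool,
                    penalty_ratio: float = 0.25) -> float:
    """Apply the penalty only when the shipment has been flagged as faulty.

    Who or what sets product_is_faulty (a sensor, an inspector, a validator)
    is deliberately left outside this sketch.
    """
    if product_is_faulty:
        return account.apply_penalty(penalty_ratio)
    return 0.0


# Assumed values for illustration only: a 1,000-unit deposit and a 25% penalty.
account = EscrowAccount(shipper="shipper-A", balance=1000.0)
forfeited = settle_shipment(account, product_is_faulty=True)
print(forfeited, account.balance)
```

The deposit size and `penalty_ratio` are exactly the kind of parameters a MERL strategy could help calibrate, since they determine how strongly the token discourages the undesired behavior.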
There is a lot that could be said about the validation function of this process: who validates that the bad behavior has taken place and should be punished, or that good behavior should be rewarded? There are lessons to be learned from results-based contracting and the role of the validator in such a contracting vehicle. This “validating” function will need to be thought out in terms of what can be automated and what needs a “human touch” (and who is responsible, what methods they should use,
Implications for MERL
If tokens are fundamental to MERL strategies for blockchain interventions, there are several critical implications:
MERL practitioners will need to be heavily integrated into the due diligence processes and tools for token design
MERL strategies will need to be highly formative, if not developmental, in facilitating the timeliness and overall effectiveness of the feedback loops informing token design
New thinking and tools will need to be developed to assess the relationships between blockchain governance, token design and mechanisms of change in the resulting social ecosystem.
The opportunity cost for impact and “learning” could go up the less MERL practitioners are integrated into the due diligence of token design. This is because the costs to adapt token design are relatively low compared to current social interventions, partly due to the ability to integrate automated feedback.
Blockchain-based interventions present us with significant learning opportunities due to our ability to use the technology itself as a data collection/management tool in learning about what does and does not work. Feedback from an appropriate MERL strategy could inform decision making around token design that could be coded into the token on an iterative basis. For example, as stakeholders’ incentives shift (i.e. supply chain shippers incur new costs and their value proposition changes), token adaptation can respond in a timely fashion so long as the MERL feedback that informs the token design is accurate.
There is a need to determine what components of these feedback loops can be completed by automated functions and what requires a “human touch”. For example, what dimensions of token design can be informed by smart infrastructure (i.e. temp gauges on shipping containers in the supply chain) versus household surveys completed by enumerators? This will be a task to complete and iteratively improve, starting with initial token design and lasting through the lifecycle of the intervention.
Token design dimensions, outlined in the Token Utility Canvas, and decision-making will need to result in MERL questions that are correlated to the best strategy to answer them, automated or human, much the same as we do now in current
Many of our current due diligence tools used in both intervention and evaluation design (things like stakeholder mapping, problem analysis, cost benefit analysis, value propositions, etc.) will need to be adapted to the types of relationships that exist within tokenized eco-systems. These include the relationships of influence between the social eco-system and the blockchain ledger itself (or more specifically the governance of that ledger), as demonstrated in figure #1.
This could be our biggest priority as MERL practitioners. While blockchain interventions could create incredible opportunities for social experimentation, the need for human-centered due diligence (incentivizing humans for positive behavior change) in token design is critical. Over-reliance on the technology to drive social outcomes is already a well-evidenced opportunity cost that could be avoided with blockchain-based solutions if the gap between technologists, social scientists and practitioners can be bridged.
Guest post from Haanim Galvaan, Content Designer at Every1Mobile
A phone is no longer just a phone. It’s your connection to the rest of the world, it’s your personal assistant, and now, it’s your best friend who gives you encouragement and reinforcement for your good habits.
At least that’s what mobile phones have become for those who make use of habit-boosting apps.
If you’re trying to quit smoking and want to build a streak of puff-free days, the HabitBull app can help you do that. Want to establish a habit in your team that makes use of social accountability? Try Habitica. Do you want positive reinforcement for your activities in a motivational, rewarding voice? Productive is the app for that.
But what if you’re a young mum, living in the urban slums of Nairobi and you want to improve the health and wellbeing of your children? Try U Afya’s 10-Day Challenge.
U Afya is an online community for young mothers and mothers-to-be to learn about topics related to health, hygiene and family life. The site takes a holistic approach to giving young mothers the knowledge and confidence they need to enact certain healthy behaviours. It’s a place to discuss, give and receive advice, take free online courses, and now, to establish good habits with a custom-built habit tracking tool.
The 10-Day Handwashing Challenge was launched using new habit-tracking functionality. Users were encouraged to perform an activity related to handwashing each day, e.g. wash your hands with soap for 20 seconds. The challenges were formulated around the Lifebuoy “5 Key Moments” model. Participants were required to log their activity on the site by completing a survey.
Each day the site fed users a different hygiene-related tip, as well as links to additional content. At the end of the challenge, users were pushed to take a pledge and make a commitment to handwashing.
U Afya’s Habit Tracker is different from other habit-boosting apps in that it is not an app! It has been built onto a low-data usage site that has been optimised for the data-sensitive target audience in the Nairobi slums. The tracker provides a rich, visual experience, which makes use of simple functionality compatible with both feature phones and smartphones.
We created a sense of urgency.
Users were required to log their activity for 10 days within a 30-day period. Attaching a “deadline” added a measure of urgency to the activity. There is no space for procrastination. The message is: establish your habit now or you never will!
It is based on behaviour change levers.
The 10-Day Handwashing Challenge and its accompanying content around the site were all based on the behaviour change approach employed by Lifebuoy in Way of Life, namely Awareness, Commitment, Reinforcement and Reward.
The approach was executed in the following ways:
Awareness: Introducing the handwashing theme with engaging, educational content that linked to and from the 10-Day Handwashing Challenge:
Diseases caused by lack of handwashing (article)
5 Tips for washing your hands correctly (article)
Global Handwashing Day! – The 5 Key times to wash our hands (article)
How much do you know about handwashing? (quiz)
Commitment: Encouraging users to take the Handwashing Pledge
Reinforcement: Habit tracker, come back to self-report your daily activity
Reward: Participants stood the chance to win a hygiene gift bag
Contents of the hygiene gift bag given to 5 winners.
86 users started the challenge and 26 users completed it within the 30-day challenge period. That makes a completion rate of 30% overall. Considering that users had to return to the challenge 10 times, the response rate is quite high.
The biggest drop-off happened between Day 1 and Day 2, with 28 users falling away; drop-off rates then decreased gradually throughout the 10 days. The graph below shows that most users who made it to day 5 ended up completing the challenge. Only 11 users dropped off between Day 5 and Day 10.
26 out of 86 users created a habit.
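The figures reported above can be checked with a few lines of arithmetic. This is a sketch under two assumptions that are not stated in the results: that all 86 starters logged Day 1, and that "completing" means reaching Day 10. Under those assumptions, the drop-off between Day 2 and Day 5 and the number of users reaching Day 5 follow from the reported numbers.

```python
# Figures reported in the post.
started = 86
completed = 26
dropped_day1_to_day2 = 28
dropped_day5_to_day10 = 11

# Overall completion rate.
completion_rate = completed / started

# Derived values (assumption: every non-completer dropped off exactly once
# somewhere between Day 1 and Day 10).
total_dropped = started - completed
dropped_day2_to_day5 = total_dropped - dropped_day1_to_day2 - dropped_day5_to_day10
reached_day5 = completed + dropped_day5_to_day10
completion_rate_from_day5 = completed / reached_day5

print(f"Completion rate: {completion_rate:.0%}")
print(f"Dropped between Day 2 and Day 5: {dropped_day2_to_day5}")
print(f"Reached Day 5: {reached_day5}")
print(f"Completed, of those who reached Day 5: {completion_rate_from_day5:.0%}")
```

On these assumptions, 37 users reached Day 5 and roughly 70% of them went on to finish, which is consistent with the observation that most users who made it to Day 5 completed the challenge.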
In addition to participation data, further feedback was gathered by interspersing survey questions into the challenge. This additional questioning determined that 91% of challenge-takers feel they can afford to buy soap for their families.
Users had overwhelmingly positive feedback about the challenge.
“It was so educating and hygienically I have improved. It’s now a routine to me, washing hands in any case”
Keep it simple
It’s not always necessary to create a fancy app to push a new activity. The U Afya 10-Day Challenge was built on a platform that users are already familiar with. By building it into their current environment, it offered them something new and exciting on their visit.
Users were required to do one thing each day and report it with one action i.e. taking a single-question survey. Requiring minimal effort from your users can maximise uptake.
Overall the approach was simplicity. Simplicity in the design of the functionality, simplicity in the daily action and simplicity in creating a habit.
With this approach the U Afya 10-Day Handwashing Challenge helped 26 young mothers to create a new habit of washing their hands every day at key moments.
This first iteration of U Afya’s 10-Day Handwashing Challenge was a pilot, but the results suggest that it is possible to use low-cost, low-tech means to encourage habit formation. It is also possible for sophisticated behaviour change theory and practice to reach some of the most vulnerable groups, using the very phones they have in their hands.
It is also a useful tool to help us to understand the impact of our behaviour change campaigns in the real world.
All the user feedback and learnings mentioned above will be analysed to understand how the approach can be strengthened to reach even more people, increase compliance and encourage positive habit creation.
In many developing country environments, it is difficult or impossible to obtain recent, reliable estimates of human development. Nationally representative household surveys, which are the standard instrument for determining development policy and priorities, are typically too expensive to collect with any regularity.
Recently, however, researchers have shown the potential for remote sensing technologies to provide a possible solution to this data constraint. In particular, recent work indicates that satellite imagery can be processed with deep neural networks to accurately estimate the sub-regional distribution of wealth in sub-Saharan Africa.
Testing Neural Networks to Process Satellite Imagery
In the paper, Can Human Development be Measured with Satellite Imagery?, Andrew Head, Mélanie Manguin, Nhat Tran, and Joshua Blumenstock explore the extent to which the same approach – of using convolutional neural networks to process satellite imagery – can be used to measure a broader set of human development indicators, in a broader range of geographic contexts.
Their analysis produces three main results:
They successfully replicate prior work showing that satellite images can accurately infer a wealth-based index of poverty in sub-Saharan Africa.
They show that this approach can generalize to predicting poverty in other countries and continents, but that the performance is sensitive to the hyperparameters used to tune the learning algorithm.
They find that this approach does not trivially generalize to predicting other measures of development such as educational attainment, access to drinking water, and a variety of health-related indicators.
This paper shows that while satellite imagery and machine learning may provide a powerful paradigm for estimating the wealth of small regions in sub-Saharan Africa, the same approach does not trivially generalize to other geographical contexts or to other measures of human development.
In this assessment, it is important to emphasize what they mean by “trivially,” because in truth the point they are making is somewhat circumspect. Specifically, what they have shown is that the exact framework—of retraining a deep neural network on night-lights data, and then using those features to predict the wealth of small regions in sub-Saharan Africa—cannot be directly applied to predicting arbitrary indicators in any country with uniformly good results.
This is an important point to make because absent empirical evidence to the contrary, it is likely that policymakers eager to gain quick access to micro-regional measurements of development might be tempted to do exactly what they have done in this paper, without paying careful attention to the thorny issues of generalizability that they have uncovered in this analysis.
It is not the researchers’ intent to impugn the potential for related approaches to provide important new methods for measuring development, but rather to say that such efforts should proceed with caution, and with careful validation.
Why Satellite Imagery Might Fail to Predict Development
The results showed that while some indicators like wealth and education can be predicted reasonably well in many countries, other development indicators are much more brittle, exhibiting high variance between and within countries, and others perform poorly everywhere.
Thus it is useful to distinguish between two possible reasons why the current approach may have failed to generalize to these measures of development.
It may be that this exercise is fundamentally not possible, and that no amount of additional work would yield qualitatively different results.
It is quite possible that their investigation to date has not been sufficiently thorough, and that more concerted efforts could significantly improve the performance of these models.
Insufficient “signal” in the satellite imagery.
The researchers’ overarching goal is to use information in satellite images to measure different aspects of human development. The premise of such an approach is that the original satellite imagery must contain useful information about the development indicator of interest. Absent such a signal, no matter how sophisticated our computational model, the model is destined to fail.
The fact that wealth specifically can be measured from satellite imagery is quite intuitive. For instance, there are visual features one might expect to correlate with wealth: large buildings, metal roofs, nicely paved roads, and so forth.
It may be the case that other measures of human development cannot be seen from above. For instance, it may be a fundamentally difficult task to infer the prevalence of malnutrition from satellite imagery, if the regions with high and low rates of malnutrition appear similar, even though they hypothesize that these indices should correlate with the wealth index.
They were, however, surprised by the relative under-performance of models designed to predict access to drinking water, as they expected the satellite-based features to capture proximity to bodies of water, which in turn might affect access to drinking water.
(Over-) reliance on night-lights may not generalize.
Their reliance on night lights might help explain why some indicators were predicted less successfully in some countries than others. An example in their study includes Nepal, where the accuracy in predicting access to electricity was much lower (R² = 0.24) than in the other countries (R² = 0.69, 0.44, and 0.54 in Rwanda, Nigeria, and Haiti, respectively).
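For readers less familiar with the R² figures quoted above, the following is a minimal sketch of how a coefficient of determination is commonly computed from observed and predicted values. The function and the toy numbers are illustrative assumptions only. They are not taken from the paper, and the paper's exact evaluation procedure may differ.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values have no variance, so R-squared is undefined")
    return 1 - ss_res / ss_tot


# Made-up example: share of households with electricity in four regions
# (survey values) versus a model's predictions for the same regions.
survey_values = [0.10, 0.35, 0.60, 0.90]
model_predictions = [0.20, 0.30, 0.55, 0.80]
print(round(r_squared(survey_values, model_predictions), 2))
```

A value near 1 means the predictions track the survey values closely, a value near 0 means the model does little better than predicting the average everywhere.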
This may be partly due to the fact that Nepal has a very low population density (half as dense as Haiti and Rwanda) and very high levels of electrification (twice as high as Haiti, Rwanda, and Nigeria).
If the links between electrification, night-lights, and daytime imagery are broken in Nepal, they would expect their modeling approach to fail. More generally, they expect that when a development indicator does not clearly relate to the presence of nighttime lights, it may be unreasonable to expect good performance from the transfer learning process as a whole.
Deep learning vs. supervised feature engineering.
In this paper, the researchers focused explicitly on using the deep/transfer learning approach to extracting information from satellite images. While powerful, it is also possible that other approaches to feature engineering might be more successful than the brute force approach of the convolutional neural network.
For instance, Gros and Tiecke have recently shown how hand-labeled features from satellites, and specifically information about the types of buildings that are present in each image, can be quite effective in predicting population density. Labeling images in this manner is resource intensive, and they did not have the opportunity to test such approaches.
However, they believe that careful encoding of the relevant information from satellite imagery would likely bolster the performance of specific prediction tasks.
Neural Networks Can Still Process Satellite Imagery
Broadly, the researchers remain optimistic that future work using novel sources of data and new computational algorithms can engender significant advances in the measurement of human development.
However, it is imperative that such work proceeds carefully, with appropriate benchmarking and external calibration. Promising new tools for measurement have the potential to be implemented widely, possibly by individuals who do not have extensive expertise in the underlying algorithms.
Applied blindly, these algorithms have the potential to skew subsequent policy in unpredictable and undesirable ways. They view the results of this study as a cautionary example of how a promising algorithm should not be expected to work “off the shelf” in a context that is significantly different from the one in which it was originally developed.
Users often ask us, “what response rate will I get from my survey?”, or “how can I increase my survey’s response rate?”
The truth is… it depends!
Response rates depend on your organisation, your respondents, and their motivation for responding. Most of our users assume that financial incentives are the most effective for stimulating engagement, and indeed research shows they can enhance response rates. But they are not always necessary and rarely sufficient. The design of your survey — its structure, tone and content — is equally important and often ignored.
In a recent SMS survey conducted for the third time on behalf of a UN agency and government ministry, Echo’s Deployment team demonstrated that minor adjustments to survey design can drastically increase response rates, regardless of financial incentives.
In May 2017, the team sent a survey with a KES 35 airtime incentive to 25,000 Kenyan government employees, 21% of whom completed it. In October 2017, Deployment sent the same survey to the same group with the same airtime incentive. This time only 16% completed it. In February 2018, we sent the survey again, with minor design tweaks and no financial incentives. The completion rate nearly doubled to 29%.
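To put the three waves side by side, the reported percentages can be turned into approximate counts. This is only a back-of-the-envelope sketch: it assumes all three waves reached the same 25,000 recipients and that the rounded percentages above are exact. The function and variable names are illustrative.

```python
# Completion rates reported in the text for the three survey waves.
RECIPIENTS = 25_000  # assumption: the same group size for every wave

waves = {
    "May 2017 (airtime incentive)": 0.21,
    "Oct 2017 (airtime incentive)": 0.16,
    "Feb 2018 (redesign, no incentive)": 0.29,
}


def completions(rate, recipients=RECIPIENTS):
    """Approximate number of completed surveys for a given completion rate."""
    return round(rate * recipients)


def relative_change(old, new):
    """Change from old to new, expressed as a fraction of old."""
    return (new - old) / old


for label, rate in waves.items():
    print(f"{label}: ~{completions(rate):,} completions")

change = relative_change(0.16, 0.29)
print(f"Oct 2017 to Feb 2018: {change:+.0%} relative change")
```

On these assumptions the redesigned survey gathered roughly 3,250 more completions than the October 2017 wave, a relative increase of about 81%, which is what "nearly doubled" refers to.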
Win-win! Our client saved money by dropping the airtime transfers and got more results. More of their beneficiaries were able to engage and provide critical feedback. Here are the design changes we made to the survey. Consider them next time you’re using Echo for Monitoring and Evaluation (M&E):
Personalize the content
The Echo Platform allows users to personalize messages using standard fields — basic, common data points like name, ID and location, which can be stored in Echo contact profiles and can be integrated into large-scale messages.
Unlike in 2017, in the 2018 version of the UN survey, our Deployment team added the NAME field to the first SMS. As a result, all recipients immediately saw their name before automatically progressing to the first question. This builds a sense of trust, captures recipients’ attention, and makes the message less likely to be mistaken for spam.
And you don’t need to stick to standard fields! Any prior response to a survey can be stored as a custom field. If you ask recipients their favorite football team and store the response as a custom field, the next time you send them an SMS you can personalize your content even further: “Hi [NAME]. Hope [FOOTBALL_TEAM] is doing well this week….”
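The idea behind field-based personalization can be sketched in a few lines. This is not the Echo Platform's implementation: the contact record, the `personalize` helper and the choice to leave unknown placeholders untouched are all assumptions made for illustration. Only the `[FIELD]` bracket style is taken from the example above.

```python
import re

# Hypothetical contact profile: NAME is a standard field,
# FOOTBALL_TEAM a custom field stored from an earlier survey response.
contact = {"NAME": "Amina", "FOOTBALL_TEAM": "Gor Mahia"}

# Matches placeholders written as [FIELD_NAME].
PLACEHOLDER = re.compile(r"\[([A-Z_]+)\]")


def personalize(template, fields):
    """Replace each [FIELD] placeholder with the contact's stored value.

    Placeholders without a stored value are left as they are, so a
    missing field is easy to spot before the message is sent.
    """
    def substitute(match):
        return str(fields.get(match.group(1), match.group(0)))

    return PLACEHOLDER.sub(substitute, template)


message = personalize(
    "Hi [NAME]. Hope [FOOTBALL_TEAM] is doing well this week.", contact
)
print(message)
```

A real deployment would also want to check the length of the personalized text against the SMS character limit, since a long name or team can push a message over it.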
Skip the “opt-in”
The Echo platform’s survey builder allows you to add an invitation message as the first SMS sent to a contact. To move from this intro message on to the first question, recipients must “opt-in” by responding to this initial message with something like “ok” or “begin” (any word/number will do).
Sample survey designs, before optimisation.
Invitation messages are extremely useful. They help you be polite, introduce yourself if the recipient doesn’t know you, and say what your survey is about and why and how they can proceed (more below on instructions!). But they can also create a barrier to completion.
Having observed that many respondents failed to opt in to our 2017 surveys, we dropped the invitation message from the 2018 version. Instead, we took that content and sent it as an info question, which, by design, automatically progresses to the next question whether or not the recipient responds.
Optimised survey: personalised, does not require the respondent to opt in, and has clear instructions on how to reply.
Removing the opt-in invitation message won’t always be an option, but in this case, respondents were employees of our client and had been engaging on their shortcode for years. In some ways the intro message just added an extra step for them, as they had already provided their phone numbers and given consent to allow our client to engage them. Personally Identifiable Information (PII) is neither collected nor shared, and respondents can unsubscribe entirely from our system by sending the word STOP at any time, an option that has been communicated to them repeatedly.
In other cases, users might be suspicious of the opt-in request. Many Kenyans have encountered premium SMS services that push messages to unknowing respondents and deduct airtime from them once they opt in. Messaging with Echo is totally free for your respondents, but consider how they might react to an opt-in intro message, and design your survey accordingly!
Give clear instructions
Keeping in mind the SMS character limit, our Deployment team added quick instructions at the end of each question in the 2018 survey. These guided the respondents on how to answer specific question types. In the prior 2017 versions, each SMS contained only the question, without instructions on how to answer.
For the 2017 surveys, we automated a reminder, sent 24 hours after the survey to those who had not yet started or completed it. For the 2018 version we added a second reminder, sent 12 hours later.
Reminders like these nudge contacts who are willing to respond to the survey but may have become distracted before completing it. This is especially true for long surveys like the one we have been deploying for the UN, which risk respondent fatigue. Reminders are a subtle way of urging them to finish the survey. Better yet — keep it lean!
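The reminder timing described above is simple to express in code. The sketch below is not how the Echo Platform schedules messages. It assumes the first reminder is measured from the moment the survey is sent and that "12 hours later" means 12 hours after the first reminder. The function names and the example send time are hypothetical.

```python
from datetime import datetime, timedelta

# Offsets taken from the text; the interpretation of the second one is an assumption.
FIRST_REMINDER = timedelta(hours=24)   # after the survey is sent
SECOND_REMINDER = timedelta(hours=12)  # after the first reminder


def reminder_times(invited_at):
    """Return the two reminder timestamps for a survey invitation."""
    first = invited_at + FIRST_REMINDER
    return [first, first + SECOND_REMINDER]


def due_reminders(invited_at, completed, now):
    """Reminders that should already have gone out to a contact.

    Contacts who have completed the survey get no reminders.
    """
    if completed:
        return []
    return [t for t in reminder_times(invited_at) if t <= now]


invited = datetime(2018, 2, 5, 9, 0)  # hypothetical send time
for when in reminder_times(invited):
    print(when.isoformat())
```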
So, what’s the takeaway here?
While research on the potential impact of financial incentives is clear, no amount of money or airtime can make up for suboptimal survey design!
Monetary rewards can move the response rate at the margins, but not always, and only if you get the design right first. Financial incentives are complementary to a well-designed survey that has useful and clear content, an efficient structure, and a personal tone.
That said, non-financial incentives — the broader reasons why your contacts might want to engage with you at all — are an extremely important consideration. Not everyone’s time and information can be bought.
Consider for your next survey or engagement what informational, relational, or emotional incentives you might be explicitly or implicitly offering up front. As with any relationship, both sides ultimately need to feel like there is some benefit to the commitment. We’ll blog more about this idea soon!
Want to learn more from the Echo Deployment team? We consult on mobile engagement strategy and techniques, and can provide implementation support for survey creation, setup, optimization, deployment, and tracking on the Echo Platform.
Guest post by Michael Cooper, a former DoS, MCC Associate Director for Policy and Evaluation who now runs Emergence. Mike advises numerous donors, private clients and foundations on program design, MEL, adaptive management and other analytical functions.
International development projects using the blockchain in some way are increasing at a rapid rate, and our window for developing evidence around what does and does not work (and, more importantly, why) is narrow before we run into unintended consequences. Given that blockchain is a highly disruptive technology, these unintended consequences could be significant, creating a higher urgency to generate the evidence to guide how we design and evaluate blockchain applications.
To inform this discussion, Emergence has put out a working paper that outlines (1) what the blockchain is, (2) how it can be used to leverage behavior change outcomes in international development projects and (3) the implications for how we could design and evaluate blockchain-based interventions. The paper utilizes systems and behaviorism principles in comparing how we currently design behavior change interventions to how we could design and evaluate the same interventions using the blockchain. This article summarizes the main points of the paper and its conclusions to generate discussion around how best to produce the evidence we need to fully realize the potential of blockchain interventions for social impact.
Given the scope of possibilities surrounding the blockchain, both in how it could be used and in the impact it could leverage, the implications for how MEL is conducted are significant. The time is long gone when value-adding MEL practitioners were not involved in intervention design. Blockchain-based interventions will require additional integration of MEL skill sets in the early design phases since so much will need to be “tested” to determine what is and is not working. While rigid statistical evaluations will be needed for some of these blockchain-based interventions, the level of complexity involved and the lack of an evidence base indicate that more flexible, adaptive and formative MEL approaches will be needed. The more these approaches are proactive and involved in intervention design, the more frequent and informative the feedback loops will be into our evidence base.
The Blockchain as a Decentralizing
At its core, the blockchain is just a ledger, but the importance of ledgers in how society functions cannot be overstated. Ledgers, and the control of them, are crucial in how supply chains are managed, how financial transactions are conducted, how data is shared, etc. Control of ledgers is a primary factor in limiting access to life-changing goods and services, especially for the world’s poor. In part, the discussion over decentralization is essentially a discussion over who owns ledgers and how they are managed.
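For readers who have not seen one, a toy version of the "just a ledger" idea may help. The sketch below is an illustration only, not taken from the working paper: it shows a single hash-linked ledger in which each entry commits to the one before it, so that altering an old record becomes detectable. It leaves out everything that makes a real blockchain decentralized (multiple copies, consensus rules, signatures), and all names and transactions are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(entry):
    """SHA-256 over a canonical JSON encoding of the entry's contents."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append(ledger, transaction):
    """Append a transaction, linking it to the hash of the previous entry."""
    previous = ledger[-1]["hash"] if ledger else GENESIS
    entry = {
        "index": len(ledger),
        "transaction": transaction,
        "previous_hash": previous,
    }
    entry["hash"] = entry_hash(entry)  # hash covers everything except itself
    ledger.append(entry)


def is_valid(ledger):
    """Check that every entry's hash and back-link are intact."""
    previous = GENESIS
    for entry in ledger:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["previous_hash"] != previous or entry["hash"] != entry_hash(body):
            return False
        previous = entry["hash"]
    return True


ledger = []
append(ledger, {"from": "cooperative", "to": "farmer_17", "amount": 40})
append(ledger, {"from": "farmer_17", "to": "supplier", "amount": 15})
print(is_valid(ledger))  # True

ledger[0]["transaction"]["amount"] = 400  # tamper with an old record
print(is_valid(ledger))  # False
```

The point for the discussion above is that nobody needs to be trusted as the keeper of the ledger for tampering to be noticed, which is what opens up the question of who controls it.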
Decentralization has been a prominent theme in international development and there is strong evidence of its positive impact across various sectors, especially regarding local service delivery. One of the primary value adds of decentralization is empowering those further from traditional concentrations of power to have more authority over the problems that impact them. As a decentralizing technology, the blockchain holds a lot of potential in reaching these same impacts from decentralization (empowerment, etc.) in a more efficient and effective manner, partly due to its ability to better align interests around common problems. With better aligned interests, fewer resources (inputs) are needed to try and facilitate a desired behavior change.
Up until now, efforts of international development actors have focused on “nudging” behavior change amongst stakeholders and, in very rare cases such as results-based financing, on giving loosely defined parameters to implementers with less emphasis on the manner in which outcomes are achieved. Both of these approaches are relevant in the design and testing of blockchain-based interventions, but they will be integrated in unique new ways that will require new thinking and skill sets amongst practitioners.
Current Designing and Evaluating for Behavior Change
MEL usually starts with the relevant theory of change, namely what mechanisms bring about targeted behavior change and how. Recent years have seen a focus on how behavior change is achieved through an understanding of mindsets and how they can be nudged to achieve a social outcome. However, the international development space has recognized the limitations of designing interventions that attempt to nudge behavior change. These limitations center around the level of complexity involved, the inability to recognize and manage this complexity and lack of awareness about the root causes of problems.
Hence the rise in things like results-based financing, where the type of prescribed top-down causal pathway (usually laid out in a theory of change) is not as heavily emphasized as in more traditional interventions. Donors using this approach can still mandate certain principles of implementation (such as the inclusion of vulnerable populations, environmental safeguards, timelines, etc.) but there is much more flexibility to create a causal pathway to achieve the outcome.
Or, for example, take the popular Problem Driven Iterative Adaptation (PDIA) approach, where the focus is on iteratively identifying and solving problems encountered on the pathway to reform. These efforts do not start with a mandated theory of change, but instead start with generally described targeted outcomes, and then the pathway to those outcomes is iteratively created, similar to what Lant Pritchett has called “crawling the design space”. Such an approach has large overlaps with adaptive management practices and other more integrative MEL frameworks and could lend itself to how blockchain-based interventions are designed, implemented and evaluated.
How the Blockchain Could Achieve Outcomes and Implications for MEL
Because of its decentralizing effects, any theory of change for a blockchain-based intervention could include some possible common attributes that influence how outcomes are achieved:
Empowerment of those closest to problems to inform the relevant solutions
Alignment of interests
Alleviation of traditional intermediary services and relevant third party actors
Assessing these three attributes, and how they influence outcomes, could be the foundation of any appropriate MEL strategy for a blockchain-based intervention. This is because these attributes are the “value add” of a blockchain-based intervention. For example, traditional financial inclusion interventions may seek to extend financial services of a bank to rural areas through digital money, extension agents, etc. A blockchain-based solution, however, may cut out the bank entirely and empower local communities to receive financial services from completely new providers from anywhere in the world on much more affordable terms and in a much more convenient manner. Such a solution could see an alignment of interests amongst producers and consumers of these services since the new relationships are mutually serving.
Because of this alignment there is less of a need for, or even less of a benefit from, having donors script out the causal pathway for the outcomes to be achieved. Because of this alignment of interests, those closest to the problem(s) and solutions can work it out because it is in their interest to do so.
Hence, while a MEL framework for such a project could still use more standardized measures around outcomes like increased access to financial services, and could even use statistical methods to evaluate questions around attributable changes in poverty status, there will need to be adaptive and formative MEL that assesses the dynamics of these attributes, given their criticality to whether and how outcomes could be achieved. The dynamics between these attributes and the surrounding social ecosystem have the potential to be very fluid (going back to the disruptive nature of blockchain technology), hence flexible MEL will be required to respond to new trends as they emerge.
Table: Blockchain Intervention Attributes and the Skill Sets to Assess Them
Attribute: Empowerment of those closest to problems to inform the relevant solutions
Skill sets: Problem driven design and MEL approach, stakeholder mapping (to identify relevant actors); decentralization focused MEL (MEL that focuses on outcomes associated with decentralization)
Attribute: Alignment of interests
Skill sets: Political economy analysis to identify incentives and interests; adaptive MEL to assess shifting alignment of interest between various actors
Attribute: Alleviation of traditional intermediary services
Skill sets: Political economy analysis to inform risk mitigation strategy for potential spoilers and relevant MEL
While there will need to be standard accountability and other uses, feedback from an appropriate MEL strategy could have two primary end uses in a blockchain-based intervention: governance and trust.
The Role of Governance and Trust
Blockchain governance sets out the rules for how consensus (i.e. agreement) is achieved for deciding what transactions are valid on a blockchain. While this may sound mundane, it is critical for achieving outcomes, since how the blockchain is governed decides how well those closest to the problems are empowered to identify and achieve solutions and align interests. Hence the governance framework for the blockchain will need to be informed by an appropriate MEL strategy. A giant learning gap we currently have is how to iteratively adapt blockchain governance structures, using MEL feedback, into increasingly more efficient versions. Closing this gap will be critical to assessing the cost effectiveness of blockchain-based solutions over other solutions (i.e. alternatives/cost benefit analysis tools) as well as maximizing impact.
Another focus of an appropriate MEL strategy would be to facilitate trust in the blockchain-based solution amongst users, much the same as for other technology-led solutions like mobile money or pay-as-you-go metering for service delivery. This includes not only the digital interface between the user and the technology (a phone app, SMS or other interface) but other dimensions of “trust” that would facilitate uptake of the technology. These dimensions of trust would be informed by an analysis of the barriers to uptake of the technology amongst intended users, given it could be an entirely new service for beneficiaries or an old service delivered in a new fashion. There is already a good evidence base around what works in this area (e.g. marketing and communication tools for digital financial services, assistance in completing registration paperwork for pay-as-you-go metering, etc.).
The Road Ahead
There is A LOT we need to learn and a short time to do it in before we feel the negative effects from a lack of preparedness. This risk is heightened when you consider that the international development industry has a poor track record of designing and evaluating technology-led solutions (primarily due to the fact that these projects usually neglect uptake of the technology and operate on the assumption that the technology will drive outcomes instead of users using the technology as a tool to drive the outcomes).
The lessons from MEL in results-based financing could be especially informative to the future of evaluating blockchain-based solutions given their similarities in letting solutions work themselves out and the role of the “validator” in ensuring outcomes are achieved. In fact, the blockchain has already been used in this role in some simple output-based programming.
As alluded to, pre-existing MEL skill sets can add a lot of value to building an evidence base, but MEL practitioners will need to develop a greater understanding of the attributes of blockchain technology; otherwise our MEL strategies will not be suited to blockchain-based programming.
We hear the terms “correlation” and “causation” a lot, but what do they actually mean?
Correlation: describes how two variables relate to each other when they change. When one variable increases, the other may increase, decrease or remain the same. For example, when it rains more, people tend to buy more umbrellas.
Causation: implies that one variable causes another variable to change. For example, we can confidently conclude that more rain causes more people to acquire umbrellas.
In this post, I will explore the meaning of the terms and try to explain a way of deciding how they relate. I will use a real-world example to explore and explain.
Survey completion rate correlations
Echo Mobile helps organizations in Africa engage, influence, and understand their target audience via mobile channels. Our core product is a web-based SaaS platform that, among many other things, enables users to design, send and analyze the results of mobile surveys. Our users can deploy their surveys via SMS (Short Messaging Service), USSD (Unstructured Supplementary Service Data), IVR (Interactive Voice Response), and Android apps, but SMS is the most heavily used channel.
Surveys are key to our overall mission, as they give our users a tool to better understand their target audiences — usually their customers or beneficiaries. To optimize the effectiveness of this tool, one thing that we really wanted to do was identify key factors that lead to more people completing surveys sent by our users from the Echo platform. This would enable us to advise our users on how to get more value from our platform through better engagement and understanding of their audiences.
The completion rate of a survey is the percentage of people who complete a survey after being invited to take part in it. We came up with different factors that we thought could affect the completion rate of surveys:
post_incentive: The incentive (a small amount of money or airtime) offered after completing the survey
invite_day_of_month: The date of the month a respondent was invited to the survey
invite_day_of_the_week: The day of the week a respondent was asked to take part in the survey
invite_hour: The hour of the day the respondent was invited to the survey
num_questions: The number of questions in the survey
reminded: whether the respondent was reminded to complete the survey or not
channel: The manner in which the survey was done. These were either by use of SMS, USSD, IVR, web, or Android app. SMS is the most popular channel and accounts for over 90% of surveys
completion_rate: Of those invited to a survey, the percentage that completed
We used the performance of surveys deployed from the beginning of 2017 to August of 2017 to look for the correlations between the sample factors above. The correlations between the factors are shown in the table below. Since the focus was more on how the completion rate relates with other factors, I will focus on those relationships more.
The bigger the correlation magnitude, the stronger the correlation relationship. A positive correlation indicates that when one factor is increased the other should also increase. For a negative correlation value, the relationship is inverse. When one increases, the other decreases.
The rows of the table are arranged in descending order of the correlation between completion rate and other factors. Looking at the table, invite_hour, with a positive correlation of 0.25, is the factor with the strongest correlation with the completion rate. It is followed by reminded, while invite_day_of_month is the most negatively correlated with the completion_rate. The correlation between any other factors can also be obtained from the table; for example, the correlation between num_questions and reminded is 0.05.
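For readers who want to see how such a ranking is produced, here is a minimal sketch. It assumes the Pearson correlation coefficient, which the post does not actually specify, and it runs on a handful of made-up survey records rather than Echo's data. The factor names match the list above; everything else is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up per-survey records for illustration only.
surveys = [
    {"num_questions": 5, "invite_hour": 9, "reminded": 1, "completion_rate": 0.42},
    {"num_questions": 12, "invite_hour": 14, "reminded": 0, "completion_rate": 0.31},
    {"num_questions": 8, "invite_hour": 18, "reminded": 1, "completion_rate": 0.47},
    {"num_questions": 20, "invite_hour": 7, "reminded": 0, "completion_rate": 0.18},
    {"num_questions": 6, "invite_hour": 16, "reminded": 1, "completion_rate": 0.39},
    {"num_questions": 15, "invite_hour": 11, "reminded": 0, "completion_rate": 0.25},
]

target = [s["completion_rate"] for s in surveys]
factors = ["num_questions", "invite_hour", "reminded"]

# Correlate each factor with the completion rate and sort in descending order,
# mirroring how the rows of the table are arranged.
ranking = sorted(
    ((name, pearson([s[name] for s in surveys], target)) for name in factors),
    key=lambda item: item[1],
    reverse=True,
)
for name, r in ranking:
    print(f"{name}: {r:+.2f}")
```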
Survey completion causations?
The findings above can lead to incorrect conclusions if one is not careful. For example, a conclusion can be made that the invite hour with a correlation of 0.25 has the highest causal influence on the completion_rate of a survey. As a result, you might start trying to find the right time to send out surveys with the hope of getting more of them completed. With this mentality, it might be concluded that some invite hour is the optimum time to send out a survey. But that would be to hold to the (incorrect) idea that correlation implies causation.
A correlation may mean that one factor causes the other, that the factors jointly cause each other, that both factors are caused by the same separate third factor, or even that the correlation is the result of coincidence.
We can, therefore, see that correlation does not always imply causation. With careful investigation, however, it is possible to more confidently conclude whether correlation implies that one variable causes the other.
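The "same separate third factor" case is easy to demonstrate with a small simulation. The sketch below uses synthetic data, not anything from our surveys: a hidden factor Z drives both A and B, while A and B never influence each other. The noise levels, sample size and seed are arbitrary choices.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


random.seed(42)
n = 5000

# Hidden third factor Z, plus independent noise for A and for B.
z = [random.gauss(0, 1) for _ in range(n)]
a = [zi + random.gauss(0, 0.5) for zi in z]
b = [zi + random.gauss(0, 0.5) for zi in z]

raw = pearson(a, b)

# Once Z's contribution is removed, only the independent noise remains.
a_without_z = [ai - zi for ai, zi in zip(a, z)]
b_without_z = [bi - zi for bi, zi in zip(b, z)]
adjusted = pearson(a_without_z, b_without_z)

print(f"correlation of A and B:              {raw:.2f}")
print(f"correlation after accounting for Z:  {adjusted:.2f}")
```

With these settings the raw correlation comes out strong (around 0.8 in expectation) even though neither variable causes the other, and it falls to roughly zero once Z is accounted for.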
How can we verify if correlation might imply causation?
1. Use statistically sound techniques to determine the relationship.
Ensure that you use statistically legitimate methods to find the correlation. These include:
use of variables that correctly quantify the relationship.
make sure there are no outliers.
ensure the sample is an appropriate representation of the population.
exposure always precedes the outcome. If A is supposed to cause B, check that A always occurs before B.
check if the relationship ties in with other existing theories.
check if the proposed relationship is similar to other relationships in related fields.
check that no other relationship can explain the observed one. In the case above, a proper explanation for the headaches could be drinking instead of sleeping with shoes.
3. Validate the relationships
Conditions 1 and 2 above should be tested to determine if they are true or false. The common methods of testing are experiments and checking for consistency of the relationship. An experiment usually requires a model of the relationship, a testable hypothesis based on the model, incorporation of variance control measures, collection of suitable metrics for the relationship, and an appropriate analysis. Experiments done several times should lead to consistent conclusions.
We have not yet carried out these tests on our completion rate correlations. So we don’t yet know, for example, whether particular invite hours cause higher completion rates — only whether they are correlated.
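As an illustration of what such an experiment could look like, the sketch below randomly splits contacts into two groups that would be invited at different hours, then compares their completion rates with a two-proportion z-test. We have not run this. The group sizes, completion counts and function names are all hypothetical, and the z-test is one reasonable choice of analysis, not the only one.

```python
import random
from math import erf, sqrt


def assign_groups(contact_ids, seed=0):
    """Randomly split contacts into two groups of (nearly) equal size.

    Random assignment is the variance-control step: it keeps other
    factors from lining up with the invite hour.
    """
    rng = random.Random(seed)
    shuffled = list(contact_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def two_proportion_z_test(completed_a, n_a, completed_b, n_b):
    """Two-sided z-test for a difference between two completion rates."""
    p_a = completed_a / n_a
    p_b = completed_b / n_b
    pooled = (completed_a + completed_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("test is undefined when nobody or everybody completed")
    z = (p_a - p_b) / se
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return z, p_value


# Hypothetical experiment: 2,000 contacts, half invited in the morning,
# half in the evening, with made-up completion counts.
morning, evening = assign_groups(range(2000))
z, p = two_proportion_z_test(260, len(morning), 215, len(evening))
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value from one run would still need to be repeated, since experiments done several times should lead to consistent conclusions.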
We need to be careful before concluding that a particular relationship implies causation. It is generally better not to have a conclusion than to land on an incorrect one which might lead to wrong actions being taken!
The original version of this post was written by Rodgers Kim. Kim works at Echo Mobile as a Software Engineer and is interested in data science and enjoys writing.
by Yaquta Fatehi, Program Manager of Performance Measurement at the William Davidson Institute at the University of Michigan; and Heather Esper, Senior Program Manager of Performance Measurement at the William Davidson Institute at the University of Michigan.
The challenge: There are a number of pressing tensions and challenges in development programs related to MERL implementation. These include project teams and MERL teams working in silos and, just as importantly, leadership’s lack of understanding and commitment to MERL (as leadership often views MERL only in terms of accountability). And while there are solutions developed to address some of these challenges, our consortium, the Balanced Design, Monitoring, Evaluation, Research, and Learning (BalanceD-MERL) consortium (under U.S. Agency for International Development’s (USAID’s) MERLIN program) saw that there was still a strong need for integration of MERL in program design for good program management and adaptive management. We chose four principles – relevant, right-sized, responsible, and trustworthy – to guide this approach to enable sustainable integration of MERL with program design and adaptive management. Definitions of the principles can be found here.
How to integrate program design and MERL (a case example): Our consortium aimed to identify the benefits of such integration and application of these principles in the Women + Water Global Development Alliance program. The Alliance is a five-year public/private partnership between USAID and Gap, Inc., and four other non-profit sector partners. The Alliance draws upon these organizations’ complementary strengths to improve and sustain the health and well-being of women and communities touched by the apparel industry in India. Gap, Inc. had not partnered with USAID before and had limited experience with MERL on a complex program such as this, which consisted of multiple individual activities or projects implemented by multiple partners. The BalanceD-MERL consortium’s services were requested during the program design stage, to develop a rigorous, program-wide, high-level MERL strategy. We proposed co-developing the MERL activities with the Women + Water partners as listed in the MERL Strategy Template (see Table 1 in the case study shared below), which was developed by our consortium partner, the Institute for Development Impact.
Our first step was to co-design the program’s theory of change with the Women + Water partners to establish a shared understanding of what the problem was and how it was to be addressed by the program. We used the theory of change as a communication asset that helped bring a shared understanding of the solution among partners. We found that through this process we also identified gaps in the program design that could then be addressed, in turn making the program design stronger. Grounded by the theory of change in order to be relevant and trustworthy, we co-developed a risk matrix, which was one of the most useful exercises for Gap, Inc. because it helped them place judgment on their assumptions and identify risks that needed to be frequently monitored. Following this, we co-identified the key performance indicators and associated metadata using the Performance Indicator Reference Sheets format. This exercise, done iteratively with all partners, helped them understand the tradeoffs between being trustworthy and being right-sized; helped to ensure the feasibility of data collection and that indicators were right-sized and relevant; verified that methods were responsible and not placing unnecessary burden on key stakeholders; and confirmed that data was trustworthy enough to provide insights on the activity’s progress and changing context.
In order to integrate MERL with the program design, we closely co-created these key components with the partners. We also co-developed questions for a learning agenda and recommended adaptive management tasks such as quarterly pause and reflect sessions so that leadership and program managers could make necessary adaptations to the program based on performance data. The consortium was also tasked with developing the performance management information system.
Findings: Through this experience, we found that the theory of change can serve as a key tool to integrate MERL with program design and it can form the foundation on which to build remaining MERL activities. We also found that MERL can be compromised by an immature program design that has been informed by an incomplete needs assessment. For all key takeaways from this experience of applying the approach and principles as well as action items for program and MERL practitioners and key questions for leadership, please see the following case study.
All in all, it was an engaging session and we heard good questions and comments from our audience. To learn more or if you have any questions on the approach, feel free to email us at email@example.com
This publication was produced by William Davidson Institute at the University of Michigan (WDI) in collaboration with World Vision (WV) under the BalanceD-MERL Program, Cooperative Agreement Number AID-OAA-A-15-00061, funded by the U.S. Agency for International Development (USAID). This publication is made possible by the generous support of the American people through USAID. The contents are the responsibility of the William Davidson Institute and World Vision and do not necessarily reflect the views of USAID or the United States Government.
by Alexis Smart, Senior Technical Officer, and Alexis Banks, Technical Officer, at Root Change
As part of their session at MERL Tech DC 2018, Root Change launched Pando, an online platform that makes it possible to visualize, learn from, and engage with the systems where you work. Pando harnesses the power of network maps and feedback surveys to help organizations strengthen systems and improve their impact.
Decades of experience in the field of international development have taught our team that trust and relationships are at the heart of social change. Our research shows that achieving and sustaining development outcomes depends on the contributions of multiple actors embedded in thick webs of social relationships and interactions. However, traditional MERL approaches have failed to help us understand the complex dynamics within those relationships. Pando was created to enable organizations to measure trust, relationships, and accountability between development actors.
Relationship Management & Network Maps
Grounded in social network analysis, Pando uses web-based relationship surveys to identify diverse organizations within a system and track relationships in real time. The platform automatically generates a network map that visualizes the organizations and relationships within a system. Data filters and analysis tools help uncover key actors, areas of collaboration, and network structures and dynamics.
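To give a feel for how relationship survey data can surface key actors, here is a minimal sketch using degree centrality, one of the simplest social network analysis measures. It is not a description of how Pando works internally. The organizations, the survey responses and the choice of measure are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical relationship-survey responses:
# (responding organization, organization it reports working with).
responses = [
    ("Org A", "Org B"),
    ("Org A", "Org C"),
    ("Org B", "Org C"),
    ("Org D", "Org C"),
    ("Org E", "Org C"),
    ("Org E", "Org A"),
]


def build_network(pairs):
    """Build an undirected adjacency map from reported relationships."""
    network = defaultdict(set)
    for source, target in pairs:
        network[source].add(target)
        network[target].add(source)
    return network


def degree_centrality(network):
    """Share of the other organizations each organization is connected to."""
    others = len(network) - 1
    if others <= 0:
        return {org: 0.0 for org in network}
    return {org: len(neighbours) / others for org, neighbours in network.items()}


network = build_network(responses)
ranking = sorted(
    degree_centrality(network).items(), key=lambda item: item[1], reverse=True
)
for org, score in ranking:
    print(f"{org}: {score:.2f}")
```

In this toy network Org C is connected to every other organization, so it tops the ranking. On a real map, a measure like this is a starting point for dialogue about who holds the network together, not a verdict.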
Feedback Surveys & Analysis
Pando is integrated with Keystone Accountability’s Feedback Commons, an online tool that gives map administrators the ability to collect and analyze feedback about levels of trust and relationship quality among map participants. The combined power of network maps and feedback surveys helps create a holistic understanding of the system of organizations that impact a social issue, facilitate dialogue, and track change over time as actors work together to strengthen the system.
Evaluating Local Humanitarian Response Systems: We worked with the Harvard Humanitarian Initiative (HHI) to evaluate the effect of local capacity development efforts on local ownership within humanitarian response networks in the Philippines, Kenya, Myanmar, and Ethiopia. Using social network analysis, Root Change and HHI assessed the roles of local and international organizations within each network to determine the degree to which each system was locally led.
Supporting Collective Impact in Nigeria: Network mapping has also been used in the USAID-funded Strengthening Advocacy and Civic Engagement (SACE) project in Nigeria. Over five years, more than 1,300 organizations and 2,000 relationships across 17 advocacy issue areas were identified and tracked. Nigerian organizations used the map to form meaningful partnerships, set common agendas, coordinate strategies, and hold the government accountable.
Informing Project Design in Kenya: Root Change and the Aga Khan Foundation (AKF) collected relationship data from hundreds of youth and organizations supporting youth opportunities in coastal Kenya. Analysis revealed gaps in expertise within the system, and opportunities to improve relationships among organizations and youth. These insights helped inform AKF’s program design, and ongoing mapping will be used to monitor system change.
Tracking Local Ownership: This year, under USAID Local Works, Root Change is working with USAID missions to measure local ownership of development initiatives using newly designed localization metrics on Pando. USAID Bosnia and Herzegovina (BiH) launched a national Local Works map, identifying over 1,000 organizations working together on community development. Root Change and USAID BiH are exploring a pilot that would use this map to continue collecting data and tracking localization metrics, and would train a local organization to support the process.
Join the MERL Tech DC map
As part of the MERL Tech DC 2018 conference, Root Change launched a map of the MERL Tech community. Event participants were invited to join this collaborative mapping effort to identify and visualize the relationships between organizations working to design, fund, and implement technology that supports monitoring, evaluation, research, and learning (MERL) efforts in development.
It’s not too late to join! Email firstname.lastname@example.org for an invitation to join the MERL Tech DC map and a chance to explore Pando.
Learn more about Pando
Pando is the culmination of more than a decade of experience providing training and coaching on the use of social network analysis and feedback surveys to design, monitor, and evaluate systems change initiatives. Initial feedback from international and local NGOs, governments, community-based organizations, and more is promising. But don’t take our word for it. We want to hear from you about ways that Pando could be useful in your social impact work. Contact us to discuss ways Pando could be applied in your programs.
We attended the MERL Tech DC 2018 conference held on Sept. 7, 2018 and led a session related to the creation of a learning agenda to help MERL practitioners gauge the value of blockchain technology for development programming.
As a trio of monitoring, evaluation, research, and learning (MERL) practitioners in international development, we are keenly aware of the quickly growing interest in blockchain technology. Blockchain is a type of distributed database that creates a nearly unalterable record of cryptographically secure peer-to-peer transactions without a central, trusted administrator. While it was originally designed for digital financial transactions, it is also being applied to a wide variety of interventions, including land registries, humanitarian aid disbursement in refugee camps, and evidence-driven education subsidies. International development actors, including government agencies, multilateral organizations, and think tanks, are looking at blockchain to improve effectiveness or efficiency in their work.
Naturally, as MERL practitioners, we wanted to learn more. Could this radically transparent, shared database, managed by its users, have important benefits for data collection, management, and use? As MERL practice evolves to better suit adaptive management, what role might blockchain play? For example, one inherent feature of blockchain is the unbreakable and traceable linkages between blocks of data. How might such a feature improve the efficiency or effectiveness of data collection, management, and use? What are the advantages of blockchain over other more commonly used technologies? To guide our learning, we started with an inquiry designed to help us determine if, and to what degree, the various features of blockchain add value to the practice of MERL. With our agenda established, we set out eagerly to find a blockchain case study to examine, with the goal of presenting our findings at the September 2018 MERL Tech DC conference.
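For readers unfamiliar with the "linkages between blocks of data" mentioned above, the following Python sketch illustrates the basic idea of a hash chain, in which each block stores the hash of the block before it. This is an illustration only, under simplifying assumptions: it runs on a single machine, has no peer-to-peer network or consensus mechanism, and the indicator data and function names are hypothetical. It shows why tampering with an earlier record becomes detectable, not how any production blockchain is built.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, previous_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, data):
    """Append a block that is linked to the current last block."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    block = {
        "index": index,
        "data": data,
        "previous_hash": previous_hash,
        "hash": block_hash(index, data, previous_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check that each block points to its predecessor."""
    expected_previous = GENESIS_HASH
    for block in chain:
        if block["previous_hash"] != expected_previous:
            return False
        recomputed = block_hash(block["index"], block["data"], block["previous_hash"])
        if block["hash"] != recomputed:
            return False
        expected_previous = block["hash"]
    return True


# Hypothetical monitoring records, for illustration only.
chain = []
add_block(chain, {"indicator": "households_reached", "value": 120})
add_block(chain, {"indicator": "households_reached", "value": 135})
valid_before = is_valid(chain)

# Quietly editing an earlier record breaks the stored hash for that block.
chain[0]["data"]["value"] = 500
valid_after = is_valid(chain)

print(valid_before, valid_after)
```

The check passes before the edit and fails after it, because the stored hash of the first block no longer matches its contents. Whether this property adds value over an ordinary audited database is exactly the kind of question a learning agenda would need to test.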
What we did
We documented 43 blockchain use-cases through internet searches, most of which were described with glowing claims like “operational costs… reduced up to 90%,” or with the assurance of “accurate and secure data capture and storage.” We found a proliferation of press releases, white papers, and persuasively written articles. However, we found no documentation or evidence of the results blockchain was purported to have achieved in these claims. We also did not find lessons learned or practical insights, as are available for other technologies in development.
We fared no better when we reached out directly to several blockchain firms via email, phone, and in person. Not one was willing to share data on program results, MERL processes, or adaptive management for potential scale-up. Despite all the hype about how blockchain will bring unprecedented transparency to processes and operations in low-trust environments, the industry is itself opaque. From this, we determined that the lack of evidence supporting the value claims made for blockchain in the international development space is a critical gap for potential adopters.
What we learned
Blockchain firms supporting development pilots are not practicing the transparency they preach: they are not sharing data and lessons learned about what is working, what isn’t working, and why. There are many generic decision trees and sales pitches available to convince development practitioners of the value blockchain will add to their work. But there is a lack of detailed data about what happens when development interventions use blockchain technology.
Since the function of MERL is to bridge knowledge gaps and help decision-makers take action informed by evidence, we decided to explore the crucial questions MERL practitioners may ask before determining whether blockchain will add value to data collection, management, and use. More specifically, rather than a go/no-go decision tool, we propose using a learning agenda to probe the role of blockchain in data collection, data management and data use at each stage of project implementation.
“Before you embark on that shiny blockchain project, you need to have a very clear idea of why you are using a blockchain.”
Typically, “A learning agenda is a set of questions, assembled by an organization or team, that identifies what needs to be learned before a project can be planned and implemented.” The process of developing and finding answers to learning questions is most useful when it’s employed continuously throughout the duration of project implementation, so that changes can be made based on what is learned about changes in the project’s context, and to support the process of applying evidence to decision-making in adaptive management.
We explored various learning agenda questions for data collection, management and use that should continue to be developed and answered throughout the project cycle. However, because the content of a learning agenda is highly context-dependent, we focused on general themes. Examples of questions that might be asked by beneficiaries, implementing partners, donors, and host-country governments include:
What could each of a project’s stakeholder groups gain from the use of blockchain across the stages of design and implementation, and would the benefits of blockchain incentivize them to participate?
Can blockchain resolve trust or transparency issues between disparate stakeholder groups, e.g. to ensure that data reported represent reality, or that they are of sufficient quality for decision-making?
Are there existing technologies that are less expensive, more appropriate, or easier to execute, and that already meet each group’s MERL needs?
Are there unaddressed MERL management needs blockchain could help address, or capabilities blockchain offers that might inspire new and innovative thinking about what is done, and how it gets done?
This approach resonated with other MERL for development practitioners
We presented this approach to a diverse group of professionals at MERL Tech DC, including other MERL practitioners and IT support professionals, representing organizations from multilateral development banks to US-based NGOs. The session was facilitated as a participatory roundtable, in which participants discussed how MERL professionals could use learning agendas to help their organizations both decide whether blockchain is appropriate for intervention design and guide learning during implementation to strengthen adaptive management.
Questions and issues raised by the session participants ranged widely, from how blockchain works to doubts that organizational leaders would have the risk appetite required to pilot blockchain when the time and costs (financial and human resource) were unknown. Session participants demonstrated an intense interest in this topic and our approach. Our session ran over time and side conversations continued into the corridors long after the session had ended.
Our approach, as it turns out, echoes others in the field who question whether the benefits of blockchain add value above and beyond existing technologies, or accrue to stakeholders beyond the donors that fund them. This trio of practitioners will continue to explore ways MERL professionals can help their teams learn about the benefits of blockchain technology for international development. But, in the end, it may turn out that the real value of blockchain lies not in the application of the technology itself, but in its role as an impetus to question what we do, why we do it, and how we could do it better.