by Rachel Dickinson, Technical Officer for Research and Learning, Root Change
“Localization”, measuring local ownership, USAID’s Journey to Self-Reliance… We’re all talking about these ideas and policies, and trying to figure out how to incorporate them in our global development projects, but how do we know if we are making progress on these goals? What do we need to measure?
Root Change and Keystone Accountability, under a recent USAID Local Works research grant, created the Pando Localization Learning System (LLS) as both a tool and a methodology for measuring and tracking local ownership within projects in real time. Pando LLS is an online platform that uses network maps and feedback surveys to assess system health, power dynamics, and collaboration within a local development system. It gives development practitioners simple, easy-to-use visuals and indicators, which can be shared with stakeholders and used to identify opportunities for strengthening local development systems.
We launched the Pando platform at MERL Tech DC in 2018, and this year we wanted to share (and get reactions to) a new set of localization measures and a reflective approach we have embedded in the tool.
Analysis of local ownership on Pando LLS is organized around four key measures. Under each we have determined a series of indicators pulling from both social network analysis (SNA) and feedback survey questions. For those interested in geeking out on the indicators themselves, visit our White Paper on the Pando Localization Learning System (LLS), but the four measures are:
1) Leadership measures whether local actors can voice concerns, set priorities and define success in our projects. It measures whether we, as outsiders, are soliciting input from local actors. In other words, it looks at whether project design and implementation is bottom-up.
2) Mutuality measures whether strong reciprocal, or two-way, relationships exist. It measures whether we, as external actors, respond to and act on feedback from local actors. It’s the respect and trust required for success in any interaction.
3) Connectivity measures whether the local system motivates and incentivizes local actors to work together to solve problems. It measures whether we, as program implementers, promote collaboration and connection between local actors. It asks whether the local system is actually improving, and if we are playing the right roles.
4) Financing measures whether dependency on external financial resources is decreasing, and local financial opportunities are becoming stronger. It measures whether we, as outsiders, are preparing local organizations to be more resilient and adaptive. It explores the timeless question of money and resources.
Did you notice how each of these measures assesses not only local actors and their system, but also our role as outsiders? This takes us to the reflective approach.
The Pando LLS approach emphasizes dialogue with system actors and self-reflection by development practitioners. It pushes us to question our assumptions about the systems where we work and tasks us with developing project activities and M&E plans that involve local actors. The theories behind the approach can also be found in our White Paper, but here are the basic steps:
Listen to local actors by inviting them to map their relationships, share feedback, and engage in dialogue about the results;
Co-create solutions and learn through short-term experiments that aim to improve relationships and strengthen the local system;
Incorporate what’s working back into development projects and celebrate failures as progress; and
Repeat the listen, reflect, and adapt cycles 3-4 times a year to ensure each one is small and manageable.
What do you think of this method for measuring and promoting local ownership? Do we have the measures right? How are you measuring local ownership in your work? Would you be interested in testing the Pando LLS approach together? We’d love to hear from you! Email me at email@example.com to share your feedback, questions, or ideas!
By Alexis Banks, Jennifer Himmelstein, and Rachel Dickinson
Social network analysis (SNA) is a powerful tool for understanding the systems of organizations and institutions in which your development work is embedded. It can be used to create interventions that are responsive to local needs and to measure systems change over time. But, what does SNA really look like in practice? In what ways could it be used to improve your work? Those are the questions we tackled in our recent MERL Tech session, Visualizing Your Network for Adaptive Program Decision Making. ACDI/VOCA and Root Change teamed up to introduce SNA, highlight examples from our work, and share some basic questions to help you get started with this approach.
SNA is the process of mapping and measuring relationships and information flows between people, groups, organizations, and more. Using key SNA metrics enables us to answer important questions about the systems where we work. Common SNA metrics include (learn more here):
Reachability, which helps us determine if one actor, perhaps a local NGO, can access another actor, such as a local government;
Distance, which is used to determine how many steps, or relationships, there are separating two actors;
Degree centrality, which is used to understand the role that a single actor, such as an international NGO, plays in a system by looking at the number of connections with that organization;
Betweenness, which enables us to identify brokers or “bridges” within networks by identifying actors that lie on the shortest path between others; and
Change Over Time, which allows us to see how organizations and relationships within a system have evolved.
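These metrics are straightforward to compute once relationships are represented as a graph. Here is a minimal pure-Python sketch of the first few, using a toy network of hypothetical actors (the organization names are illustrative only, not drawn from any of the projects below):

```python
# Toy directed "works with" network: actor -> set of partners.
from collections import deque

network = {
    "Local NGO": {"INGO"},
    "Community Group": {"INGO"},
    "INGO": {"Local Government", "Donor"},
    "Local Government": set(),
    "Donor": set(),
}

def distance(graph, start, goal):
    """Breadth-first search: the number of relationship 'steps'
    separating two actors, or None if the goal is unreachable."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, steps = queue.popleft()
        if node == goal:
            return steps
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + 1))
    return None

# Reachability: a path exists whenever distance() returns a number.
assert distance(network, "Local NGO", "Local Government") == 2

# Degree centrality (out-degree form): connections held by each actor.
degree = {actor: len(partners) for actor, partners in network.items()}
# The INGO holds the most connections, flagging it as a hub; betweenness
# would likewise flag it as the broker lying between the other actors.
assert max(degree, key=degree.get) == "INGO"
```

Tools like Pando (below) compute these same metrics behind the scenes from survey-generated relationship data.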
SNA in the Program Cycle
SNA can be used throughout the design, implementation, and evaluation phases of the program cycle.
Design: Teams at Root Change and ACDI/VOCA use SNA in the design phase of a program to identify initial partners and develop an early understanding of a system–how organizations do or do not work together, what barriers are preventing collaboration, and what strategies can be used to strengthen the system.
As part of the USAID Local Works program, Root Change worked with the USAID mission in Bosnia and Herzegovina (BiH) to launch a participatory network map that identified over 1,000 organizations working in community development in BiH, many of which had been previously unknown to the mission. It also provided the foundation for a dialogue with system actors about the challenges facing BiH civil society.
To inform project design, ACDI/VOCA’s Feed the Future Tanzania NAFAKA II Activity, funded by USAID, conducted a network analysis to understand the networks associated with village-based agricultural advisors (VBAAs): what services they were already offering to farmers, which had the most linkages to rural actors, which actors were serving as bottlenecks, and more. This helped the project identify which VBAAs to work with through small grants and technical assistance (i.e., key actors), and what additional linkages needed to be built between VBAAs and other types of actors.
Implementation: We also use SNA throughout program implementation to monitor system growth, increase collaboration, and inform learning and program design adaptation. ACDI/VOCA’s USAID/Honduras Transforming Market Systems Activity uses network analysis as a tool to track business relationships created through primary partners. For example, one such primary partner is the Honduran chamber of tourism, which facilitates business relationships through group training workshops and other types of technical assistance. The project can then follow up on these new relationships to gather data on indirect outcomes (e.g., jobs created, sales, and more).
Root Change used SNA throughout implementation of the USAID funded Strengthening Advocacy and Civic Engagement (SACE) program in Nigeria. Over five years, more than 1,300 organizations and 2,000 relationships across 17 advocacy issue areas were identified and tracked. Nigerian organizations came together every six months to update the map and use it to form meaningful partnerships, coordinate advocacy strategies, and hold the government accountable.
Evaluating Impact: Finally, our organizations use SNA to measure results at the mid-term or end of project implementation. In Kenya, Root Change developed the capacity of Aga Khan Foundation (AKF) staff to carry out a baseline, and later an end-of-project network analysis of the relationships between youth and organizations providing employment, education, and entrepreneurship support. The latter analysis enabled AKF to evaluate growth in the network and the extent to which gaps identified in the baseline had been addressed.
The Feed the Future Ghana Agricultural Development and Value Chain Enhancement II (ADVANCE II) Project, implemented by ACDI/VOCA and funded by USAID, leveraged existing database records to map the outgrower business networks that were established as a result of the project. This was an important way of demonstrating one of ADVANCE II’s major outcomes: creating a network of private service providers that serve as resources for inputs, financing, and training, as well as hubs for aggregating crops for sales.
Approaches to SNA
There are a plethora of tools to help you incorporate SNA in your work. These range from bespoke software custom built for each organization, to free, open source applications.
Root Change uses Pando, a web-based, participatory tool that uses relationship surveys to generate real-time network maps that use basic SNA metrics. ACDI/VOCA, on the other hand, uses unique identifiers for individuals and organizations in its routine monitoring and evaluation processes to track relational information for these actors (e.g. cascaded trainings, financing given, farmers’ sales to a buyer, etc.) and an in-house SNA tool.
Applying SNA to Your Work
What do you think? We hope we’ve piqued your interest! Using the examples above, take some time to consider ways that SNA could be embedded into your work at the design, implementation, or evaluation stage of your work using this worksheet. If you get stuck, feel free to reach out (Alexis Banks, firstname.lastname@example.org; Rachel Dickinson, email@example.com; Jennifer Himmelstein, JHimmelstein@acdivoca.org)!
The blog post inspired a barrage of unanticipated discussion online. Unfortunately, in some cases readers (and re-posters) misinterpreted the point as disparaging blockchain. Rather, the post authors were simply suggesting ways to cope with the uncertainty involved in piloting blockchain projects. Perhaps the most important outcome of the session and post, however, is that they motivated a coordinated response from several organizations who wanted to delve deeper into the blockchain learning agenda.
To do that, on March 5, 2019, Chemonics, Truepic, and Consensys hosted a roundtable titled “How to Successfully Apply Blockchain in International Development.” All three organizations are applying blockchain in different and complementary ways relevant to international development — including project monitoring, evaluation, learning (MEL) innovations as well as back-end business systems. The roundtable enabled an open dialogue about how blockchain is being tested and leveraged to achieve better international development outcomes. The aim was to explore and engage with real case studies of blockchain in development and share lessons learned within a community of development practitioners in order to reduce the level of opacity surrounding this innovative and rapidly evolving technology.
Three case studies were highlighted:
1. “One-click Biodata Solution” by Chemonics
Chemonics’ Blockchain for Development Solutions Lab designed and implemented a RegTech solution for the USAID foreign assistance and contracting space. The solution leveraged the blockchain-based identity platform created by BanQu to dramatically expedite and streamline the collection and verification of USAID biographical data sheets (biodatas), improve personal data protection, and reduce incidents of error and fraud in the hiring process for professionals and consultants hired under USAID contracts.
Chemonics processes several thousand biodatas per year and accordingly devotes significant labor effort and cost to support the current paper-based workflow.
Chemonics’ technology partner, BanQu, used a private, permissioned blockchain on the Ethereum network to pilot a biodata solution.
Chemonics successfully piloted the solution with BanQu, resulting in 8 blockchain-based biodatas being fully processed in compliance with donor requirements.
Improved data protection was a priority for the pilot. One goal of the solution was to make it possible for individuals to maintain control over their back-up documentation, like passports, diplomas, and salary information, which could be shared temporarily with Chemonics through the use of an encrypted key, rather than having documentation emailed and saved to less secure corporate digital file systems.
Following the pilot, Chemonics determined through qualitative feedback that users across the biodata ecosystem found the blockchain solution easy to use and effective at reducing the level of effort involved in completing biodatas.
Chemonics also compiled lessons learned, including refinements to the technical requirements, options to scale the solution, and additional user feedback and concerns about the technology to inform decision-making around further biodata pilots.
2. Project i2i presented by Consensys
Problem Statement: 35% of the Filipino population is unbanked, and 56% lives in rural areas. The Philippine economy relies heavily on domestic remittances. Unionbank sought to partner with hundreds of rural banks that lacked access to the electronic banking services that larger commercial banks enjoy.
In 2017, to continue the Central Bank of the Philippines’ national strategy for financial inclusion, the central banks of Singapore and the Philippines announced that they would collaborate on financial technology by employing the regulatory sandbox approach. The sandbox gives industry stakeholders room and time to experiment before regulators enact potentially restrictive policies that could stifle innovation and growth. As part of the agreement, the central banks will share resources, best practices, and research, and collaborate to “elevate financial innovation” in both economies.
Solution design assumptions for Philippines context:
It can be easily operated and implemented with limited integration, even in low-tech settings;
It enables lower transaction time and lower transaction cost;
It enables more efficient operations for rural banks, including reduction of reconciliations and simplification of accounting processes.
Unionbank worked with ConsenSys and participating rural banks to create an interbank ledger with tokenization. The payment platform is private and Ethereum-based.
In the initial pilot, 20 steps were eliminated in the process.
Technology partners: ConsenSys, Azure (Microsoft), Kaleido, Amazon Web Services.
3. Controlled Capture by Truepic
Truepic is a technology company specializing in digital image and video authentication. Truepic’s Controlled Capture technology uses cutting-edge computer vision, AI, and cryptography to test images and video for signs of manipulation, authenticating only those that pass its rigorous verification tests. Through the public blockchain, Truepic creates an immutable record for each photo and video captured through this process, such that their authenticity can be proven, meeting the highest evidentiary standards. This technology has been used in over 100 countries by citizen journalists, activists, international development organizations, NGOs, insurance companies, lenders, and online platforms.
One of Truepic’s innovative strategic partners, the UN Capital Development Fund (another participant of the roundtable), has been testing the possibility of using this technology for monitoring and evaluation of development projects. For example, one Truepic image captured the date, time, and geolocation of the latest progress on a factory in Uganda.
Controlled Capture requires Wi-Fi or at least 3G/4G connectivity to fully authenticate images/video and write them to the public blockchain, which can be a challenge in low-connectivity settings, for example in the least-developed countries where UNCDF works.
As a workaround to connectivity issues, Truepic’s partners have used satellite Internet connections – such as a Thuraya or Iridium device – to capture verified images anywhere.
Public blockchain – Truepic is currently using two different public blockchains, testing cost versus time in an effort to continually shorten the time from capture to closing chain of custody (currently around 8-12 seconds).
Cost – The blockchain component is not actually too expensive; the heaviest investment is in the computer vision technology used to authenticate the images/video, for example to detect rebroadcasting (taking a picture of a picture in an attempt to pass off the original’s metadata).
Image rights remain with the owner – Truepic does not have rights over the image/video but keeps a copy on its servers in case the user’s phone/tablet is lost, stolen, or broken, and, most importantly, so that Truepic can produce the original image on its verification page when it is shared or disseminated publicly.
Court + evidentiary value: the technology and public-facing verification pages are designed to meet the highest evidentiary standards.
The technology has been tested in courts and is currently being tested at the international level, though specifics cannot be disclosed for confidentiality reasons.
Privacy and security are key priorities, especially for work in conflict zones such as Syria. Truepic does not use 2-step authentication because the technology is focused on authenticating the images/video; the source’s identity is not relevant, and this keeps the source as anonymous as possible. Truepic works with its partners to educate them on best practices for maintaining high levels of anonymity in any scenario.
The biggest challenge is usage by implementing partners – the platform itself is very easy to use, but driving the behavioral change needed to adopt it has been difficult.
Another challenge: when you bring the solution to an implementer, the implementer says you must get the donor to integrate it into their RFP scopes; the donors, in turn, recommend speaking to implementing partners.
Storage capacity issues? Storage is not currently a problem; Truepic has plans in place to address any storage issues that may arise with scale.
How did implementers measure success in their blockchain pilots?
Measurement was both quantitative and qualitative
The organizations worked with clients to ensure that the people who needed the MEL data were able to access and use it
Concerns with publicizing information or difficulties with NDAs were handled on a case-by-case basis
The original search for evidence on the impact of blockchain sought a level of data fidelity that is difficult to capture and validate, even under the least challenging circumstances. Not finding it at that time, the research team sought the next best solution, which was not to discount the technology, but to suggest ways to cope with the knowledge gaps they encountered by recommending a learning agenda. The roundtable helped to stimulate robust conversation of the three case studies, contributing to that learning agenda.
Most importantly, the experience highlighted several interesting takeaways about innovation in public-private partnerships more broadly:
The initial MERL Tech session publicly and transparently drew attention to the gaps that were identified from the researchers’ thirty-thousand-foot view of evaluating innovation.
This transparency drew out engagement and collaboration between and amongst those best-positioned to move quickly and calibrate effectively with the government’s needs: the private sector.
This small discussion that focused on the utility and promise of blockchain highlighted the broader role of government (as funder/buyer/donor) in both providing the problem statement and anchoring the non-governmental, private sector, and civil society’s strengths and capabilities.
One year later…
So, a year after the much-debated blockchain blogpost, what has changed? A lot. There is a growing body of reporting that adds to the lessons learned literature and practical insights from projects that were powered or supported by blockchain technology. The question remains: do we have any greater documentation or evidence of the results blockchain was purported to have achieved in these claims? It seems that while reporting has improved, it still has a long way to go.
It’s worth pointing out that the international development industry, with far more experts and funding dedicated to working on improving MERL than emerging tech companies, also has some distance to go in meeting its own evidence standards. Fortunately, the volume and frequency of hype seems to have decreased (or perhaps the news cycle has simply moved on?), thereby leaving blockchain (and its investors and developers) the space they need to refine the technology.
In closing, we, like the co-authors of the 2018 post, remain optimistic that blockchain, a still emerging technology, will be given the time and space needed to mature and prove its potential. And, whether you believe in “crypto-winter” or not, hopefully the lull in the hype cycle will prove to be the breathing space that blockchain needs to keep evolving in a productive direction.
Shailee Adinolfi: Shailee works on Public Sector solutions at ConsenSys, a global blockchain technology company building the infrastructure, applications, and practices that enable a decentralized world. She has 20 years of experience at the intersection of technology, financial inclusion, trade, and government, including 11 years on USAID funded projects in Africa, Asia and the Middle East.
John Burg: John was a co-author on the original MERL Tech DC 2018 blog, referenced in this blog. He is an international development professional with almost 20 years of cross-sectoral experience across 17 countries in six global regions. He enjoys following the impact of emerging technology in international development contexts.
Tara Vassefi: Tara is Truepic’s Washington Director of Strategic Initiatives. Her background is as a human rights lawyer where she worked on optimizing the use of digital evidence and understanding how the latest technologies are used and weighed in courts around the world.
Moving from hype to practice is an important but challenging step for ICT4D practitioners. As the technical adviser for digital development at IREX, a global development and education organization, I’ve been watching with cautious optimism as international development stakeholders begin to explore how artificial intelligence tools like machine learning can help them address problems and introduce efficiencies to amplify their impact.
So while USAID was developing their guide to making machine learning work for international development and TechChange rolled out their new course on Artificial Intelligence for International Development, we spent a few months this summer exploring whether we could put machine learning to work to measure media quality.
Of course, we didn’t turn to machine learning just for the sake of contributing to the “breathless commentary of ML proponents” (as USAID aptly puts it).
As we shared in a session with our artificial intelligence partner Lore at MERL Tech DC 2018, some of our programs face a very real set of problems that could be alleviated through smarter use of digital tools.
Our Machine Learning Experiment
In our USAID-funded Media Strengthening Program in Mozambique, for example, a small team of human evaluators manually score thousands of news articles based on 18 measures of media quality.
This process is time consuming (some evaluators spend up to four hours a day reading and evaluating articles), inefficient (when staff turns over, we need to reinvest resources to train up new hires), and inconsistent (even well-trained evaluators might score articles differently).
To test whether we can make the process of measuring media quality less resource-intensive, we spent a few months training software to automatically detect one of these 18 measures of media quality: whether journalists keep their own opinions out of their news articles. The results of this experiment are very compelling:
The software had 95% accuracy in recognizing sentences containing opinions within the dataset of 1,200 articles.
The software’s ability to “learn” was evident. Anecdotally, the evaluation team noticed a marked improvement in the accuracy of the software’s suggestions after showing it only twenty sentences that had opinions. The accuracy, precision, and recall results highlighted above were achieved after only sixteen rounds of training the software.
Accuracy and precision increased the more the model was trained. There is a clear relationship between the number of times the evaluators trained the software and the accuracy and precision of the results. The recall results did not improve over time as consistently.
What does this all mean? Let’s start with the good news. The results suggest that some parts of media quality—specifically, whether an article is impartial or whether it echoes its author’s opinions—can be automatically measured by machine learning.
The software also introduces the possibility of unprecedented scale, scanning thousands of articles in seconds for this specific indicator. These implications introduce ways for media support programs to spend their limited resources more efficiently.
3 Lessons Learned from using Machine Learning
Of course, the machine learning experience was not without problems. With any cutting-edge technology, there will be lessons we can learn and share to improve everyone’s experience. Here are our three lessons learned working with machine learning:
1. Forget about being tech-literate; we need to be more problem-literate.
Defining a coherent, specific, actionable problem statement was one of the most important steps of this experiment. This wasn’t easy. Hard trade-offs had to be made (Which of 18 indicators should we focus on?), and we had to focus on things we could measure in order to demonstrate efficiency gains using this new approach (How much time do evaluators currently spend scoring articles?).
When planning your own machine learning project, devote plenty of time at the outset—together with your technology partner—to define the specific problem you’ll try to address. These conversations result in a deeper shared understanding of both the sector and the technology that will make the experiment more successful.
2. Take the time to communicate results effectively.
Since completing the experiment, people have asked me to explain how “accurate” the software is. But in practice, machine learning software uses different methods to define “accuracy”, which in turn can vary according to the specific model (the software we used deploys several models).
What starts off as a simple question (How accurate is our software?) can easily turn into a discussion of related concepts like precision, recall, false positives, and false negatives. We found that producing clean visuals (like this or this) became the most effective way to explain our results.
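For readers who want the mechanics, here is a minimal sketch (with invented labels, not our project’s data) of how accuracy, precision, and recall fall out of comparing human labels against the software’s predictions:

```python
# 1 = sentence contains an opinion, 0 = it does not (toy data).
human = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]   # evaluators' labels
model = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]   # software's predictions

tp = sum(h == m == 1 for h, m in zip(human, model))        # true positives
fp = sum(h == 0 and m == 1 for h, m in zip(human, model))  # false positives
fn = sum(h == 1 and m == 0 for h, m in zip(human, model))  # false negatives

accuracy = sum(h == m for h, m in zip(human, model)) / len(human)
precision = tp / (tp + fp)  # of flagged sentences, how many truly had opinions
recall = tp / (tp + fn)     # of opinionated sentences, how many were caught

print(accuracy, precision, recall)  # here: 0.8, 0.75, 0.75
```

Notice that the three numbers can diverge, which is exactly why a single “accuracy” figure rarely tells the whole story.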
3. Start small and manage expectations.
Stakeholders with even a passing awareness of machine learning will be aware of its hype. Even now, some colleagues ask me how we “automated the entire media quality assessment process”—even though we only used machine learning to identify one of 18 indicators of media quality. To help mitigate inflated expectations, we invested a small amount into this “minimum viable product” (MVP) to prove the fundamental concept before expanding on it later.
Approaching your first machine learning project this way might help to keep expectations in line with reality, minimize risks associated with experimentation, and provide air cover for you to adjust your scope as you discover limitations or adjacent opportunities during the process.
Our team wanted to evaluate our impact, so we applied a new framework to find answers.
What We Tested
Every social organization, GlobalGiving included, needs to know if it’s having an impact on the communities it serves. For us, that means understanding the ways in which we are (or aren’t!) helping our nonprofit partners around the world improve their own effectiveness and capacity to create change, regardless of the type of work they do.
Why It Matters
Without this knowledge, social organizations can’t make informed decisions about the strategies to use to deliver their services. Unfortunately, this kind of rigorous impact evaluation is usually quite expensive and can take years to carry out. As a result, most organizations struggle to evaluate their impact.
We knew the challenges going into our own impact research would be substantial, but it was too important for us not to try.
The Big Question
Do organizations with access to GlobalGiving’s services improve their performance differently than organizations that don’t? Are there particular focus areas where GlobalGiving is having more of an impact than others?
Ideally, we’d randomly assign certain organizations to receive the “treatment” of being part of GlobalGiving and then compare their performance with another randomly assigned control group. But, we can’t just tell random organizations that they aren’t allowed to be part of our community. So, instead we compared a treatment group—organizations that have completed the GlobalGiving vetting process and become full partners on the website—with a control group of organizations that have successfully passed the vetting process but haven’t joined the web community. Since we can’t choose these groups randomly, we had to ensure the organizations in each group are as similar as possible so that our results aren’t biased by underlying differences between the control and treatment groups.
To do this, we worked only with organizations based in India. We chose India because we have lots of relationships with organizations there, and we needed as large a sample size as possible to increase confidence that our conclusions are reliable. India is also well-suited for this study because it requires organizations to have special permission to receive funds from overseas under the Foreign Contribution Regulation Act (FCRA). Organizations must have strong operations in place to earn this permission. The fact that all participant organizations are established enough both to earn an FCRA certification and to pass GlobalGiving’s own high vetting standards means that any differences in our results are unlikely to be caused by geographic or quality differences.
We also needed a way to measure nonprofit performance in a concrete way. For this, we used the “Organizational Performance Index” (OPI) framework created by Pact. The OPI provides a structured way to understand a nonprofit’s capacity along eight different categories, including its ability to deliver programs, the diversity of its funding sources, and its use of community feedback. The OPI scores organizations on a scale of 1 (lowest) to 4 (highest). With the help of a fantastic team of volunteers in India, we gathered two years of OPI data from both the treatment and control groups, then compared how their scores changed over time to get an initial indicator of GlobalGiving’s impact.
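The core comparison can be sketched in a few lines of Python. The numbers below are invented for illustration, not GlobalGiving’s data; the idea is to compare the average score change across groups and use a simple permutation test to gauge whether the gap could be due to chance:

```python
import random

# Change in OPI score (year 2 minus year 1) per organization (toy data).
treatment_change = [1, 0, 1, 0, 1, 1, 0, -1, 1, 0]
control_change = [0, -1, 0, 1, 0, -1, 0, 0, 1, -1]

def mean(xs):
    return sum(xs) / len(xs)

# Observed gap: how much more the treatment group improved on average.
observed_gap = mean(treatment_change) - mean(control_change)  # 0.5 here

# Permutation test: shuffle group labels many times and ask how often a
# gap at least this large appears by chance alone.
random.seed(0)
pooled = treatment_change + control_change
n_treat = len(treatment_change)
trials, extreme = 10_000, 0
for _ in range(trials):
    random.shuffle(pooled)
    if mean(pooled[:n_treat]) - mean(pooled[n_treat:]) >= observed_gap:
        extreme += 1
p_value = extreme / trials  # small p-value -> gap unlikely to be chance

print(observed_gap, p_value)
```

Our actual analysis was more involved (see the write-up linked below), but this is the shape of the question being asked of the data.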
The most notable result we found was that organizations that were part of GlobalGiving demonstrated significantly more participatory planning and decision-making processes (what we call “community leadership”), and improved their use of stakeholder feedback to inform their work, in comparison to control group organizations. We did not see a similar significant result in the other seven categories that the OPI tracks. The easiest way to see this result is to visualize how organizations’ scores shifted over time. The chart below shows differences in target population scores—Pact’s wording for “community leadership and feedback.”
Differences in Target Population Score Changes
For example, look at the organizations that started out with a score of two in the control group on the left. Roughly one third of those increased their score to three, one third stayed the same, and one third had their scores drop to one. In contrast, in the treatment group on the right, nearly half the organizations increased their scores and about half stayed the same, while only a tiny fraction dropped. You can see a similar pattern across the two groups regardless of their starting score.
In contrast, here’s the same diagram for another OPI category where we didn’t see a statistically significant difference between the two groups. There’s not nearly as clear a pattern—both the treatment and control organizations change their scores about the same amount.
Differences in Delivery Score Changes
For more technical details about our research design process, our statistical methodology, and the conclusions we’ve drawn, please check out the full write-up of this work, which is available on the Social Science Research Network.
Our initial finding—that our emphasis on feedback is having a measurable impact—is an encouraging sign.
On the other hand, we didn’t see that GlobalGiving was driving significant changes in any of the other seven OPI categories. Some of these categories, like adherence to national or international standards, aren’t areas where GlobalGiving focuses much. Others, like how well an organization learns over time, are closely related to what we do (Listen, Act, Learn. Repeat. is one of our core values). We’ll need to continue to explore why we’re not seeing results in these areas and, if necessary, make adjustments to our programs accordingly.
Make It Yours
Putting together an impact study, even a smaller one like this, is a major undertaking for any organization. Many organizations talk about applying a more scientific approach to their impact, but few nonprofits or funders take on the challenge of carrying out the research needed to do so. This study demonstrates how organizations can make meaningful progress towards rigorously measuring impact, even without a decade of work and an eight-figure budget.
If your organization is considering something similar, here are a few suggestions to keep in mind that we’ve learned as a result of this project:
1. If you can’t randomize, make sure you consider possible biases.
Logistics, processes, and ethics are all reasons why an organization might not be able to randomly assign treatment groups. If that’s the case for you, think carefully about the rest of your design and how you’ll reduce the chance that a result you see can be attributed to a different cause.
2. Choose a measurement framework that aligns with your theory of change and is as precise as possible.
We used the OPI because it was easy to understand, reliable, and well-accepted in the development sector. But, the OPI’s four-level scale made it difficult to make precise distinctions between organizations, and there were some categories that didn’t make sense in the context of how GlobalGiving works. These are areas we’ll look to improve in future versions of this work.
3. Get on the record.
Creating a clear record of your study, both inside and outside your organization, is critical for avoiding “scope creep.” We used Git to keep track of all changes in our data, code, and written analysis, and shared our initial study design at the 2017 American Evaluation Association conference.
4. Enlist outside help.
This study would not have been possible without lots of extra help, from our volunteer team in India, to our friends at Pact, to the economists and data scientists who checked our math, particularly Alex Hughes at UC Berkeley and Ted Dunmire at Booz Allen Hamilton.
We’re pleased about what we’ve learned about GlobalGiving’s impact, where we can improve, and how we might build on this initial work, and we can’t wait to continue to build on this progress moving forward in service of improved outcomes for our nonprofit partners worldwide.
I had the opportunity during MERL Tech London 2018 to attend a very interesting session discussing blockchains and how they can be applied in the MERL space. This session was led by Valentine Gandhi, Founder of The Development CAFÉ, Zara Rahman, Research and Team Lead at The Engine Room, and Wayan Vota, Co-founder of Kurante.
The first part of the session was an introduction to blockchain, which is essentially a distributed ledger system. Why is it an interesting solution? Because the geographically distributed traces left on multiple devices make for a very robust and secure system. No one can unilaterally decide to scrap or eliminate data, because the change would be visible against the distributed copies of the data chain. Is it possible to corrupt the system? Well, yes, but what makes it robust and secure is that, for that to happen, virtually all participants in the blockchain system would have to agree to do so.
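The tamper-evidence described above comes from hash-chaining: each block records the hash of the block before it, so altering any record invalidates every hash that follows. A minimal single-machine sketch (not a real distributed network, and not any particular blockchain's format):

```python
import hashlib
import json

def block_hash(block):
    """Deterministic SHA-256 hash of a block's contents."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def add_block(chain, data):
    """Append a block that stores the hash of its predecessor."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "data": data})
    return chain

def is_valid(chain):
    """Each stored prev_hash must match the actual hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = []
add_block(chain, "grant disbursed: $500")
add_block(chain, "survey results uploaded")
assert is_valid(chain)

chain[0]["data"] = "grant disbursed: $5,000"  # unilateral tampering...
assert not is_valid(chain)                     # ...is immediately detectable
```

In a real blockchain this ledger is replicated across many independent nodes, which is what makes rewriting history require broad collusion rather than a single edit.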
That is the powerful innovation of the technology. In that sense it resembles torrent-based file sharing: it is very hard to control content when storage sits not on a single server but on an enormous number of end-user terminals.
What I want to share from this session, however, is not how the technology works! That information is readily available on the Internet and other sources.
What I really found interesting was the part of the session where the professionals in the room shared the doubts and questions we would need to clarify in order to decide whether blockchain technology is actually required.
Some of the most interesting shared doubts and concerns around this technology were:
What sources of training and other useful resources are available if you want to implement blockchain?
Say the organization or leadership team decides that a blockchain is required for the solution. I am pretty sure it is not hard to find information about blockchain on the Internet, but we all face the same problem: the enormous amount of information available makes it tricky to reach the holy grail that provides just enough information without losing hours to desktop research. It would be incredibly beneficial to have a suggested place where this information can be found, even more so if it were a specialized guide aimed at the MERL space.
What are the data space constraints?
I found this question very important. It is a key aspect of the design and scalability of the solution. I assume the amount of data will not be large, but I really don't know. And maybe it is not a significant amount of information for a desktop or a laptop, but what if we are using cell phones as end terminals too? This needs to be addressed so the design is based on facts and not assumptions.
Are there examples and use cases relevant to MERL?
Again, there are probably plenty to be found all over the Internet, but they are hardly going to be insightful for a specific MERL approach. Is it possible to have a repository of relevant cases for the MERL space?
When is blockchain really required?
It would be really helpful to have a simple guide that helps any professional decide whether the volume or importance of the information justifies implementing a blockchain system or not.
Is there a right to be forgotten in Blockchain?
Recent events give this question special relevance. Blockchains are very powerful for achieving traceability, but what if I want my information to be deleted because that is simply my right? This is an important aspect of technologies with a distributed logic. How can we use the powerful advantages of blockchain while respecting each individual's right to make unilateral decisions about their private or personal information?
I am not an expert in the matter, but I do recognize the importance of these questions, and my hope is that the people able to address them will pick them up and provide useful answers and guidance.
If you have answers to these questions, or more questions about blockchain and MERL, please add them in the comments!
We’ve been working hard over the past several weeks to finish up the agenda for MERL Tech London 2018, and it’s now ready!
We’ve got workshops, panels, discussions, case studies, lightning talks, demos, community building, socializing, and an evening reception with a Fail Fest!
Topics range from mobile data collection, to organizational capacity, to learning and good practice for information systems, to data science approaches, to qualitative methods using mobile ethnography and video, to biometrics and blockchain, to data ethics and privacy and more.
You can search the agenda to find the topics, themes, and tools that are most interesting, identify sessions that are most relevant to your organization's size and approach, pick the session methodologies that you prefer (some of us like participatory and some of us like listening), and learn more about the different speakers and facilitators and their work.
Tickets are going fast, so be sure to snap yours up before it’s too late! (Register here!)
What data superpower would you ask for? How would you describe data to your grandparents? What’s the worst use of data you’ve come across?
These are a few of the questions that TechChange’s DataDay TV Show tackles in its latest episode.
The DataDay Team (Nick Martin, Samhir Vasdev, and Priyanka Pathak) traveled to MERL Tech DC last September to ask attendees some tough data-related questions. They came away with insightful, unusual, and occasionally funny answers….
If you’re a fan of discussing data, technology and MERL, join us at MERL Tech London on March 19th and 20th.
If you want to take your learning to the next level with a full-blown course, TechChange has a great 2018 schedule, including topics like blockchain, AI, digital health, data visualization, e-learning, and more. Check out their course catalog here.
What about you, what data superpower would you ask for?
You want to take your M&E system one step further and introduce proper M&E software? That's great, because software has the potential to make the monitoring process more efficient and transparent, reduce errors, and produce more accurate data. But how do you go about it? You have three options:
You hire an IT consultant to set up a customized M&E system according to your organization’s specific requirements.
If options one and two do not work out for you, you can hire consultants to develop a solution for you. You will probably start a public tender to find the most suitable IT company to entrust with this task. While there are a lot of things to pay attention to when formulating the Terms of Reference (TOR), I would like to give you some tips specifically about communication with the hired IT consultants. These insights come from years of experience on both sides: as the party who wants a tool and needs to describe it to the implementing programmers, and as the IT guy (or rather lady) who implements Excel and web-based database tools for M&E.
To be on the safe side, I recommend working from this assumption: IT consultants have no clue about M&E. Few IT companies come from the development sector, as energypedia consult does, and are familiar with M&E concepts such as indicators, logframes, and impact chains. To still get what you need, you should pay attention to the following communication tips:
Take your time explaining what you need: Writing TOR takes time – but it takes even longer and becomes more costly when you hire somebody for something that is not thought through. If you don’t know all the details right from the start, get some expert assistance in formulating terms – it’s worthwhile.
Use graphs: Instead of using words to describe your monitoring logic and the system you need, it is much easier to make graphs to depict the structure, user groups, linking of information, flow of monitoring data etc.
Give examples: When unsure about how to put a feature into words, send a link or a screenshot of the function that you might have come across elsewhere and wish to have in your tool.
Explain concepts and terminology: Many results frameworks work with the terms “input” and “output”. Most IT guys, however, will not have equipment and finished schools in mind, but rather data flows that consist of inputs and outputs. Make sure you clarify this. Also, the term web-based or web monitoring itself is a source of misunderstanding. In the IT world, web monitoring refers to monitoring activity in the internet, for example website visits or monitoring a server. That is probably not what you want when building up an M&E system for e.g. a good governance programme.
Meet in person: In your budget calculation, allow for at least one workshop where you meet in person, for example a kick-off workshop in which you clarify your requirements. This is not only a possibility to ask each other questions, but also to get a feeling of the other party’s language and way of thinking.
Maintain a dialogue: During the implementation phase, make sure to stay in regular touch with the programmers. Ask them to show you updates every once in a while so you can give feedback. If you detect that the programmers are heading in the wrong direction, you want to find out sooner rather than later.
Document communication: When we implement web-based systems, we typically create a page within the web platform itself that outlines all the agreed steps. This list serves as both a to-do list and an implementation protocol. It facilitates communication, particularly when multiple people are involved on both sides who are not always present in every meeting or phone call.
Be prepared for misunderstandings: They happen. It’s normal. Plan for some buffer days before launching the final tool.
In general, the implementation phase should allow for some flexibility. As both parties learn from each other during the process, you should not be afraid to adjust initial plans, because the final tool will benefit greatly from it (if the contract has some flexibility). Big customized IT projects take some time.
If you need more advice on this matter or more insights on setting up IT-based M&E systems, please feel free to contact me any time! In the past we have supported clients by setting up a prototype of their web-based M&E system with our flexible WebMo approach. During the prototyping process the client learned a lot, and afterwards it was quite easy for other developers to copy the prototype and migrate it to, for example, their Microsoft SharePoint environment (in case your IT team doesn't believe in open source or doesn't want to host third-party software on their servers).
Please leave your comments, if you think that I have missed an important communication rule.
In their MERL Tech DC session on qualitative coding, Charles Guedenet and Anne Laesecke from IREX together with Danielle de Garcia of Social Impact offered an introduction to the qualitative coding process followed by a hands-on demonstration on using Excel and Dedoose for coding and analyzing text.
They began by defining content analysis as any effort to make sense of qualitative data that takes a volume of qualitative material and attempts to identify core consistencies and meanings. More concretely, it is a research method that uses a set of procedures to make valid inferences from text. They also shared their thoughts on what makes for a good qualitative coding method.
In their view, a good coding method should:
consider what is already known about the topic being explored
be logically grounded in this existing knowledge
use existing knowledge as a basis for looking for evidence in the text being analyzed
With this definition laid out, they moved to a discussion about the coding process, where they elaborated on four general steps:
develop codes and a codebook
decide on a sampling plan
code your data (and go back and do it again!)
test for reliability
Developing codes and a codebook is important for establishing consistency in the coding process, especially if there will be multiple coders working on the data. A good way to start developing these codes is to consider what is already known. For example, you can think about literature that exists on the subject you’re studying. Alternatively, you can simply turn to the research questions the project seeks to answer and use them as a guide for creating your codes. Beyond this, it is also useful to go through the content and think about what you notice as you read. Once a codebook is created, it will lend stability and some measure of objectivity to the project.
The next important issue is the question of sampling. When determining sample size, though a larger sample will yield more robust results, one must of course consider the practical constraints of time, cost and effort. Does the benefit of higher quality results justify the additional investment? Fortunately, the type of data will often inform sampling. For example, if there is a huge volume of data, it may be impossible to analyze it all, but it would be prudent to sample at least 30% of it. On the other hand, usually interview and focus group data will all be analyzed, because otherwise the effort of obtaining the data would have gone to waste.
Regarding sampling method, the session leads highlighted two strategies that produce sound results: systematic random sampling and quota sampling, a method employed to ensure that demographic groups are represented in fair proportion.
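Systematic random sampling can be sketched in a few lines: pick a random starting offset, then take every k-th item. The data and the roughly 30% fraction below are illustrative only:

```python
import random

def systematic_sample(items, fraction):
    """Systematic random sample: random start, then every k-th item."""
    k = round(1 / fraction)          # sampling interval, e.g. every 3rd item
    start = random.randrange(k)      # random starting offset in [0, k)
    return items[start::k]

# Hypothetical pool of 100 survey responses to be coded.
responses = [f"response_{i}" for i in range(100)]
sample = systematic_sample(responses, 0.30)  # roughly a 30% sample
print(len(sample))
```

Because the interval is fixed, the sample spreads evenly across the dataset, which is useful when content is ordered (e.g. chronologically).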
Once these key decisions have been made, the actual coding can begin. Here, all coders should work from the same codebook and apply the codes to the same unit of analysis. Typical units of analysis are single words, themes, sentences, paragraphs, and items (such as articles, images, books, or programs). Consistency is essential. One way to test the level of consistency is to have a 10% overlap in the content each coder analyzes and aim for 80% agreement between their coding of that content. If the coders are not applying the same codes to the same units, this could mean either that they are not trained properly or that the codebook needs to be altered.
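The overlap check described above boils down to a percent-agreement calculation on the units both coders coded. A minimal sketch with hypothetical codes from two coders:

```python
# Hypothetical codes applied by two coders to the same 10 overlapping units.
coder_a = ["trust", "access", "trust", "cost", "trust",
           "access", "cost", "trust", "access", "trust"]
coder_b = ["trust", "access", "cost", "cost", "trust",
           "access", "cost", "trust", "access", "trust"]

# Percent agreement: share of units where both coders applied the same code.
matches = sum(a == b for a, b in zip(coder_a, coder_b))
agreement = matches / len(coder_a)
print(f"{agreement:.0%}")  # 90% here, above the 80% rule of thumb
```

More robust reliability statistics (such as Cohen's kappa, which corrects for chance agreement) build on this same pairwise comparison.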
Along a similar vein, the fourth step in the coding process is to test for reliability. Challenges in producing stable and consistent results in coding could include: using a unit of analysis that is too large for a simple code to be reliably applied, coding themes or concepts that are ambiguous, and coding nonverbal items. For each of these, the central problem is that the units of analysis leave too much room for subjective interpretation that can introduce bias. Having a detailed codebook can help to mitigate against this.
After giving an overview of the coding process, the session leads suggested a few possible strategies for data visualization. One is to use a word tree, which helps one look at the context in which a word appears. Another is a bubble chart, which is useful if one has descriptive data and demographic information. Thirdly, correlation maps are good for showing what sorts of relationships exist among the data. The leads suggested visiting the website stephanieevergreen.com/blog for more ideas about data visualization.
Finally, the leads covered low-tech and high-tech options for coding. On the low-tech end of the spectrum, paper and pen get the job done. They are useful when there are few data sources to analyze, when the coding is simple, and when there is limited tech literacy among the coders. Next up the scale is Excel, which works when there are few data sources and when the coders are familiar with Excel. Then the session leads closed their presentation with a demonstration of Dedoose, which is a qualitative coding tool with advanced capabilities like the capacity to code audio and video files and specialized visualization tools. In addition to Dedoose, the presenters mentioned Nvivo and Atlas as other available qualitative coding software.
Despite the range of qualitative content available for analysis, a few core principles can help ensure that it is analyzed well, chief among them consistency and a disciplined methodology. And if qualitative coding will be an ongoing part of your organization's operations, there are several specialized software options available for you to explore. [Click here for links and additional resources from the session.]