What a Difference a Year Makes: Contributing to the Blockchain Learning Agenda

by Shailee Adinolfi, John Burg and Tara Vassefi

In September 2018, a three-member team of international development professionals presented a session called “Blockchain Learning Agenda: Practical MERL Workshop” at MERL Tech DC. Following the session, the team published a blog post about the session stating that the authors had “… found no documentation or evidence of the results blockchain was purported to have achieved in these claims [of radical improvements]. [They] also did not find lessons learned or practical insights, as are available for other technologies in development.”

The blog post inspired a barrage of unanticipated discussion online. Unfortunately, in some cases readers (and re-posters) misinterpreted the point as disparaging of blockchain. Rather, the post authors were simply suggesting ways to cope with uncertain situations related to piloting blockchain projects. Perhaps the most important outcome of the session and post, however, is that they motivated a coordinated response from several organizations that wanted to delve deeper into the blockchain learning agenda.

To do that, on March 5, 2019, Chemonics, Truepic, and ConsenSys hosted a roundtable titled “How to Successfully Apply Blockchain in International Development.” All three organizations are applying blockchain in different and complementary ways relevant to international development, including project monitoring, evaluation, and learning (MEL) innovations as well as back-end business systems. The roundtable enabled an open dialogue about how blockchain is being tested and leveraged to achieve better international development outcomes. The aim was to explore and engage with real case studies of blockchain in development and share lessons learned within a community of development practitioners in order to reduce the level of opacity surrounding this innovative and rapidly evolving technology.

Three case studies were highlighted:

1. “One-click Biodata Solution” by Chemonics 

  • Chemonics’ Blockchain for Development Solutions Lab designed and implemented a RegTech solution for the USAID foreign assistance and contracting space that sought to leverage the blockchain-based identity platform created by BanQu to dramatically expedite and streamline the collection and verification of USAID biographical data sheets (biodatas), improve personal data protection, and reduce incidents of error and fraud in the hiring process for professionals and consultants hired under USAID contracts.
  • Chemonics processes several thousand biodatas per year and accordingly devotes significant labor effort and cost to support the current paper-based workflow.
  • Chemonics’ technology partner, BanQu, used a private, permissioned blockchain on the Ethereum network to pilot a biodata solution.
  • Chemonics successfully piloted the solution with BanQu, resulting in 8 blockchain-based biodatas being fully processed in compliance with donor requirements.
  • Improved data protection was a priority for the pilot. One goal of the solution was to make it possible for individuals to maintain control over their back-up documentation, like passports, diplomas, and salary information, which could be shared temporarily with Chemonics through the use of an encrypted key, rather than having documentation emailed and saved to less secure corporate digital file systems.
  • Following the pilot, Chemonics determined through qualitative feedback that users across the biodata ecosystem found the blockchain solution easy to use and that it succeeded in reducing the level of effort in the biodata completion process.
  • Chemonics also compiled lessons-learned, including refinements to the technical requirements, options to scale the solution, and additional user feedback and concerns about the technology to inform decision-making around further biodata pilots. 

2. Project i2i presented by ConsenSys

  • Problem Statement: 35% of the Filipino population is unbanked, and 56% lives in rural areas. The Philippines economy relies heavily on domestic remittances. Unionbank sought to partner with hundreds of rural banks that lacked access to the electronic banking services available to the larger commercial banks.
  • In 2017, to continue the Central Bank of the Philippines’ national strategy for financial inclusion, the central banks of Singapore and the Philippines announced that they would collaborate on financial technology by employing the regulatory sandbox approach. This will provide industry stakeholders with the room and time to experiment before regulators enact potentially restrictive policies that could stifle innovation and growth. As part of the agreement, the central banks will share resources, best practices, research, and collaborate to “elevate financial innovation” in both economies.
  • Solution design assumptions for Philippines context:
    • It can be easily operated and implemented with limited integration, even in low-tech settings;
    • It enables lower transaction time and lower transaction cost;
    • It enables more efficient operations for rural banks, including reduction of reconciliations and simplification of accounting processes.
  • Unionbank worked with ConsenSys and participating rural banks to create an interbank ledger with tokenization. The payment platform is private, Ethereum-based.
  • In the initial pilot, 20 steps were eliminated in the process.
  • Technology partners: ConsenSys, Azure (Microsoft), Kaleido, Amazon Web Services.
  • As a follow-up to the i2i project, Unionbank partnered with Singapore-based OCBC Bank, and the parties deployed the Adhara liquidity management and international payments platform for a blockchain-based international remittance pilot.
  • Potential for national and regional collaboration/network development.
  • For details on the i2i project, download the full case study here or watch the 4-minute video clip.

3. Controlled Capture presented by Truepic

  • Truepic is a technology company specializing in digital image and video authentication. Truepic’s Controlled Capture technology uses cutting-edge computer vision, AI, and cryptography technologies to test images and video for signs of manipulation, designating only those that pass its rigorous verification tests as authenticated. Through the public blockchain, Truepic creates an immutable record for each photo and video captured through this process, such that their authenticity can be proven, meeting the highest evidentiary standards. This technology has been used in over 100 countries by citizen journalists, activists, international development organizations, NGOs, insurance companies, lenders and online platforms.
  • One of Truepic’s innovative strategic partners, the UN Capital Development Fund (another participant of the roundtable), has been testing the possibility of using this technology for monitoring and evaluation of development projects. For example, the following Truepic tracks the date, time, and geolocation of the latest progress of a factory in Uganda. 
  • Controlled Capture requires Wi-Fi or at least 3G/4G connectivity to fully authenticate images/video and write them to the public blockchain, which can be a challenge in low-connectivity settings, for example in the least-developed countries where UNCDF works.
  • As a workaround to connectivity issues, Truepic’s partners have used satellite internet connections, such as a Thuraya or Iridium device, to successfully capture verified images anywhere.
  • Public blockchain – Truepic is currently using two different public blockchains, testing cost versus time in an effort to continually shorten the time from capture to closing chain of custody (currently around 8-12 seconds). 
  • Cost – The blockchain component is not actually too expensive; the heaviest investment is in the computer vision technology used to authenticate the images/video, for example to detect rebroadcasting, as in taking a picture of a picture to pass off the metadata.
  • Rights to the image belong to the owner: Truepic does not have rights over the image/video but keeps a copy on its servers in case the user’s phone/tablet is lost, stolen, or broken and, most importantly, so that Truepic can produce the original image on its verification page when it is shared or disseminated publicly.
  • Court + evidentiary value: the technology and public-facing verification pages are designed to meet the highest evidentiary standards. 
    • Tested in courts; currently being tested at the international level, but specifics cannot be disclosed for confidentiality reasons.
  • Privacy and security are key priorities, especially for working in conflict zones, such as Syria. Truepic does not use 2-step authentication because the technology is focused on authenticating the images/video; it is not relevant who the source is and this way it keeps the source as anonymous as possible. Truepic works with its partners to educate on best practices to maintain high levels of anonymity in any scenario. 
  • The biggest challenge is usage by implementing partners: the platform is very easy to use; however, the behavioral change needed to adopt it has been challenging.
    • Other challenge: you bring the solution to an implementer, and the implementer says you have to get the donor to integrate it into their RFP scopes; then the donors recommend that we speak to implementing partners. 
  • Storage capacity issues? Storage is not currently a problem; Truepic has plans in place to address any storage issues that may arise with scale. 
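
The idea of an immutable record for each captured image can be illustrated with a generic content fingerprint. The sketch below is not Truepic's pipeline, which the roundtable did not describe at this level of detail. It only assumes that a cryptographic digest of the image and its capture metadata is recorded somewhere tamper-resistant at capture time, and that a later copy can be checked against it. The function names and sample values are hypothetical.

```python
import hashlib
import json


def fingerprint(image_bytes: bytes, metadata: dict) -> str:
    """Return a SHA-256 digest covering the image bytes and its capture metadata."""
    # Serialize metadata deterministically so identical inputs always hash identically.
    meta_blob = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return hashlib.sha256(image_bytes + meta_blob).hexdigest()


def is_unchanged(image_bytes: bytes, metadata: dict, recorded_digest: str) -> bool:
    """Check a file against the digest that was recorded at capture time."""
    return fingerprint(image_bytes, metadata) == recorded_digest


# Hypothetical capture: stand-in bytes and made-up metadata values.
photo = b"stand-in image bytes"
capture_info = {"time": "2019-03-05T10:00:00Z", "lat": 0.35, "lon": 32.58}
recorded = fingerprint(photo, capture_info)  # this digest is what would be written to a ledger

print(is_unchanged(photo, capture_info, recorded))            # True
print(is_unchanged(photo + b"edit", capture_info, recorded))  # False
```

Only the digest needs to be published, so the image itself can stay private while still being verifiable later.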

How did implementers measure success in their blockchain pilots?

  • Measurement was both quantitative and qualitative 
  • The organizations worked with clients to ensure people who needed the MEL were able to access and use it
  • Concerns with publicizing information or difficulties with NDAs were handled on a case-by-case basis

The MEL space is an excellent place to have a conversation about the use of blockchain for international development – many aspects of MEL hinge on the need for immutability (in record keeping), transparency (in the expenditure and impact of funds) and security (in the data and the identities of implementers and beneficiaries). Many use cases in developing countries and for social impact have been documented (see Stanford report Blockchain for Social Impact, Moving Beyond the Hype). (Editor’s note: see also Blockchain and Distributed Ledger Technologies in the Humanitarian Sector and Distributed Ledger Identification Systems in the Humanitarian Sector).

The original search for evidence on the impact of blockchain sought a level of data fidelity that is difficult to capture and validate, even under the least challenging circumstances. Not finding it at that time, the research team sought the next best solution, which was not to discount the technology, but to suggest ways to cope with the knowledge gaps they encountered by recommending a learning agenda. The roundtable helped to stimulate robust conversation about the three case studies, contributing to that learning agenda.

Most importantly, the experience highlighted several interesting takeaways about innovation in public-private partnerships more broadly: 

  • The initial MERL Tech session publicly and transparently drew attention to the gaps that were identified from the researchers’ thirty-thousand-foot view of evaluating innovation.
  • This transparency drew out engagement and collaboration between and amongst those best-positioned to move quickly and calibrate effectively with the government’s needs: the private sector. 
  • This small discussion that focused on the utility and promise of blockchain highlighted the broader role of government (as funder/buyer/donor) in both providing the problem statement and anchoring the non-governmental, private sector, and civil society’s strengths and capabilities. 

One year later…

So, a year after the much-debated blockchain blogpost, what has changed? A lot. There is a growing body of reporting that adds to the lessons learned literature and practical insights from projects that were powered or supported by blockchain technology. The question remains: do we have any greater documentation or evidence of the results blockchain was purported to have achieved in these claims? It seems that while reporting has improved, it still has a long way to go. 

It’s worth pointing out that the international development industry, with far more experts and funding dedicated to working on improving MERL than emerging tech companies, also has some distance to go in meeting its own evidence standards.  Fortunately, the volume and frequency of hype seems to have decreased (or perhaps the news cycle has simply moved on?), thereby leaving blockchain (and its investors and developers) the space they need to refine the technology.

In closing, we, like the co-authors of the 2018 post, remain optimistic that blockchain, a still emerging technology, will be given the time and space needed to mature and prove its potential. And, whether you believe in “crypto-winter” or not, hopefully the lull in the hype cycle will prove to be the breathing space that blockchain needs to keep evolving in a productive direction.

Author Bios

Shailee Adinolfi: Shailee works on Public Sector solutions at ConsenSys, a global blockchain technology company building the infrastructure, applications, and practices that enable a decentralized world. She has 20 years of experience at the intersection of technology, financial inclusion, trade, and government, including 11 years on USAID funded projects in Africa, Asia and the Middle East.

John Burg: John was a co-author on the original MERL Tech DC 2018 blog, referenced in this blog. He is an international development professional with almost 20 years of cross-sectoral experience across 17 countries in six global regions. He enjoys following the impact of emerging technology in international development contexts.

Tara Vassefi: Tara is Truepic’s Washington Director of Strategic Initiatives. Her background is as a human rights lawyer where she worked on optimizing the use of digital evidence and understanding how the latest technologies are used and weighed in courts around the world. 

3 Lessons Learned using Machine Learning to Measure Media Quality

by Samhir Vasdev, Technical Adviser for Digital Development at IREX’s Center for Applied Learning and Impact. The post 3 Lessons Learned using Machine Learning to Measure Media Quality appeared first on ICTworks.

Moving from hype to practice is an important but challenging step for ICT4D practitioners. As the technical adviser for digital development at IREX, a global development and education organization, I’ve been watching with cautious optimism as international development stakeholders begin to explore how artificial intelligence tools like machine learning can help them address problems and introduce efficiencies to amplify their impact.

So while USAID was developing their guide to making machine learning work for international development and TechChange rolled out their new course on Artificial Intelligence for International Development, we spent a few months this summer exploring whether we could put machine learning to work to measure media quality.

Of course, we didn’t turn to machine learning just for the sake of contributing to the “breathless commentary of ML proponents” (as USAID aptly puts it).

As we shared in a session with our artificial intelligence partner Lore at MERLTech DC 2018, some of our programs face a very real set of problems that could be alleviated through smarter use of digital tools.

Our Machine Learning Experiment

In our USAID-funded Media Strengthening Program in Mozambique, for example, a small team of human evaluators manually score thousands of news articles based on 18 measures of media quality.

This process is time consuming (some evaluators spend up to four hours a day reading and evaluating articles), inefficient (when staff turns over, we need to reinvest resources to train up new hires), and inconsistent (even well-trained evaluators might score articles differently).

To test whether we can make the process of measuring media quality less resource-intensive, we spent a few months training software to automatically detect one of these 18 measures of media quality: whether journalists keep their own opinions out of their news articles. The results of this experiment are very compelling:

  • The software had 95% accuracy in recognizing sentences containing opinions within the dataset of 1,200 articles.
  • The software’s ability to “learn” was evident. Anecdotally, the evaluation team noticed a marked improvement in the accuracy of the software’s suggestions after showing it only twenty sentences that had opinions. The accuracy, precision, and recall results highlighted above were achieved after only sixteen rounds of training the software.
  • Accuracy and precision increased the more that the model was trained. There is a clear relationship between the number of times the evaluators trained the software and the accuracy and precision of the results. The recall results did not improve over time as consistently.

These results, although promising, simplify some numbers and calculations. Check out our full report for details.

What does this all mean? Let’s start with the good news. The results suggest that some parts of media quality—specifically, whether an article is impartial or whether it echoes its author’s opinions—can be automatically measured by machine learning.

The software also introduces the possibility of unprecedented scale, scanning thousands of articles in seconds for this specific indicator. These implications introduce ways for media support programs to spend their limited resources more efficiently.

3 Lessons Learned from using Machine Learning

Of course, the machine learning experience was not without problems. With any cutting-edge technology, there will be lessons we can learn and share to improve everyone’s experience. Here are our three lessons learned working with machine learning:

1. Forget about being tech-literate; we need to be more problem-literate.

Defining a coherent, specific, actionable problem statement was one of the most important steps of this experiment. This wasn’t easy. Hard trade-offs had to be made (Which of 18 indicators should we focus on?), and we had to focus on things we could measure in order to demonstrate efficiency gains using this new approach (How much time do evaluators currently spend scoring articles?).

When planning your own machine learning project, devote plenty of time at the outset—together with your technology partner—to define the specific problem you’ll try to address. These conversations result in a deeper shared understanding of both the sector and the technology that will make the experiment more successful.

2. Take the time to communicate results effectively.

Since completing the experiment, people have asked me to explain how “accurate” the software is. But in practice, machine learning software uses different methods to define “accuracy”, which in turn can vary according to the specific model (the software we used deploys several models).

What starts off as a simple question (How accurate is our software?) can easily turn into a discussion of related concepts like precision, recall, false positives, and false negatives. We found that producing clean visuals (like this or this) became the most effective way to explain our results.
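
To make the distinction concrete, here is a minimal sketch of how accuracy, precision, and recall are computed from the same set of predictions. The labels are hypothetical and are not taken from our experiment; the function name is also just for illustration.

```python
def classification_metrics(actual, predicted):
    """Compute accuracy, precision, and recall for binary labels (1 = contains opinion)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # opinions correctly flagged
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # neutral sentences wrongly flagged
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # opinions that were missed
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # neutral sentences correctly passed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical example: 20 sentences, of which 4 contain opinions.
actual = [1, 1, 1, 1] + [0] * 16
predicted = [1, 1, 0, 0] + [0] * 15 + [1]

metrics = classification_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
# accuracy: 0.85
# precision: 0.67
# recall: 0.50
```

In this made-up case the software looks 85% "accurate" even though it misses half of the opinionated sentences, which is why a single number rarely answers the question on its own.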

3. Start small and manage expectations.

Stakeholders with even a passing awareness of machine learning will be aware of its hype. Even now, some colleagues ask me how we “automated the entire media quality assessment process”—even though we only used machine learning to identify one of 18 indicators of media quality. To help mitigate inflated expectations, we invested a small amount into this “minimum viable product” (MVP) to prove the fundamental concept before expanding on it later.

Approaching your first machine learning project this way might help to keep expectations in line with reality, minimize risks associated with experimentation, and provide air cover for you to adjust your scope as you discover limitations or adjacent opportunities during the process.

How does GlobalGiving tell whether it’s having an impact?

by Nick Hamlin, Data Scientist at GlobalGiving. This post was originally published here on October 1, 2018, titled “How Can We Tell if GlobalGiving is Making an Impact.” The full study can be found here.

Our team wanted to evaluate our impact, so we applied a new framework to find answers.


What We Tested

Every social organization, GlobalGiving included, needs to know if it’s having an impact on the communities it serves. For us, that means understanding the ways in which we are (or aren’t!) helping our nonprofit partners around the world improve their own effectiveness and capacity to create change, regardless of the type of work they do.

Why It Matters

Without this knowledge, social organizations can’t make informed decisions about the strategies to use to deliver their services. Unfortunately, this kind of rigorous impact evaluation is usually quite expensive and can take years to carry out. As a result, most organizations struggle to evaluate their impact.

We knew the challenges going into our own impact research would be substantial, but it was too important for us not to try.

The Big Question

Do organizations with access to GlobalGiving’s services improve their performance differently than organizations that don’t? Are there particular focus areas where GlobalGiving is having more of an impact than others?

Our Method

Ideally, we’d randomly assign certain organizations to receive the “treatment” of being part of GlobalGiving and then compare their performance with another randomly assigned control group. But, we can’t just tell random organizations that they aren’t allowed to be part of our community. So, instead we compared a treatment group—organizations that have completed the GlobalGiving vetting process and become full partners on the website—with a control group of organizations that have successfully passed the vetting process but haven’t joined the web community. Since we can’t choose these groups randomly, we had to ensure the organizations in each group are as similar as possible so that our results aren’t biased by underlying differences between the control and treatment groups.

To do this, we worked only with organizations based in India. We chose India because we have lots of relationships with organizations there, and we needed as large a sample size as possible to increase confidence that our conclusions are reliable. India is also well-suited for this study because it requires organizations to have special permission to receive funds from overseas under the Foreign Contribution Regulation Act (FCRA). Organizations must have strong operations in place to earn this permission. The fact that all participant organizations are established enough to earn both an FCRA certification and pass GlobalGiving’s own high vetting standards means that any differences in our results are unlikely to be caused by geographic or quality differences.

We also needed a way to measure nonprofit performance in a concrete way. For this, we used the “Organizational Performance Index” (OPI) framework created by Pact. The OPI provides a structured way to understand a nonprofit’s capacity along eight different categories, including its ability to deliver programs, the diversity of its funding sources, and its use of community feedback. The OPI scores organizations on a scale of 1 (lowest) to 4 (highest). With the help of a fantastic team of volunteers in India, we gathered two years of OPI data from both the treatment and control groups, then compared how their scores changed over time to get an initial indicator of GlobalGiving’s impact.
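
As a rough illustration of the comparison described above, the sketch below computes the average change in score for each group and the gap between the two. The scores are hypothetical, and this is a simplification for intuition only; the statistical methodology actually used is described in the full write-up.

```python
from statistics import mean

# Hypothetical (year 1, year 2) scores on the OPI's 1-4 scale for one category.
treatment = [(2, 3), (2, 2), (3, 4), (1, 2), (3, 3)]
control = [(2, 2), (3, 3), (2, 1), (1, 2), (3, 3)]


def mean_change(scores):
    """Average year-over-year change in score across a group of organizations."""
    return mean(after - before for before, after in scores)


treatment_change = mean_change(treatment)
control_change = mean_change(control)

# A positive gap suggests the treatment group improved more than the control group.
difference = treatment_change - control_change
print(round(difference, 2))  # 0.6
```

Comparing changes rather than raw scores means that groups starting at different levels can still be compared, although a gap like this says nothing on its own about statistical significance.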

The Results

The most notable result we found was that organizations that were part of GlobalGiving demonstrated significantly more participatory planning and decision-making processes (what we call “community leadership”), and improved their use of stakeholder feedback to inform their work, in comparison to control group organizations. We did not see a similar significant result in the other seven categories that the OPI tracks. The easiest way to see this result is to visualize how organizations’ scores shifted over time. The chart below shows differences in target population scores—Pact’s wording for “community leadership and feedback.”

Chart: Differences in Target Population Score Changes (control group on the left, treatment group on the right)

For example, look at the organizations that started out with a score of two in the control group on the left. Roughly one third of those increased their score to three, one third stayed the same, and one third had their scores drop to one. In contrast, in the treatment group on the right, nearly half the organizations increased their scores and about half stayed the same, while only a tiny fraction dropped. You can see a similar pattern across the two groups regardless of their starting score.

In contrast, here’s the same diagram for another OPI category where we didn’t see a statistically significant difference between the two groups. There’s not nearly as clear a pattern—both the treatment and control organizations change their scores about the same amount.

Chart: Differences in Delivery Score Changes

For more technical details about our research design process, our statistical methodology, and the conclusions we’ve drawn, please check out the full write-up of this work, which is available on the Social Science Research Network.

The Ultimate Outcome

GlobalGiving spends lots of time focusing on helping organizations use feedback to become more community-led, because we believe that’s what delivers greater impact.

Our initial finding—that our emphasis on feedback is having a measurable impact—is an encouraging sign.

On the other hand, we didn’t see that GlobalGiving was driving significant changes in any of the other seven OPI categories. Some of these categories, like adherence to national or international standards, aren’t areas where GlobalGiving focuses much. Others, like how well an organization learns over time, are closely related to what we do (Listen, Act, Learn. Repeat. is one of our core values). We’ll need to continue to explore why we’re not seeing results in these areas and, if necessary, make adjustments to our programs accordingly.

Make It Yours

Putting together an impact study, even a smaller one like this, is a major undertaking for any organization. Many organizations talk about applying a more scientific approach to their impact, but few nonprofits or funders take on the challenge of carrying out the research needed to do so. This study demonstrates how organizations can make meaningful progress towards rigorously measuring impact, even without a decade of work and an eight-figure budget.

If your organization is considering something similar, here are a few suggestions to keep in mind that we’ve learned as a result of this project:

1. If you can’t randomize, make sure you consider possible biases.

    •  Logistics, processes, and ethics are all reasons why an organization might not be able to randomly assign treatment groups. If that’s the case for you, think carefully about the rest of your design and how you’ll reduce the chance that a result you see can be attributed to a different cause.

2. Choose a measurement framework that aligns with your theory of change and is as precise as possible.

    •  We used the OPI because it was easy to understand, reliable, and well-accepted in the development sector. But, the OPI’s four-level scale made it difficult to make precise distinctions between organizations, and there were some categories that didn’t make sense in the context of how GlobalGiving works. These are areas we’ll look to improve in future versions of this work.

3. Get on the record. 

    • Creating a clear record of your study, both inside and outside your organization, is critical for avoiding “scope creep.” We used Git to keep track of all changes in our data, code, and written analysis, and shared our initial study design at the 2017 American Evaluation Association conference.

4. Enlist outside help. 

    •  This study would not have been possible without lots of extra help, from our volunteer team in India, to our friends at Pact, to the economists and data scientists who checked our math, particularly Alex Hughes at UC Berkeley and Ted Dunmire at Booz Allen Hamilton.

We’re pleased about what we’ve learned about GlobalGiving’s impact and where we can improve, and we can’t wait to build on this initial work in service of improved outcomes for our nonprofit partners worldwide.

Find the full study here.

Blockchain: the ultimate solution?

by Ricardo Santana, MERL Practitioner

I had the opportunity during MERL Tech London 2018 to attend a very interesting session to discuss blockchains and how they can be applied in the MERL space. This session was led by Valentine Gandhi, Founder of The Development CAFÉ, Zara Rahman, Research and Team Lead at The Engine Room, and Wayan Vota, Co-founder of Kurante.

The first part of the session was an introduction to blockchain, which is basically a distributed ledger system. Why is it an interesting solution? Because copies of the data are held on multiple, geographically distributed devices, which makes for a very robust and secure system. It is not possible to take a unilateral decision to scrap or eliminate data, because the change would show up against the distributed copies of the data chain. Is it possible to corrupt the system? Well, yes, but what makes it robust and secure is that for that to happen, every single person participating in the blockchain system must agree to do so.

That is the powerful innovation of the technology. It is somewhat similar to the torrent technology used to share files: it is very hard to control content when your file storage is not on a single server but spread across an enormous number of end-user terminals.
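
For readers who like to see the idea in code, here is a minimal sketch of why changing data in a chained ledger is detectable. It models a single copy of a hash chain only and assumes nothing about any particular blockchain platform; the distribution of copies and the agreement between participants are left out, and all names are illustrative.

```python
import hashlib
import json


def block_hash(index: int, data: str, prev_hash: str) -> str:
    """Hash a block's contents together with the hash of the block before it."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link records so that each block commits to everything that came before it."""
    chain, prev = [], "0" * 64  # placeholder "previous hash" for the first block
    for i, data in enumerate(records):
        h = block_hash(i, data, prev)
        chain.append({"index": i, "data": data, "prev": prev, "hash": h})
        prev = h
    return chain


def is_valid(chain) -> bool:
    """Recompute every hash and link; any edited block breaks the check."""
    prev = "0" * 64
    for block in chain:
        if block["prev"] != prev:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev"]):
            return False
        prev = block["hash"]
    return True


ledger = build_chain(["survey A received", "survey B received", "report filed"])
print(is_valid(ledger))  # True

ledger[1]["data"] = "survey B deleted"  # a unilateral change to one record
print(is_valid(ledger))  # False
```

In a real network, the other participants hold their own copies of the chain, so a change like this on one device would not match what everyone else has.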

What I want to share from this session, however, is not how the technology works! That information is readily available on the Internet and other sources.

What I really found interesting was the part of the session where professionals interested in blockchain shared our doubts and the questions that we would need to clarify in order to decide whether blockchain technology would be required or not.

Some of the most interesting shared doubts and concerns around this technology were:

What sources of training and other useful resources are available if you want to implement blockchain?

  • Say the organization or leadership team decides that a blockchain is required for the solution. I am pretty sure it is not hard to find information about blockchain on the Internet, but we all face the same problem: the enormous amount of information available makes it tricky to reach the holy grail that provides just enough information without losing hours to desktop research. It would be incredibly beneficial to have a suggested place where this info can be found, even more so if it were a specialized guide aimed at the MERL space.

What are the data space constraints?

  • I found this question very important. It is a key aspect of the design and scalability of the solution. I assume that it will not be a large amount of data, but I really don’t know. And maybe it is not a significant amount of information for a desktop or a laptop, but what if we are using cell phones as end terminals too? This needs to be addressed so the design is based on facts and not assumptions.

Use cases.

  • Again, there are probably a lot of them to be found all over the Internet, but they are hardly going to be insightful for a specific MERL approach. Is it possible to have a repository of relevant cases for the MERL space?

When is blockchain really required?

  • It would be really helpful to have a simple guide that helps any professional clarify whether the volume or importance of the information is worth the implementation of a Blockchain system or not.

Is there a right to be forgotten in Blockchain?

  • Recent events give a special relevance to this question. Blockchains are very powerful for achieving traceability, but what if I want my information to be eliminated because it is simply my right? This is an important aspect in technologies that have a distributed logic. How can we use the powerful advantages of blockchain while accommodating the individual right of every single person to take unilateral decisions on their private or personal information?

I am not an expert in the matter but I do recognize the importance of these questions and the hope is that the people able to address them can pick them up and provide useful answers and guidance to clarify some or all of them.

If you have answers to these questions, or more questions about blockchain and MERL, please add them in the comments!

If you’d like to be a part of discussions like this one, register to attend the next MERL Tech conference! MERL Tech Jozi is happening August 1-2, 2018 and we just opened up registration today! MERL Tech DC is coming up September 6-7. Today’s the last day to submit your session ideas, so hurry up and fill out the form if you have an idea to present or share!

 

 

MERL Tech London 2018 Agenda is out!

We’ve been working hard over the past several weeks to finish up the agenda for MERL Tech London 2018, and it’s now ready!

We’ve got workshops, panels, discussions, case studies, lightning talks, demos, community building, socializing, and an evening reception with a Fail Fest!

Topics range from mobile data collection, to organizational capacity, to learning and good practice for information systems, to data science approaches, to qualitative methods using mobile ethnography and video, to biometrics and blockchain, to data ethics and privacy and more.

You can search the agenda to find the topics, themes and tools that are most interesting, identify sessions that are most relevant to your organization’s size and approach, pick the session methodologies that you prefer (some of us like participatory and some of us like listening), and learn more about the different speakers and facilitators and their work.

Tickets are going fast, so be sure to snap yours up before it’s too late! (Register here!)

View the MERL Tech London schedule & directory.

 

DataDay TV: MERL Tech Edition

What data superpower would you ask for? How would you describe data to your grandparents? What’s the worst use of data you’ve come across? 

These are a few of the questions that TechChange’s DataDay TV Show tackles in its latest episode.

The DataDay Team (Nick Martin, Samhir Vasdev, and Priyanka Pathak) traveled to MERL Tech DC last September to ask attendees some tough data-related questions. They came away with insightful, unusual, and occasionally funny answers….

If you’re a fan of discussing data, technology and MERL, join us at MERL Tech London on March 19th and 20th. 

Tickets are going fast, so be sure to register soon if you’d like to attend!

If you want to take your learning to the next level with a full-blown course, TechChange has a great 2018 schedule, including topics like blockchain, AI, digital health, data visualization, e-learning, and more. Check out their course catalog here.

What about you, what data superpower would you ask for?

 

M&E software – 8 Tips on How to Talk to IT Folks

You want to take your M&E system one step further and introduce proper M&E software? That’s great, because software has the potential to make the monitoring process more efficient and transparent, reduce errors and deliver more accurate data. But how do you go about it? You have three options:

  1. You build your own system, for example in Microsoft Excel;
  2. You purchase an M&E software package off-the-shelf;
  3. You hire an IT consultant to set up a customized M&E system according to your organization’s specific requirements.

If options one and two do not work out for you, you can hire consultants to develop a solution for you. You will probably start a public tender to find the most suitable IT company to entrust with this task. While there are a lot of things to pay attention to when formulating the Terms of Reference (TOR), I would like to give you some tips specifically about the communication with the hired IT consultants. These insights come from years of experience of being on both sides: the party who wants a tool and needs to describe it to the implementing programmers, and the IT guy (or rather lady) who implements Excel and web-based database tools for M&E.

To be on the safe side, I recommend that you work with this assumption: IT consultants have no clue about M&E. There are few IT companies who come from the development sector, like energypedia consult does, and are familiar with M&E concepts such as indicators, logframes and impact chains. To still get what you need, you should pay attention to the following communication tips:

  1. Take your time explaining what you need: Writing TOR takes time – but it takes even longer and becomes more costly when you hire somebody for something that is not thought through. If you don’t know all the details right from the start, get some expert assistance in formulating terms – it’s worthwhile.
  2. Use graphs: Instead of using words to describe your monitoring logic and the system you need, it is much easier to make graphs to depict the structure, user groups, linking of information, flow of monitoring data etc.
  3. Give examples: When unsure about how to put a feature into words, send a link or a screenshot of the function that you might have come across elsewhere and wish to have in your tool.
  4. Explain concepts and terminology: Many results frameworks work with the terms “input” and “output”. Most IT guys, however, will not have equipment and finished schools in mind, but rather data flows that consist of inputs and outputs. Make sure you clarify this. Also, the term web-based or web monitoring itself is a source of misunderstanding. In the IT world, web monitoring refers to monitoring activity on the internet, for example website visits or monitoring a server. That is probably not what you want when building up an M&E system for, say, a good governance programme.
  5. Meet in person: In your budget calculation, allow for at least one workshop where you meet in person, for example a kick-off workshop in which you clarify your requirements. This is not only a possibility to ask each other questions, but also to get a feeling of the other party’s language and way of thinking.
  6. Maintain a dialogue: During the implementation phase, make sure to stay in regular touch with the programmers. Ask them to show you updates every once in a while to allow you to give feedback. When you detect that the programmers are heading in the wrong direction, you want to find out sooner rather than later.
  7. Document communication: When we implement web-based systems, we typically create a page within the web platform itself that outlines all the agreed steps. This list serves as a to-do list and an implementation protocol at the same time. It facilitates communication, particularly when multiple people are involved on both sides who are not always present in all meetings or phone calls.
  8. Be prepared for misunderstandings: They happen. It’s normal. Plan for some buffer days before launching the final tool.

In general, the implementation phase should allow for some flexibility. As both parties learn from each other during the process, you should not be afraid to adjust initial plans, because the final tool will benefit greatly from it (if the contract has some flexibility). Big customized IT projects take some time.

If you need more advice on this matter and some more insights on setting up IT-based M&E systems, please feel free to contact me any time! In the past we supported some clients by setting up a prototype for their web-based M&E system with our flexible WebMo approach. During the prototype process the client learnt a lot and afterwards it was quite easy for other developers to copy the prototype and migrate it to, for example, their Microsoft SharePoint environment (in case your IT guys don’t believe in Open Source or don’t want to host third-party software on their server).

Please leave a comment if you think that I have missed an important communication rule.

Good luck!

Qualitative Coding: From Low Tech to High Tech Options

by Daniel Ramirez-Raftree, MERL Tech volunteer

In their MERL Tech DC session on qualitative coding, Charles Guedenet and Anne Laesecke from IREX together with Danielle de Garcia of Social Impact offered an introduction to the qualitative coding process followed by a hands-on demonstration on using Excel and Dedoose for coding and analyzing text.

They began by defining content analysis as any effort to make sense of qualitative data that takes a volume of qualitative material and attempts to identify core consistencies and meanings. More concretely, it is a research method that uses a set of procedures to make valid inferences from text. They also shared their thoughts on what makes for a good qualitative coding method.

Their belief is that it should:

  • consider what is already known about the topic being explored
  • be logically grounded in this existing knowledge
  • use existing knowledge as a basis for looking for evidence in the text being analyzed

With this definition laid out, they moved to a discussion about the coding process where they elaborated on five general steps:

  1. develop codes and a codebook
  2. decide on a sampling plan
  3. code your data
  4. go back and do it again!
  5. test for reliability

Developing codes and a codebook is important for establishing consistency in the coding process, especially if there will be multiple coders working on the data. A good way to start developing these codes is to consider what is already known. For example, you can think about literature that exists on the subject you’re studying. Alternatively, you can simply turn to the research questions the project seeks to answer and use them as a guide for creating your codes. Beyond this, it is also useful to go through the content and think about what you notice as you read. Once a codebook is created, it will lend stability and some measure of objectivity to the project.

The next important issue is the question of sampling. When determining sample size, though a larger sample will yield more robust results, one must of course consider the practical constraints of time, cost and effort. Does the benefit of higher quality results justify the additional investment? Fortunately, the type of data will often inform sampling. For example, if there is a huge volume of data, it may be impossible to analyze it all, but it would be prudent to sample at least 30% of it. On the other hand, usually interview and focus group data will all be analyzed, because otherwise the effort of obtaining the data would have gone to waste.

Regarding sampling method, session leads highlighted two strategies that produce sound results. One is systematic random sampling and the other is quota sampling–a method employed to ensure that the proportions of demographic group data are fairly represented.
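As a rough illustration of these two strategies, here is a short Python sketch. The function names and the `group_of` parameter are invented for this example. The quota function also makes a simplifying assumption: it sets each group's quota in proportion to its share of the data and fills it by random draw, whereas quota sampling in practice often fills quotas with whichever cases are at hand.

```python
import random
from collections import defaultdict


def systematic_sample(items, sample_size, rng=None):
    """Pick a random starting point, then take every k-th item."""
    rng = rng or random.Random()
    if sample_size <= 0 or sample_size > len(items):
        raise ValueError("sample_size must be between 1 and len(items)")
    step = len(items) // sample_size  # the sampling interval k
    start = rng.randrange(step)       # random start within the first interval
    return [items[start + i * step] for i in range(sample_size)]


def quota_sample(items, group_of, sample_size, rng=None):
    """Sample each group in proportion to its share of the full data set.

    Because quotas are rounded, the total can differ slightly from
    sample_size when the proportions do not divide evenly.
    """
    rng = rng or random.Random()
    groups = defaultdict(list)
    for item in items:
        groups[group_of(item)].append(item)

    sample = []
    for members in groups.values():
        quota = round(sample_size * len(members) / len(items))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample
```

For instance, with 60 documents from one group and 40 from another, `quota_sample(..., sample_size=10)` would draw 6 and 4 respectively.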

Once these key decisions have been made, the actual coding can begin. Here, all coders should work from the same codebook and apply the codes to the same unit of analysis. Typical units of analysis are: single words, themes, sentences, paragraphs, and items (such as articles, images, books, or programs). Consistency is essential. A way to test the level of consistency is to have a 10% overlap in the content each coder analyzes and aim for 80% agreement between their coding of that content. If the coders are not applying the same codes to the same units, this could mean either that they are not trained properly or that the codebook needs to be altered.
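The agreement check described here can be sketched in a few lines of Python. This is only an illustration of simple percent agreement on the overlapping units. The function names are made up, and the 0.8 default mirrors the 80% target mentioned in the session.

```python
def percent_agreement(codes_a, codes_b):
    """Share of overlapping units to which both coders applied the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must have coded the same units")
    if not codes_a:
        raise ValueError("there must be at least one overlapping unit")
    matches = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    return matches / len(codes_a)


def meets_target(codes_a, codes_b, target=0.8):
    """True when agreement on the overlap reaches the chosen target."""
    return percent_agreement(codes_a, codes_b) >= target
```

If two coders agree on four of five overlapping paragraphs, `percent_agreement` returns 0.8 and the target is just met.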

In a similar vein, the final step in the coding process is to test for reliability. Challenges in producing stable and consistent results in coding could include: using a unit of analysis that is too large for a simple code to be reliably applied, coding themes or concepts that are ambiguous, and coding nonverbal items. For each of these, the central problem is that the units of analysis leave too much room for subjective interpretation that can introduce bias. Having a detailed codebook can help to mitigate this.

After giving an overview of the coding process, the session leads suggested a few possible strategies for data visualization. One is to use a word tree, which helps one look at the context in which a word appears. Another is a bubble chart, which is useful if one has descriptive data and demographic information. Thirdly, correlation maps are good for showing what sorts of relationships exist among the data. The leads suggested visiting the website stephanieevergreen.com/blog for more ideas about data visualization.

Finally, the leads covered low-tech and high-tech options for coding. On the low-tech end of the spectrum, paper and pen get the job done. They are useful when there are few data sources to analyze, when the coding is simple, and when there is limited tech literacy among the coders. Next up the scale is Excel, which works when there are few data sources and when the coders are familiar with Excel. Then the session leads closed their presentation with a demonstration of Dedoose, which is a qualitative coding tool with advanced capabilities like the capacity to code audio and video files and specialized visualization tools. In addition to Dedoose, the presenters mentioned Nvivo and Atlas as other available qualitative coding software.

Despite the range of qualitative content available for analysis, there are a few core principles that can help ensure that it is analyzed well; these include consistency and a disciplined methodology. And if qualitative coding will be an ongoing part of your organization’s operations, there are several options for specialized software that are available for you to explore. [Click here for links and additional resources from the session.]

Data quality in the age of lean data

by Daniel Ramirez-Raftree, MERL Tech support team.

Evolving data collection methods call for evolving quality assurance methods. In their session titled Data Quality in the Age of Lean Data, Sam Schueth of Intermedia, Woubedle Alemayehu of Oxford Policy Management, Julie Peachey of the Progress out of Poverty Index, and Christina Villella of MEASURE Evaluation discussed problems, solutions, and ethics related to digital data collection methods. [Bios and background materials here]

Sam opened the conversation by comparing the quality assurance and control challenges in paper assisted personal interviewing (PAPI) to those in digital assisted personal interviewing (DAPI). Across both methods, the fundamental problem is that the data that is delivered is a black box. It comes in, it’s turned into numbers and it’s disseminated, but in this process alone there is no easily apparent information about what actually happened on the ground.

During the age of PAPI, this was dealt with by sending independent quality control teams to the field to review the paper questionnaire that was administered and perform spot checks by visiting random homes to validate data accuracy. Under DAPI, the quality control process becomes remote. Survey administrators can now schedule survey sessions to be recorded automatically and without the interviewer’s knowledge, thus effectively gathering a random sample of interviews that can give them a sense of how well the sessions were conducted. Additionally, it is now possible to use GPS to track the interviewers’ movements and verify the range of households visited. The key point here is that with some creativity, new technological capacities can be used to ensure higher data quality.

Woubedle presented next and elaborated on the theme of quality control for DAPI. She brought up the point that data quality checks can be automated, but that this requires pre-survey-implementation decisions about what indicators to monitor and how to manage the data. The amount of work that is put into programming this upfront design has a direct bearing on the ultimate data quality.

One useful tool is a progress indicator. Here, one collects information on trends such as the number of surveys attempted compared to those completed. Processing this data could lead to further questions about whether there is a pattern in the populations that did or did not complete the survey, thus alerting researchers to potential bias. Additionally, one can calculate the average time taken to complete a survey and use it to identify outliers that took too little or too much time to finish. Another good practice is to embed consistency checks in the survey itself; for example, making certain questions required or including two questions that, if answered in a particular way, would be logically contradictory, thus signaling a problem in either the question design or the survey responses. One more practice could be to apply constraints to the survey, depending on the households one is working with.
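A few of these automated checks could look something like the Python sketch below. It is only an illustration: the field names (`household_size`, `children_in_school`), the two-standard-deviation cutoff, and the function names are assumptions made for this example, not anything the presenters specified.

```python
from statistics import mean, stdev


def completion_rate(attempted, completed):
    """Progress indicator: share of attempted surveys that were completed."""
    if attempted == 0:
        raise ValueError("no surveys attempted")
    return completed / attempted


def duration_outliers(durations, z_cutoff=2.0):
    """Indices of interviews that ran far shorter or longer than average.

    z_cutoff is an arbitrary choice here; a survey team would set its own.
    """
    if len(durations) < 2:
        return []
    average = mean(durations)
    spread = stdev(durations)
    if spread == 0:
        return []
    return [
        i for i, d in enumerate(durations)
        if abs(d - average) / spread > z_cutoff
    ]


def is_contradictory(response):
    """Hypothetical consistency check between two related questions."""
    # More children in school than people in the household cannot be right.
    return response["children_in_school"] > response["household_size"]
```

Flagged interviews or responses would then be reviewed by a person; the checks only point to where something may have gone wrong.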

After this discussion, Julie spoke about research that was done to assess the quality of different methods for measuring the Progress out of Poverty Index (PPI). She began by explaining that the PPI is a household level poverty measurement tool unique to each country. To create it, the answers to 10 questions about a household’s characteristics and asset ownership are scored to compute the likelihood that the household is living below the poverty line. It is a simple, yet effective method to evaluate household level poverty. The research project Julie described set out to determine if the process of collecting data to create the PPI could be made less expensive by using SMS, IVR or phone calls.

Grameen Foundation conducted the study and tested four survey methods for gathering data: 1) in-person and at home, 2) in-person and away from home, 3) in-person and over the phone, and 4) automated and over the phone. Further, it randomized key aspects of the study, including the interview method and the enumerator.

Ultimately, Grameen Foundation determined that the interview method does affect completion rates, responses to questions, and the resulting estimated poverty rates. However, the differences in estimated poverty rates were likely not due to the method itself, but rather to completion rates (which were affected by the method). Thus, as long as completion rates don’t differ significantly, neither will the results. Given that the in-person at home and in-person away from home surveys had similar completion rates (84% and 91% respectively), either could be feasibly used with little deviation in output. On the other hand, in-person over the phone surveys had a 60% completion rate and automated over the phone surveys had a 12% completion rate, making both methods fairly problematic. And with this understanding, developers of the PPI have an evidence-based sense of the quality of their data.

This case study illustrates the possibility of testing data quality before any changes are made to collection methods, which is a powerful strategy for minimizing the use of low quality data.

Christina closed the session with a presentation on ethics in data collection. She spoke about digital health data ethics in particular, which is the intersection of public health ethics, clinical ethics, and information systems security. She grounded her discussion in MEASURE Evaluation’s experience thinking through ethical problems, which include: the vulnerability of devices where data is collected and stored, the privacy and confidentiality of the data on these devices, the effect of interoperability on privacy, data loss if the device is damaged, and the possibility of wastefully collecting unnecessary data.

To explore these issues, MEASURE conducted a landscape assessment in Kenya and Tanzania and analyzed peer reviewed research to identify key themes for ethics. Five themes emerged: 1) legal frameworks and the need for laws, 2) institutional structures to oversee implementation and enforcement, 3) information systems security knowledge (especially for countries that may not have the expertise), 4) knowledge of the context and users (are clients comfortable with their data being used?), and 5) incorporating tools and standard operating procedures.

Based on this framework, MEASURE has made progress towards rolling out tools that can help institute a stronger ethics infrastructure. They’ve been developing guidelines that countries can use to develop policies, building health informatics capacity through a university course, and working with countries to strengthen their health information systems governance structures.

Finally, Christina explained her take on how ethics are related to data quality. In her view, it comes down to trust. If a device is lost, this may lead to incomplete data. If the clients are mistrustful, this could lead to inaccurate data. If a health worker is unable to check or clean data, this could create a lack of confidence. Each of these risks can lead to the erosion of data integrity.

Register for MERL Tech London, March 19-20th 2018! Session ideas due November 10th.

MERL Tech and the World of ICT Social Entrepreneurs (WISE)

by Dale Hill, an economist/evaluator with over 35 years’ experience in development and humanitarian work. Dale led the session on “The growing world of ICT Social Entrepreneurs (WISE): Is social Impact significant?” at MERL Tech DC 2018.

Roger Nathanial Ashby of OpenWise and Christopher Robert of Dobility share experiences at MERL Tech.

What happens when evaluators trying to build bridges with new private sector actors meet real social entrepreneurs? A new appreciation for the dynamic “World of ICT Social Entrepreneurs (WISE)” and the challenges they face in marketing, pricing, and financing (not to mention measurement of social impact).

During this MERL Tech session on WISE, Dale Hill, evaluation consultant, presented grant funded research on measurement of social impact of social entrepreneurship ventures (SEVs) from three perspectives. She then invited five ICT company CEOs to comment.

The three perspectives are:

  • the public: How to hold companies accountable, particularly if they have chosen to be legal or certified “benefit corporations”?
  • the social entrepreneurs, who are plenty occupied trying to reach financial sustainability or profit goals, while also serving the public good; and
  • evaluators, who see the important influence of these new actors, but know their professional tools need adaptation to capture their impact.

Dale’s introduction covered overlapping definitions of various categories of SEVs, including legally defined “benefit corporations”, and “B Corps”, which are intertwined with the options of certification available to social entrepreneurs. The “new middle” of SEVs sits on a spectrum between for-profit companies on one end and not-for-profit organizations on the other. Various types of funders, including social impact investors, new certification agencies, and monitoring and evaluation (M&E) professionals, are now interested in measuring the growing social impact of these enterprises. A show of hands revealed that representatives of most of these types of actors were present at the session.

The five social entrepreneur panelists all had ICT businesses with global reach, but they varied in legal and certification status and the number of years operating (1 to 11). All aimed to deploy new technologies to non-profit organizations or social sector agencies on high value, low price terms. Some had worked in non-profits in the past and hoped that venture capital rather than grant funding would prove easier to obtain. Others had worked for Government and observed the need for customized solutions, which required market incentives to fully develop.

The evaluator and CEO panelists’ identification of challenges converged in some cases:

  • maintaining affordability and quality when using market pricing
  • obtaining venture capital or other financing
  • worry over “mission drift” – if financial sustainability imperatives or shareholder profit maximization preferences prevail over founders’ social impact goals; and
  • the still present digital divide, when serving global customers (insufficient bandwidth, affordability issues, limited small business capital in some client countries).

New issues raised by the CEOs (and some social entrepreneurs in the audience) included:

  • the need to provide incentives to customers to use quality assurance or security features of software, to avoid falling short of achieving the SEV’s “public good” goals;
  • the possibility of hostile takeover, given high value of technological innovations;
  • the fact that mention of a “social impact goal” was a red flag to some funders who then went elsewhere to seek profit maximization.

There was also a rich discussion on the benefits and costs of obtaining certification: it was a useful “branding and market signal” to some consumers, but a negative one to some funders; also, it posed an added burden on managers to document and report social impact, sometimes according to guidelines not in line with their preferences.

Surprises?

a) Despite the “hype”, social impact investment funding proved elusive to the panelists. Options for them included: sliding scale pricing; establishment of a complementary for-profit arm; or debt financing;

b) Many firms were not yet implementing planned monitoring and evaluation (M&E) programs, despite M&E being one of their service offerings; and

c) The legislation on reporting social impact of benefit corporations among the 31 states varies considerably, and the degree of enforcement is not clear.

A conclusion for evaluators: Social entrepreneurs’ use of market solutions indeed provides an evolving, dynamic environment which poses more complex challenges for measuring social impact, and requires new criteria and tools, ideally timed with an understanding of market ups and downs, and developed with full participation of the business managers.