
Evaluating ICT4D projects against the Digital Principles

By Laura Walker McDonald. This post was originally published on the Digital Impact Alliance’s Blog on March 29, 2018.

As I have written about elsewhere, we need more evidence of what works and what doesn’t in the ICT4D and tech for social change spaces – and we need to hold ourselves to account more thoroughly and share what we know so that all of our work improves. We should be examining how well a particular channel, tool or platform works in a given scenario or domain; how it contributes to development goals in combination with other channels and tools; how the team selected and deployed it; whether it is a better choice than not using technology or using a different sort of technology; and whether or not it is sustainable.

At SIMLab, we developed our Framework for Monitoring and Evaluation of Technology in Social Change projects to help implementers to better measure the impact of their work. It offers resources towards a minimum standard of best practice which implementers can use or work toward, including on how to design and conduct evaluations. With the support of the Digital Impact Alliance (DIAL), the resource is now finalized and we have added new evaluation criteria based on the Principles for Digital Development.

Last week at MERL Tech London, DIAL formally launched the resource, sharing a two-page summary at the event and engaging attendees in a conversation about how it could be used. There, we joined over 100 organizations to discuss monitoring, evaluation, research and learning related to technology used for social good.

Why evaluate?

Evaluations provide snapshots of the ongoing activity and the progress of a project at a specific point in time, based on systematic and objective review against certain criteria. They may inform future funding and program design, help adjust current program design, or gather evidence to establish whether a particular approach is useful. They can be used to examine how, and how far, technology contributes to wider programmatic goals. If set up well, your program should already have evaluation criteria and research questions defined, well before it’s time to commission the evaluation.

Evaluation criteria provide a useful frame for an evaluation, bringing in an external logic that might go beyond the questions that implementers and their management have about the project (such as ‘did our partnerships on the ground work effectively?’ or ‘how did this specific event in the host country affect operations?’) to incorporate policy and best practice questions about, for example, protection of target populations, risk management, and sustainability. The criteria for an evaluation could be any set of questions that draw on an organization’s mission, values, principles for action; industry standards or other best practice guidance; or other thoughtful ideas of what ‘good’ looks like for that project or organization. Efforts like the Principles for Digital Development can set useful standards for good practice, and could be used as evaluation criteria.

Evaluating our work, and sharing learning, is radical – and critically important

While the potential for technology to improve the lives of vulnerable people around the world is clear, it is also evident that these improvements are not keeping pace with the advances in the sector. Understanding why requires looking critically at our work and holding ourselves to account. There is still insufficient evidence of the contribution technology makes to social change work. What evidence there is often is not shared or the analysis doesn’t get to the core issues. Even more important, the learnings from what has not worked and why have not been documented and absorbed.

Technology-enabled interventions succeed or fail based on their sustainability, business models, data practices, choice of communications channel and technology platform, organizational change, risk models, and user support – among many other factors. We need to build and examine evidence that considers these issues and that tells us what has been successful, what has failed, and why. Holding ourselves to account against standards like the Principles is a great way to improve our practice and honor our commitment to the people we seek to help through our work.

Using the Digital Principles as evaluation criteria

The Principles for Digital Development are a set of living guidance intended to help practitioners succeed in applying technology to development programs. They were developed, based on some pre-existing frameworks, by a working group of practitioners and are now hosted by the Digital Impact Alliance.

These nine principles could also form a useful set of evaluation criteria, not unlike the OECD evaluation criteria or the Sphere standards. The Principles overlap, so data can be used to examine more than one criterion, and not every evaluation would need to consider all of the Digital Principles.

Below are some examples of Digital Principles and sample questions that could initiate, or contribute to, an evaluation.

Design with the User: Great projects are designed with input from the stakeholders and users who are central to the intended change. How far did the team design the project with its users, based on their current tools, workflows, needs and habits, and work from clear theories of change and adaptive processes?

Understand the Existing Ecosystem: Great projects and programs are built, managed, and owned with consideration given to the local ecosystem. How far did the project work to understand the local, technology and broader global ecosystem in which the project is situated? Did it build on existing projects and platforms rather than duplicating effort? Did the project work sensitively within its ecosystem, being conscious of its potential influence and sharing information and learning?

Build for Sustainability: Great projects factor in the physical, human, and financial resources that will be necessary for long-term sustainability. How far did the project: 1) think through the business model, ensuring that the value for money and incentives are in place not only during the funded period but afterwards, and 2) ensure that long-term financial investments in critical elements like system maintenance and support, capacity building, and monitoring and evaluation are in place? Did the team consider whether there was an appropriate local partner to work through, hand over to, or support the development of, such as a local business or government department?

Be Data Driven: Great projects fully leverage data, where appropriate, to support project planning and decision-making. How far did the project use real-time data to make decisions, use open data standards wherever possible, and collect and use data responsibly according to international norms and standards?

Use Open Standards, Open Data, Open Source, and Open Innovation: Great projects make appropriate choices, based on the circumstances and the sensitivity of their project and its data, about how far to use open standards, open the project’s data, use open source tools and share new innovations openly. How far did the project: 1) take an informed and thoughtful approach to openness, thinking it through in the context of the theory of change and considering risk and reward, 2) communicate about what being open means for the project, and 3) use and manage data responsibly according to international norms and standards?

For fuller guidance, see the complete Framework for Monitoring and Evaluating Technology, and the more nuanced and in-depth guidance on the Principles, available on the Digital Principles website.

Digital Data Collection and the Maturing of a MERL Technology

by Christopher Robert, CEO of Dobility (SurveyCTO). This post was originally published on March 15, 2018, on the SurveyCTO blog.

Digital data collection: stakeholders and complex relationships

Needs, markets, and innovation combine to produce technological change. This is as true in the international development sector as it is anywhere else. And within that sector, it’s as true in the broad category of MERL (monitoring and evaluation, research, and learning) technologies as it is in the narrower sub-category of digital data collection technologies. Here, I’ll consider the recent history of digital data collection technology as an example of MERL technology maturation – and as an example, more broadly, of the importance of market structure in shaping the evolution of a technology.

My basic observation is that, as digital data collection technology has matured, the same stakeholders have been involved – but the market structure has changed their relative power and influence over time. And it has been these very changes in power and influence that have changed the cost and nature of the technology itself.

First, when it comes to digital data collection in the development context, who are the stakeholders?

  • Donors. These are the primary actors who fund development work, evaluation of development policies and programs, and related research. There are mega-actors like USAID, Gates, and the UN agencies, but also many other charities, philanthropies, and public or nonprofit actors, from Catholic Charities to the U.S. Centers for Disease Control and Prevention.
  • Developers. These are the designers and software engineers involved in producing technology in the space. Some are students or university faculty, some are consultants, many work full-time for nonprofits or businesses in the space. (While some work on open-source initiatives in a voluntary capacity, that seems quite uncommon in practice. The vast majority of developers working on open-source projects in the space get paid for that work.)
  • Consultants and consulting agencies. These are the technologists and other specialists who help research and program teams use technology in the space. For example, they might help to set up servers and program digital survey instruments.
  • Researchers. These are the folks who do the more rigorous research or impact evaluations, generally applying social-science training in public health, economics, agriculture, or other related fields.
  • M&E professionals. These are the people responsible for program monitoring and evaluation. They are most often part of an implementing program team, but it’s also not uncommon to share more centralized (and specialized) M&E teams across programs or conduct outside evaluations that more fully separate some M&E activities from the implementing program team.
  • IT professionals. These are the people responsible for information technology within those organizations implementing international development programs and/or carrying out MERL activities.
  • Program beneficiaries. These are the end beneficiaries meant to be aided by international development policies and programs. The vast majority of MERL activities are ultimately concerned with learning about these beneficiaries.

Digital data collection stakeholders

These different stakeholders have different needs and preferences, and the market for digital data collection technologies has changed over time – privileging different stakeholders in different ways. Two distinct stages seem clear, and a third is coming into focus:

  1. The early days of donor-driven pilots and open source. These were the days of one-offs, building-your-own, and “pilotitis,” where donors and developers were effectively in charge and there was a costly additional layer of technical consultants between the donors/developers and the researchers and M&E professionals who had actual needs in the field. Costs were high, and some combination of donor and developer preferences reigned supreme.
  2. Intensifying competition in program-adopted professional products. Over time, professional products emerged that began to directly market to – and serve – researchers and M&E professionals. Costs fell with economies of scale, and the preferences of actual users in the field suddenly started to matter in a more direct, tangible, and meaningful way.
  3. Intensifying competition in IT-adopted professional products. Now that use of affordable, accessible, and effective data-collection technology has become ubiquitous, it’s natural for IT organizations to begin viewing it as a kind of core organizational infrastructure, to be adopted, supported, and managed by IT. This means that IT’s particular preferences and needs – like scale, standardization, integration, and compliance – start to become more central, and costs unfortunately rise.

While I still consider us to be in the glory days of the middle stage, where costs are low and end-users matter most, there are still plenty of projects and organizations living in that first stage of more costly pilots, open source projects, and one-offs. And I think that the writing’s very much on the wall when it comes to our progression toward the third stage, where IT comes to drive the space, innovation slows, and end-user needs are no longer dominant.

Full disclosure: I myself have long been a proponent of the middle phase, and I am proud that my social enterprise has been able to help graduate thousands of users from that costly first phase. So my enthusiasm for the middle phase began many years ago and in fact helped to launch Dobility.

THE EARLY DAYS OF DONOR-DRIVEN PILOTS AND OPEN SOURCE

Digital data collection stage 1 (the early days)

In the beginning, there were pioneering developers, patient donors, and program or research teams all willing to take risks and invest in a better way to collect data from the field. They took cutting-edge technologies and found ways to fit them into some of the world’s most difficult, least-cutting-edge settings.

In these early days, it mattered a lot what could excite donors enough to open their checkbooks – and what would keep them excited enough to keep the checks coming. So the vital need for large and ongoing capital injections gave donors a lot of influence over what got done.

Developers also had a lot of sway. Donors couldn’t do anything without them, and they also didn’t really know how to actively manage them. If a developer said “no, that would be too hard or expensive” or even “that wouldn’t work,” what could the donor really say or do? They could cut off funding, but that kind of leverage only worked for the big stuff, the major milestones and the primary objectives. For that stuff, donors were definitely in charge. But for the hundreds or thousands of day-to-day decisions that go into any technology solution, it was the developers effectively in charge.

Actual end-users in the field – the researchers and M&E professionals who were piloting or even trying to use these solutions – might have had some solid ideas about how to guide the technology development, but they had essentially no levers of control. In practice, the solutions being built by the developers were often so technically-complex to configure and use that there was an additional layer of consultants (technical specialists) sitting between the developers and the end-users. But even if there wasn’t, the developers’ inevitable “no, sorry, that’s not feasible,” “we can’t realistically fit that into this release,” or simple silence was typically the end of the story for users in the field. What could they do?

Unfortunately, without meaning any harm, most developers react by pushing back on whatever is contrary to their own preferences (I say this as a lifelong developer myself). Something might seem like a hassle, or architecturally unclean, and so a developer will push back, say it’s a bad idea, drag their heels, even play out the clock. In the past five years of Dobility, there have been hundreds of cases where a developer has said something to the effect of “no, that’s too hard” or “that’s a bad idea” to things that have turned out to (a) take as little as an hour to actually complete and (b) provide massive amounts of benefit to end-users. There’s absolutely no malice involved, it’s just the way most of them/us are.

This stage lasted a long time – too long, in my view! – and an entire industry of technical consultants and paid open-source contributors grew up around an approach to digital data collection that didn’t quite embrace economies of scale and never quite privileged the needs or preferences of actual users in the field. Costs were high and complaints about “pilotitis” grew louder.

INTENSIFYING COMPETITION IN PROGRAM-ADOPTED PROFESSIONAL PRODUCTS

Digital data collection stage 2 (the glory days)

But ultimately, the protagonists of the early days succeeded in establishing and honing the core technologies, and in the process they helped to reveal just how much was common across projects of different kinds, even across sectors. Some of those protagonists also had the foresight and courage to release their technologies with the kinds of permissive open-source licenses that would allow professionalization and experimentation in service and support models. A new breed of professional products directly serving research, program, and M&E teams was born – in no small part out of a single, tremendously-successful open-source project, Open Data Kit (ODK).

These products tended to be sold directly to end-users, and were increasingly intended for those end-users to be able to use themselves, without the help of technical staff or consultants. For traditionalists of the first stage, this was a kind of heresy: it was considered gauche at best and morally wrong at worst to charge money for technology, and it was seen as some combination of impossible and naive to think that end-users could effectively deploy and manage these technologies without technical assistance.

In fact, the new class of professional products were not designed to be used entirely without assistance. But they were designed to require as little assistance as possible, and the assistance came with the product instead of being provided by a separate (and separately-compensated) internal or external team.

A particularly successful breed of products, like SurveyCTO, came to use a “Software as a Service” (SaaS) model that streamlined both product delivery and support, ramping up economies of scale and driving down costs in the process. When such products offered technical support free of charge as part of the purchase or subscription price, there was a built-in incentive to improve the product: since tech support was so costly to deliver, improving the product so that it required less support became one of the strongest incentives driving product development. Those who adopted the SaaS model not only had to earn every dollar of revenue from end-users, but they had to keep earning that revenue month in, month out, year in, year out, in order to retain business and therefore the revenue needed to pay the bills. (Read about other SaaS benefits for M&E in this recent DevResults post.)

It would be difficult to overstate the importance of these incentives to improve the product and earn revenue from end-users. They are nothing short of transformative. Particularly once there is active competition among vendors, users are squarely in charge. They control the money, their decisions make or break vendors, and so their preferences and needs are finally at the center.

Now, in addition to the “it’s heresy to charge money or think that end-users can wield this kind of technology” complaints that used to be more common, there started to be a different kind of complaint: there are too many solutions! It’s too overwhelming, how many digital data collection solutions there are now. Some go so far as to decry the duplication of effort, to claim that the free market is inefficient or failing; they suggest that donors, consultants, or experts be put back in charge of resource allocation, to re-impose some semblance of sanity to the space.

But meanwhile, we’ve experienced a kind of golden age in terms of who can afford digital data collection technology, who can wield it effectively, and in what kinds of settings. There are a dizzying number of solutions – but most of them cater to a particular type of need, or have optimized their business model in a particular sort of way. Some, like us, rely nearly 100% on subscription revenues; others fund themselves primarily from service provision; others are trying interesting ways to cross-subsidize from bigger, richer users so that they can offer free or low-cost options to smaller, poorer ones. We’ve overcome pilotitis, economies of scale are finally kicking in, and I think that the social benefits have been tremendous.

INTENSIFYING COMPETITION IN IT-ADOPTED PROFESSIONAL PRODUCTS

Digital data collection stage 3 (the coming days)

It was the success of the first stage that laid the foundation for the second stage, and so too it has been the success of the second stage that has laid the foundation for the third: precisely because digital data collection technology has become so affordable, accessible, and ubiquitous, organizations are increasingly thinking that it should be IT departments that procure and manage that technology.

Part of the motivation is the very proliferation of options that I mentioned above. While economics and the historical success of capitalism have taught us that a marketplace thriving with competition is most often a very good thing, it’s less clear that a wide variety of options is good within any single organization. At the very least, there are very good reasons to want to standardize some software and processes, so that different people and teams can more effortlessly share knowledge and collaborate, and so that there can be some economies of scale in training, support, and compliance.

Imagine if every team used its own product and file format for writing documents, for example. It would be a total disaster! The frictions across and between teams would be enormous. And as data becomes more and more core to the operations of more organizations – the way that digital documents became core many years ago – it makes sense to want to standardize and scale data systems, to streamline integrations, just for efficiency purposes.

Growing compliance needs only up the ante. The arrival of the EU’s General Data Protection Regulation (GDPR) this year, for example, raises the stakes for EU-based (or even EU-touching) organizations considerably, imposing stiff new data privacy requirements and steep penalties for violations. Coming into compliance with GDPR and other data-security regulations will be effectively impossible if IT can’t play a more active role in the procurement, configuration, and ongoing management of data systems; and it will be impractical for IT to play such a role for a vast array of constantly-shifting technologies. After all, IT will require some degree of stability and scale.

But if IT takes over digital data collection technology, what changes? Does the golden age come to an end?

Potentially. And there are certainly very good reasons to worry.

First, changing who controls the dollars – who’s in charge of procurement – threatens to entirely up-end the current regime, where end-users are directly in charge and their needs and preferences are catered to by a growing body of vendors eager to earn their business.

It starts with the procurement process itself. When IT is in charge, procurement processes are long, intensive, and tend to result in a “winner take all” contract. After all, it makes sense that IT departments would want to take their time and choose carefully; they tend to be choosing solutions for the organization as a whole (or at least for some large class of users within the organization), and they most often intend to choose a solution, invest heavily in it, and have it work for as long as possible.

This very natural and appropriate method that IT uses to procure is radically different from the method used by research, program, and M&E teams. And it creates a radically different dynamic for vendors.

Vendors first have to buy into the idea of investing heavily in these procurement processes – which some may simply choose not to do. Then they have to ask themselves, “what do these IT folks care most about?” In order to win these procurements, they need to understand the core concerns driving the purchasing decision. As in the old saying “nobody ever got fired for choosing IBM,” safety, stability, and reputation are likely to be very important. Compliance issues are likely to matter a lot too, including the vendor’s established ability to meet new and evolving standards. Integrations with corporate systems are likely to count for a lot too (e.g., integrating with internal data and identity-management systems).

Does it still matter how well the vendor meets the needs of end-users within the organization? Of course. But note the very important shift in the dynamic: vendors now have to get the IT folks to “yes” and so would be quite right to prioritize meeting their particular needs. Nobody will disagree that end-users ultimately matter, but meanwhile the focus will be on the decision-makers. The vendors that meet the decision-makers’ needs will live, the others will die. That’s simply one aspect of how a free market works.

Note also the subtle change in dynamic once a vendor wins a contract: the SaaS model where vendors had to re-earn every customer’s revenue month in, month out, is largely gone now. Even if the contract is formally structured as a subscription or has lots of exit options, the IT model for technology adoption is inherently stickier. There is a lot more lock-in in practice. Solutions are adopted, they’re invested in at large scale, and nobody wants to walk away from that investment. Innovation can easily slow, and nobody wants to repeat the pain of procurement and adoption in order to switch solutions.

And speaking of the pain of the procurement process: costs have been rising. After all, the procurement process itself is extremely costly to the vendor – especially when it loses, but even when it wins. So that’s got to get priced in somewhere. And then all of the compliance requirements, all of the integrations with corporate systems, all of that stuff’s really expensive too. What had been an inexpensive, flexible, off-the-shelf product can easily become far more expensive and far less flexible as it works itself through IT and compliance processes.

What had started out on a very positive note (“let’s standardize and scale, and comply with evolving data regulations”) has turned in a decidedly dystopian direction. It’s sounding pretty bad now, and you wouldn’t be wrong to think “wait, is this why a bunch of the products I use for work are so much more frustrating than the products I use as a consumer?” or “if Microsoft had to re-earn every user’s revenue for Excel, every month, how much better would it be?”

While I don’t think there’s anything wrong with the instinct for IT to take increasing control over digital data collection technologies, I do think that there’s plenty of reason to worry. There’s considerable risk that we lose the deep user orientation that has just been picking up momentum in the space.

WHERE WE’RE HEADED: STRIKING A BALANCE

Digital data collection stage 4 (finding a balance?)

If we don’t want to lose the benefits of a deep user orientation in this particular technology space, we will need to work pretty hard – and be awfully clever – to avoid losing them. People will say “oh, but IT just needs to consult research, program, and M&E teams, include them in the process,” but that’s hogwash. Or rather, it’s woefully inadequate. The natural power of those controlling resources to bend the world to their preferences and needs is just too strong for mere consultation or inclusion to overcome.

And the thing is: what IT wants and needs is good. So the solution isn’t just “let’s not let them anywhere near this, let’s keep the end-users in charge.” No, that approach collapses under its own weight eventually, and certainly it can’t meet rising compliance requirements. It has its own weaknesses and inefficiencies.

What we need is an approach – a market structure – that allows the needs of IT and the needs of end-users both to matter to appropriate degrees.

With SurveyCTO, we’re currently in an interesting place: we’re becoming split between serving end-users and serving IT organizations. And I suppose as long as we’re split, with large parts of our revenue coming from each type of decision-maker, we remain incentivized to keep meeting everybody’s needs. But I see trouble on the horizon: the IT organizations can pay more, and more organizations are shifting in that direction… so once a large-enough proportion of our revenue starts coming from big, winner-take-all IT contracts, I fear that our incentives will be forever changed. In the language of economics, I think that we’re currently living in an unstable equilibrium. And I really want the next equilibrium to serve end-users as well as the last one!

M&E Squared: Evaluating M&E Technologies

by Roger Nathanial Ashby, Co-Founder & Principal Consultant, OpenWise

The universe of MERL Tech solutions has grown exponentially. In 2008, monitoring and evaluation tech within global development was mostly confined to mobile data collection tools like Open Data Kit (ODK) and Excel spreadsheets used to analyze and visualize survey data. In the intervening decade, a myriad of tools, companies and NGOs have been created to advance the efficiency and effectiveness of monitoring, evaluation, research and learning (MERL) through the use of technology. Whether it’s M&E platforms or suites, satellite imagery, remote sensors, or chatbots, new innovations are being deployed every day in the field.

However, how do we evaluate the impact when MERL Tech is the intervention itself? That was the question and task put to participants of the “M&E Squared” workshop at MERL Tech 2017.

Workshop participants were separated into three groups that were each given a case study to discuss and analyze. One group was given a case about improving the learning efficiency of health workers in Liberia through the mHero Health Information System (HIS). The system was deployed as a possible remedy to some of the information and communication challenges identified during the 2014 West African Ebola outbreak. A second group was given a case about the use of RapidPro to remind women to attend antenatal care (ANC) visits for preventive malaria medicine in Guinea. The goal of the USAID StopPalu project was to improve the health of infants by increasing the percentage of women attending ANC visits. The final group was given a case about using remote imagery to assist East African pastoralists. The Satellite Assisted Pastoral Resource Management System (SAPARM) informs pastoralists of vegetation conditions through remote sensing imagery so they can make better decisions about migrating their livestock.

After familiarizing ourselves with the particulars of the case studies, each group pondered a series of questions and then presented its findings to all participants. The questions under discussion included:

  1. How would you assess your MERL Tech’s relevance?
  2. How would you evaluate the effectiveness of your MERL Tech?
  3. How would you measure efficiency?
  4. How would you assess sustainability?

Each group came up with some innovative answers to the questions posed and our facilitators and session leads (Alexandra Robinson & Sutyajeet Soneja from USAID and Molly Chen from RTI) will soon synthesize the workshop findings and notes into a concise written brief for the MERL Tech community.

Before the workshop closed, we were all introduced to the great work done by SIMLab (Social Impact Lab) in this area through their SIMLab Monitoring and Evaluation Framework. The framework identifies key criteria for evaluating the use of technology in M&E, including:

  1. Relevance – The extent to which the technology choice is appropriately suited to the priorities and capacities of the context of the target group or organization.
  2. Effectiveness – A measure of the extent to which an information and communication channel, technology tool, technology platform, or a combination of these attains its objectives.
  3. Efficiency – Measure of the outputs (qualitative and quantitative) in relation to the inputs.
  4. Impact – The positive and negative changes produced by technology introduction, or by a change in a technology tool or platform, on the overall development intervention (directly or indirectly; intended or unintended).
  5. Sustainability – Measure of whether the benefits of a technology tool or platform are likely to continue after donor funding has been withdrawn.
  6. Coherence – The extent to which the technology relates to the broader policy context (development, market, communication networks, data standards & interoperability mandates, and national & international law) within which it was developed and implemented.

While it’s unfortunate that SIMLab stopped most operations in early September 2017, their exceptional work in this and other areas lives on and you can access the full framework here.

I learned a great deal in this session from the facilitators and my colleagues attending the workshop. I would encourage everyone in the MERL Tech community to take the ideas generated during this workshop, and the great work done by SIMLab, into their development practice. We certainly intend to integrate many of these insights into our work at OpenWise. Read more about “The Evidence Agenda” here on SIMLab’s blog.

Data quality in the age of lean data

by Daniel Ramirez-Raftree, MERL Tech support team.

Evolving data collection methods call for evolving quality assurance methods. In their session titled Data Quality in the Age of Lean Data, Sam Schueth of Intermedia, Woubedle Alemayehu of Oxford Policy Management, Julie Peachey of the Progress out of Poverty Index, and Christina Villella of MEASURE Evaluation discussed problems, solutions, and ethics related to digital data collection methods. [Bios and background materials here]

Sam opened the conversation by comparing the quality assurance and control challenges in paper-assisted personal interviewing (PAPI) to those in digitally assisted personal interviewing (DAPI). Across both methods, the fundamental problem is that the data that is delivered is a black box. It comes in, it’s turned into numbers, and it’s disseminated, but in this process alone there is no easily apparent information about what actually happened on the ground.

During the age of PAPI, this was dealt with by sending independent quality control teams to the field to review the paper questionnaire that was administered and perform spot checks by visiting random homes to validate data accuracy. Under DAPI, the quality control process becomes remote. Survey administrators can now schedule survey sessions to be recorded automatically and without the interviewer’s knowledge, thus effectively gathering a random sample of interviews that can give them a sense of how well the sessions were conducted. Additionally, it is now possible to use GPS to track the interviewers’ movements and verify the range of households visited. The key point here is that with some creativity, new technological capacities can be used to ensure higher data quality.

Woubedle presented next and elaborated on the theme of quality control for DAPI. She brought up the point that data quality checks can be automated, but that this requires pre-implementation decisions about what indicators to monitor and how to manage the data. The amount of work put into programming this upfront design has a direct bearing on the ultimate data quality.

One useful tool is a progress indicator. Here, one collects information on trends such as the number of surveys attempted compared to those completed. Processing this data could lead to further questions about whether there is a pattern in the populations that did or did not complete the survey, thus alerting researchers to potential bias. Additionally, one can calculate the average time taken to complete a survey and use it to identify outliers that took too little or too much time to finish. Another good practice is to embed consistency checks in the survey itself; for example, making certain questions required, or including two questions that, if answered in a particular way, would be logically contradictory, thus signaling a problem in either the question design or the survey responses. One more practice is to apply constraints to the survey responses, depending on the households one is working with.
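
As a rough illustration of the kind of automated check described above, the sketch below computes a completion rate and flags interviews whose duration falls far outside the average. It is a minimal Python example with made-up records; the field names and thresholds are assumptions rather than features of any particular data collection platform.

```python
import statistics

# Illustrative interview records (made up); "completed" and "duration_min"
# are assumed field names, not from any specific platform.
interviews = [
    {"id": 1, "completed": True,  "duration_min": 35},
    {"id": 2, "completed": True,  "duration_min": 8},    # suspiciously fast
    {"id": 3, "completed": False, "duration_min": 5},
    {"id": 4, "completed": True,  "duration_min": 41},
    {"id": 5, "completed": True,  "duration_min": 120},  # suspiciously slow
]

# Progress indicator: surveys completed vs. attempted.
attempted = len(interviews)
completed = [i for i in interviews if i["completed"]]
print(f"Completion rate: {len(completed) / attempted:.0%}")

# Duration check: flag completed interviews that took less than half or more
# than double the average time. The cut-offs are arbitrary design choices.
average = statistics.mean(i["duration_min"] for i in completed)
for i in completed:
    if i["duration_min"] < average / 2 or i["duration_min"] > average * 2:
        print(f"Review interview {i['id']}: {i['duration_min']} minutes "
              f"(average is {average:.0f})")
```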

After this discussion, Julie spoke about research that was done to assess the quality of different methods for measuring the Progress out of Poverty Index (PPI). She began by explaining that the PPI is a household level poverty measurement tool unique to each country. To create it, the answers to 10 questions about a household’s characteristics and asset ownership are scored to compute the likelihood that the household is living below the poverty line. It is a simple, yet effective method to evaluate household level poverty. The research project Julie described set out to determine if the process of collecting data to create the PPI could be made less expensive by using SMS, IVR or phone calls.
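
To make the scoring mechanism concrete, here is a minimal sketch of how a PPI-style lookup works: each answer earns points, and the total score maps to a likelihood of being below the poverty line. The questions, point values and likelihood bands below are invented for illustration only; real PPI scorecards are country-specific and published separately.

```python
# Invented scorecard: question -> answer -> points (real PPI scorecards have
# ten country-specific questions with published point values).
SCORECARD = {
    "roof_material":   {"thatch": 0, "iron_sheet": 6, "tile": 9},
    "owns_television": {"no": 0, "yes": 7},
    "household_size":  {"7_or_more": 0, "4_to_6": 5, "1_to_3": 11},
}

# Invented lookup table: score band -> likelihood (%) that the household
# lives below the national poverty line.
LIKELIHOOD_BANDS = [(0, 80.0), (10, 55.0), (20, 30.0), (30, 10.0)]

def ppi_score(answers):
    """Sum the points corresponding to each answer."""
    return sum(SCORECARD[question][answer] for question, answer in answers.items())

def poverty_likelihood(score):
    """Return the likelihood for the highest band at or below the score."""
    likelihood = LIKELIHOOD_BANDS[0][1]
    for band_start, percent in LIKELIHOOD_BANDS:
        if score >= band_start:
            likelihood = percent
    return likelihood

household = {"roof_material": "iron_sheet", "owns_television": "no",
             "household_size": "4_to_6"}
score = ppi_score(household)
print(f"Score {score} -> {poverty_likelihood(score)}% likelihood below the poverty line")
```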

Grameen Foundation conducted the study and tested four survey methods for gathering data: 1) in-person and at home, 2) in-person and away from home, 3) in-person and over the phone, and 4) automated and over the phone. Further, it randomized key aspects of the study, including the interview method and the enumerator.

Ultimately, Grameen Foundation determined that the interview method does affect completion rates, responses to questions, and the resulting estimated poverty rates. However, the differences in estimated poverty rates were likely not due to the method itself, but rather to completion rates (which were affected by the method). Thus, as long as completion rates don’t differ significantly, neither will the results. Given that the in-person at home and in-person away from home surveys had similar completion rates (84% and 91% respectively), either could feasibly be used with little deviation in output. On the other hand, in-person over the phone surveys had a 60% completion rate and automated over the phone surveys had a 12% completion rate, making both methods fairly problematic. And with this understanding, developers of the PPI have an evidence-based sense of the quality of their data.

This case study illustrates the possibility of testing data quality before any changes are made to collection methods, which is a powerful strategy for minimizing the use of low-quality data.

Christina closed the session with a presentation on ethics in data collection. She spoke about digital health data ethics in particular, which is the intersection of public health ethics, clinical ethics, and information systems security. She grounded her discussion in MEASURE Evaluation’s experience thinking through ethical problems, which include: the vulnerability of devices where data is collected and stored, the privacy and confidentiality of the data on these devices, the effect of interoperability on privacy, data loss if the device is damaged, and the possibility of wastefully collecting unnecessary data.

To explore these issues, MEASURE conducted a landscape assessment in Kenya and Tanzania and analyzed peer reviewed research to identify key themes for ethics. Five themes emerged: 1) legal frameworks and the need for laws, 2) institutional structures to oversee implementation and enforcement, 3) information systems security knowledge (especially for countries that may not have the expertise), 4) knowledge of the context and users (are clients comfortable with their data being used?), and 5) incorporating tools and standard operating procedures.

Based on this framework, MEASURE has made progress towards rolling out tools that can help institute a stronger ethics infrastructure. They’ve been developing guidelines that countries can use to develop policies, building health informatics capacity through a university course, and working with countries to strengthen their health information systems governance structures.

Finally, Christina explained her take on how ethics are related to data quality. In her view, it comes down to trust. If a device is lost, this may lead to incomplete data. If the clients are mistrustful, this could lead to inaccurate data. If a health worker is unable to check or clean data, this could create a lack of confidence. Each of these risks can lead to the erosion of data integrity.

Register for MERL Tech London, March 19-20th 2018! Session ideas due November 10th.

Six priorities for the MERL Tech community

by Linda Raftree, MERL Tech Co-organizer

Participants at the London MERL Tech conference in February 2017 crowdsourced a MERL Tech History timeline (which I’ve shared in this post). Building on that, we projected out our hopes for a bright MERL Tech Future. Then we prioritized our top goals as a group (see below). We’ll aim to continue building on these as a sector going forward and would love more thoughts on them.

  1. Figure out how to be responsible with digital data and not put people, communities, and vulnerable groups at risk. Subtopics included: share data with others responsibly without harming anyone; agree on a minimum ethical standard for MERL and data collection; agree on principles for minimizing the data we collect so that only essential data is captured; develop duty of care principles for MERL Tech and digital data; develop ethical data practices and policies at the organizational level; shift the power balance so that the costs of digital data convenience are paid by organizations, not affected populations; and develop a set of quality standards for evaluation using tech.
  2. Increase data literacy across the sector, at individual level and within the various communities where we are working.
  3. Overcome the extraction challenge and move towards true downward accountability. Do good user/human centered design and planning together, be ‘leaner’ and more user-focused at all stages of planning and MERL. Subtopics included: development of more participatory MERL methods; bringing consensus decision-making to participatory MERL; realizing the potential of tech to shift power and knowledge hierarchies; greater use of appreciative inquiry in participatory MERL; more relevant use of tech in MERL — less data, more empowering, less extractive, more used.
  4. Integrate MERL into our daily operations to avoid the thinking that it is something ‘separate;’ move it to the core of operations management and make sure we have the necessary funds to do so; demystify it and make it normal! Subtopics included: we’ve stopped calling “MERL” a “thing” and the norm is to talk about monitoring as part of operations; data use is enabling real-time coordination; no more paper-based surveys.
  5. Improve coordination and interoperability as related to data and tools, both between organizations and within organizations. Subtopics included: more interoperability; more data-sharing platforms; all data with suitable anonymization is open; universal exchange of machine readable M&E Data (e.g., standards? IATI? a platform?); sector-wide IATI compliance; tech solutions that enable sharing of qualitative and quantitative data; systems of use across agencies; e.g., to refer feedback; coordination; organizations sharing more data; interoperability of tools. It was emphasized that donors should incentivize this and ensure that there are resources to manage it.
  6. Enhance user-driven and accessible tech that supports impact and increases efficiency, that is open source and can be built on, and that allows for interoperability and consistent systems of measurement and evaluation approaches.

In order to move on these priorities, participants felt we needed better coordination and sharing of tools and lessons among the NGO community. This could be through a platform where different innovations and tools are appropriately documented so that donors and organizations can more easily find good practice, useful tools and get a sense of ‘what’s out there’ and what it’s being used for. This might help us to focus on implementing what is working where, when, why and how in M&E (based on a particular kind of context) rather than re-inventing the wheel and endlessly pushing for new tools.

Participants also wanted to see MERL Tech as a community that is collaborating to shape the field and to ensure that we are a sector that listens, learns, and adopts good practices. They suggested hosting MERL Tech events and conferences in ‘the South;’ and building out the MERL Tech community to include greater representation of users and developers in order to achieve optimal tools and management processes.

What do you think – have we covered it all? What’s missing?

We have a data problem

by Emily Tomkys, ICT in Programmes at Oxfam GB

Following my presentation at MERL Tech, I have realised that it’s not only Oxfam who have a data problem; many of us have a data problem. In the humanitarian and development space, we collect a lot of data – whether via mobile phone or a paper process, the amount of data each project generates is staggering. Some of this data goes into our MIS (Management Information Systems), but all too often data remains in Excel spreadsheets on computer hard drives, unconnected cloud storage systems or Access and bespoke databases.

(Watch Emily’s MERL Tech London Lightning Talk!)

This is an issue because the majority of our programme data is analysed in silos on a survey-to-survey basis and at best on a project-to-project basis. What about when we want to analyse data between projects, between countries, or even globally? It would currently take a lot of time and resources to bring data together in usable formats. Furthermore, issues of data security, limited support for country teams, data standards and the cost of systems or support mean there is a sustainability problem that is in many people’s interests to solve.

The demand from Oxfam’s country teams is high – one of the most common requests the ICT in Programme Team receive centres around databases and data analytics. Teams want to be able to store and analyse their data easily and safely; and there is growing demand for cross border analytics. Our humanitarian managers want to see statistics on the type of feedback we receive globally. Our livelihoods team wants to be able to monitor prices at markets on a national and regional scale. So this motivated us to look for a data solution but it’s something we know we can’t take on alone.

That’s why MERL Tech represented a great opportunity to check in with peers about potential solutions and areas for collaboration. For now, our proposal is to design a data hub where, no matter what the type of data (unstructured, semi-structured or structured) and no matter how we collect it (mobile data collection tools or on paper), our data can be integrated into a database. This isn’t about creating new tools – rather, it’s about focusing on interoperability and smooth transitions between tools and storage options. We plan to set this up so data can be pulled through into a reporting layer, which may offer a mixture of options for quantitative analysis, qualitative analysis and GIS mapping. We also know we need to give our micro-programme data a home: put everything in one place regardless of its source or format and make it easy to pull through for analysis.
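
As a purely illustrative sketch of the kind of data hub described above (not Oxfam’s actual design), the example below stores records from any source in a common table with a small metadata envelope, so that a reporting layer can query across projects and countries. The table and field names are assumptions.

```python
import json
import sqlite3

# One shared table for all incoming records, regardless of source or format.
conn = sqlite3.connect("datahub.db")
conn.execute("""CREATE TABLE IF NOT EXISTS records (
    project TEXT, country TEXT, source TEXT, collected_on TEXT, payload TEXT)""")

def ingest(project, country, source, collected_on, record):
    """Store any record (structured or not) as a JSON payload plus metadata."""
    conn.execute("INSERT INTO records VALUES (?, ?, ?, ?, ?)",
                 (project, country, source, collected_on, json.dumps(record)))
    conn.commit()

# A mobile survey export and a digitized paper form land in the same place.
ingest("market-monitoring", "KE", "mobile", "2017-09-01",
       {"market": "Example Market", "commodity": "maize", "price": 42})
ingest("feedback-mechanism", "SS", "paper", "2017-09-03",
       {"type": "complaint", "channel": "help desk"})

# The reporting layer can then pull data across projects and countries.
for project, country, payload in conn.execute(
        "SELECT project, country, payload FROM records"):
    print(project, country, json.loads(payload))
```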

In this way we can explore data holistically, spot trends on a wider scale and really know more about our programmes and act accordingly. Not only should this reduce our cost of analysis, we will be able to analyse our data more efficiently and effectively. Moreover, taking a holistic view of the data life cycle will enable us to do data protection by design and it will be easier to support because the process and the tools being used will be streamlined. We know that one tool does not and cannot do everything we require when we work in such vast contexts, so a challenge will be how to streamline at the same time as factoring in contextual nuances.

Sounds easy, right? We will be starting to explore our options and working on the datahub in the coming months. MERL Tech was a great start to make connections, but we are keen to hear from others about how you are approaching “the data problem” and eager to set something up which can also be used by other actors. So please add your thoughts in the comments or get in touch if you have ideas!

Dropping down your ignorance ratio: Campaigns meet KNIME

by Rodrigo Barahona (Oxfam Intermon, @rbarahona77) and Enrique Rodriguez (Consultant, @datanauta)

A few years ago, we ran a campaign targeting the Guatemalan government which generated a good deal of global public support (100,000 signatures, online activism, etc.). This, combined with other advocacy strategies, finally pushed change to happen. We did an evaluation in order to learn from this success and found a key area where there was little to learn because we were unable to get and analyze the information: we knew almost nothing about which online channels drove traffic to the online petition and which had better conversion rates. We didn’t know the source of more than 80% of our signatures, so we couldn’t establish recommendations for future similar actions.

Building on the philosophy behind Vanity Metrics, we started developing a system to evaluate public engagement as part of advocacy campaigns and spike actions. We wanted to improve our knowledge of what works and what doesn’t in mobilizing citizens to take action (mostly signing petitions or other online actions), and of which channels were most effective in terms of generating traffic and converting visits into signatures. So we started implementing a relatively simple Google Analytics tracking system that helped us determine the source of each visit/signature, establish conversion rates, and so on. The only caveat was that it was time-consuming – the extraction of the information and its analysis was mostly manual.
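
To illustrate the kind of analysis this tracking enables (independently of Google Analytics or KNIME themselves), the sketch below ranks channels by conversion rate from source-tagged visit and signature counts. All source names and numbers are invented.

```python
# Invented per-source counts, of the kind that can be exported once visits
# and petition sign-ups are tagged by source.
traffic = {
    "email":          {"visits": 12000, "signatures": 3600},
    "facebook":       {"visits": 8000,  "signatures": 960},
    "twitter":        {"visits": 3000,  "signatures": 240},
    "organic_search": {"visits": 1500,  "signatures": 75},
}

# Rank channels by conversion rate (signatures per visit).
for source, counts in sorted(traffic.items(),
                             key=lambda item: item[1]["signatures"] / item[1]["visits"],
                             reverse=True):
    rate = counts["signatures"] / counts["visits"]
    print(f"{source:15s} {counts['visits']:6d} visits  "
          f"{counts['signatures']:5d} signatures  {rate:.1%} conversion")
```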

Later on, we were asked to implement the methodology on a complex campaign that had 3 landing/petition pages and 3 exit pages, all in two different languages. Our preliminary analysis was that it would take us 8-10 hours of work, with a high risk of mistakes, as it required cross-analysis of up to 12 pages and distinguishing among more than 15 different sources for each page.

But then we met KNIME: an information mining tool that helped us extract different sets of data from Google Analytics (through plugins), create the data flow in a visual way, and automatically execute part of the analysis. So far, we have automated the capture and analysis of web traffic statistics (Google Analytics), of the community of users on Twitter, and of the relevance of posts on that social network. We’ve been able to minimize the risk of errors, focus on the definition of new indicators and visualizations, and provide reports to draw conclusions and design new communication strategies (based on data) in a very short period of time.

KNIME helped us to scale up our evaluation system, making it suitable for very complex campaigns, with a significant reduction of time dedication and also lowering the risk of mistakes. And most important of all, introducing KNIME into our system has dropped down our ignorance ratio significantly, because nowadays we can identify the source of more than 95% of the signatures. This means that we can shed light on how different strategies are working, which channels are bringing more visits to the different landing pages, and which have the higher conversion rate. All this is relevant information to inform decisions and adapt strategies and improve the outputs of a campaign.

Watch Rodrigo’s MERL Tech Lightning Talk here!