Building on MERL Tech London 2017, we will engage 200 practitioners from across the development and technology ecosystems for a two-day conference seeking to turn the theories of MERL technology into effective practice that delivers real insight and learning in our sector.
MERL Tech London 2018
Digital data and new media and information technologies are changing MERL practices. The past five years have seen technology-enabled MERL growing by leaps and bounds, including:
Adaptive management and ‘developmental evaluation’
Faster, higher quality data collection.
Remote data gathering through sensors and self-reporting by mobile.
Big Data and social media analytics
Alongside these new initiatives, we are seeing increasing documentation and assessment of technology-enabled MERL initiatives. Good practice guidelines and new frameworks are emerging and agency-level efforts are making new initiatives easier to start, build on and improve.
The swarm of ethical questions related to these new methods and approaches has spurred greater attention to areas such as responsible data practice and the development of policies, guidelines and minimum ethical frameworks and standards for digital data.
Like previous conferences, MERL Tech London will be a highly participatory, community-driven event and we’re actively seeking practitioners in monitoring, evaluation, research, learning, data science and technology to facilitate every session.
Innovations: Brand new, untested technologies or approaches and their application to MERL(Tech)
Debates: Lively discussions, big picture conundrums, thorny questions, contentious topics related to MERL Tech
Management: People, organizations, partners, capacity strengthening, adaptive management, change processes related to MERL Tech
Evaluating MERL Tech: comparisons or learnings about MERL Tech tools/approaches and technology in development processes
Failures: What hasn’t worked and why, and what can be learned from this?
Demo Tables: to share MERL Tech approaches, tools, and technologies
Other topics we may have missed!
Session Submission Deadline: Friday, November 10, 2017.
Session leads receive priority for the available seats at MERL Tech and a discounted registration fee. You will hear back from us in early December and, if selected, you will be asked to submit an updated and final session title, summary and outline by Friday, January 19th, 2018.
Please register to attend, or reserve a demo table for MERL Tech London 2018 to examine these trends with an exciting mix of educational keynotes, lightning talks, and group breakouts, including an evening Fail Festival reception to foster needed networking across sectors.
We are charging a modest fee to better allocate seats and we expect to sell out quickly again this year, so buy your tickets or demo tables now. Event proceeds will be used to cover event costs and to offer travel stipends for select participants implementing MERL Tech activities in developing countries.
The session discussed the tension between the need for enterprise data management solutions that can be used across the entire organization and solutions that meet the needs of specific projects. The three of us presented our lessons learned on this topic from our respective roles as M&E advisor, M&E software provider, and programimplementer.
We then asked attendees to provide a list of their top do’s and don’ts related to their own experiences – and then reviewed the feedback to identify key themes.
Here’s a rundown on the themes that emerged from participants’ feedback:
Think of these as systems broadly—not M&E specific. For example: How do HR practices affect technology adoption? Does your organization have a federated structure that makes standard indicator development difficult? Do you require separate reporting for management and donor partners? These are all organizational systems that need to be properly considered before system selection and implementation. Top takeaways from the group include these insights to help you ensure your implementation goes smoothly:
1. Form Follows Function: This seems like an obvious theme, but since we received so much feedback about folks’ experiences, it bears repeating: define your goals and purpose first, then design a system to meet those, not the other way around. Don’t go looking for a solution that doesn’t address an existing problem. This means that if the ultimate goal for a system is to improve field staff data collection, don’t build a system to improve data visualization.
2. HR & Training: One of the areas our industry seems to struggle with is long-term capacity building and knowledge transfer around new systems. Suggestions in this theme were that training on information systems become embedded in standard HR processes with ongoing knowledge sharing and training of field staff, and putting a priority on hiring staff with adequate skill mixes to make use of information systems.
3. Right-Sized Deployment for Your Organization: There were a number of horror stories around organizations that tried to implement a single system simultaneously across all projects and failed because they bit off more than they could chew, or because the selected tool really didn’t meet a majority of their organization’s projects’ needs. The general consensus here was that small pilots, incremental roll-outs, and other learn-and-iterate approaches are a best practice. As one participant put it: Start small, scale slowly, iterate, and adapt.
We wanted to get feedback on best and worst practices around M&E system implementations specifically—how tools should be selected, necessary planning or analysis, etc.
4. Get Your M&E Right: Resoundingly, participants stressed that a critical component of implementing an M&E information system is having well-organized M&E, particularly indicators. We received a number of comments about creating standardized indicators first, auditing and reconciling existing indicators, and so on.
5. Diagnose Your Needs: Participants also chorused the need for effective diagnosis of the current state of M&E data and workflows and what the desired end-state is. Feedback in this theme focused on data, process, and tool audits and putting more tool-selection power in M&E experts’ hands rather than upper management or IT.
6. Scope It Out: One of the flaws each of us has seen in our respective roles is having too generalized or vague of a sense of why a given M&E tool is being implemented in the first place. All three of us talked about the need to define the problem and desired end state of an implementation. Participants’ feedback supported this stance. One of the key takeaways from this theme was to define who the M&E is actually for, and what purpose it’s serving: donors? Internal management? Local partner selection/management? Public accountability/marketing?
The first two categories are more about the how and why of system selection, roll-out, and implementation. This category is all about working to define and articulate what any type of system needs to be able to do.
7. UX Matters: It seems like a lot of folks have had experience with systems that aren’t particularly user-friendly. We received a lot of feedback about consulting users who actually have to use the system, building the tech around them rather than forcing them to adapt, and avoiding “clunkiness” in tool interfaces. This feels obvious but is, in fact, often hard to do in practice.
8. Keep It Simple, Stupid: This theme echoed the Right-Sized Deployment for Your Organization: take baby steps; keep things simple; prioritize the problems you want to solve; and don’t try to make a single tool solve all of them at once. We might add to this: many organizations have never had a successful information system implementation. Keeping the scope and focus tight at first and getting some wins on those roll-outs will help change internal perception of success and make it easier to implement broader, more elaborate changes long-term.
9. Failing to Plan Is Planning to Fail: The consensus in feedback was that it pays to take more time upfront to identify user/system needs and figure out which are required and which are nice to have. If interoperability with other tools or systems is a requirement, think about it from day one. Work directly with stakeholders at all levels to determine specs and needs; conduct internal readiness assessments to see what the actual needs are; and use this process to identify hierarchies of permissions and security.
Last, but not least, there’s how systems will be introduced and rolled out to users. We got the most feedback on this section and there was a lot of overlap with other sections. This seems to be the piece that organizations struggle with the most.
10. Get Buy-in/Identify Champions: Half the feedback we received on change management revolved around this theme. For implementations to be successful, you need both a top-down approach (buy-in from senior leadership) and a bottom-up approach (local champions/early adopters). To help facilitate this buy-in, participants suggested creating incentives (especially for management), giving local practitioners ownership, including programs and operations in the process, and not letting the IT department lead the initiative. The key here is that no matter which group the implementation ultimately benefits the most, having everyone on the same page understanding the implementation goals and why the organization needs it are key.
11. Communicate: Part of how you get buy-in is to communicate early and often. Communicate the rationale behind why tools were selected, what they’re good—and bad—at, what the value and benefits of the tool are, and transparency in the roll-out/what it hopes to achieve/progress towards those goals. Consider things like behavior change campaigns, brown bags, etc.
12. Shared Vision: This is a step beyond communication: merely telling people what’s going on is not enough. There must be a larger vision of what the tool/implementation is trying to achieve and this, particularly, needs to be articulated. How will it benefit each type of user? Shared vision can help overcome people’s natural tendencies to resist change, hold onto “their” data, or cover up failures or inconsistencies.
by Maliha Khan, a development practitioner in the fields of design, measurement, evaluation and learning. Maliha led the Maturity Model sessions at MERL Tech DC and Linda Raftree, independent consultant and lead organizer of MERL Tech.
MERL Tech is a platform for discussion, learning and collaboration around the intersection of digital technology and Monitoring, Evaluation, Research, and Learning (MERL) in the humanitarian and international development fields. The MERL Tech network is multidisciplinary and includes researchers, evaluators, development practitioners, aid workers, technology developers, data analysts and data scientists, funders, and other key stakeholders.
One key goal of the MERL Tech conference and platform is to bring people from diverse backgrounds and practices together to learn from each other and to coalesce MERL Tech into a more cohesive field in its own right — a field that draws from the experiences and expertise of these various disciplines. MERL Tech tends to bring together six broad communities:
traditional M&E practitioners, who are interested in technology as a tool to help them do their work faster and better;
development practitioners, who are running ICT4D programs and beginning to pay more attention to the digital data produced by these tools and platforms;
business development and strategy leads in organizations who want to focus more on impact and keep their organizations up to speed with the field;
tech people who are interested in the application of newly developed digital tools, platforms and services to the field of development, but may lack knowledge of the context and nuance of that application
data people, who are focused on data analytics, big data, and predictive analytics, but similarly may lack a full grasp of the intricacies of the development field
donors and funders who are interested in technology, impact measurement, and innovation.
Since our first series of Technology Salons on ICT and M&E in 2012 and the first MERL Tech conference in 2014, the aim has been to create stronger bridges between these diverse groups and encourage the formation of a new field with an identity of its own — In other words, to move people beyond identifying as, say, an “evaluator who sometimes uses technology,” and towards identifying as a member of the MERL Tech space (or field or discipline) with a clearer understanding of how these various elements work together and play off one another and how they influence (and are influenced by) the shifts and changes happening in the wider ecosystem of international development.
By building and strengthening these divergent interests and disciplines into a field of their own, we hope that the community of practitioners can begin to better understand their own internal competencies and what they, as a unified field, offered to international development. This is a challenging prospect, as beyond their shared use of technology to gather, analyze, and store data and an interest in better understanding how, when, why, where, (etc.) these tools work for MERL and for development/humanitarian programming, there aren’t many similarities between participants.
At the MERL Tech London and MERL Tech DC conferences in 2017, we made a concerted effort to get to the next level in the process of creating a field. In London in February, participants created a timeline of technology and MERL and identified key areas that the MERL Tech community could work on strengthening (such as data privacy and security frameworks and more technological tools for qualitative MERL efforts). At MERL Tech DC, we began trying to understand what a ‘maturity model’ for MERL Tech might look like.
What do we mean by a ‘maturity model’?
Broadly, maturity models seek to qualitatively assess people/culture, processes/structures, and objects/technology to craft a predictive path that an organization, field, or discipline can take in its development and improvement.
Initially, we considered constructing a “straw” maturity model for MERL Tech and presenting it at the conference. The idea was that our straw model’s potential flaws would spark debate and discussion among participants. In the end, however, we decided against this approach because (a) we were worried that our straw model would unduly influence people’s opinions, and (b) we were not very confident in our own ability to construct a good maturity model.
Instead, we opted to facilitate a creative space over three sessions to encourage discussion on what a maturity model might look like, and what it might contain. Our vision for these sessions was to get participants to brainstorm in mixed groups containing different types of people- we didn’t want small subsets of participants to create models independently without the input of others.
In the first session, “Developing a MERL Tech Maturity Model”, we invited participants to consider what a maturity model might look like. Could we begin to imagine a graphic model that would enable self-evaluation and allow informed choices about how to best develop competencies, change and adjust processes and align structures in organizations to optimize using technology for MERL or indeed other parts of the development field?
In the second session, “Where do you sit on the Maturity Model?” we asked participants to use the ideas that emerged from our brainstorm in the first session to consider their own organizations and work, and compare them against potential maturity models. We encouraged participants to assess themselves using green (young sapling) to yellow (somewhere in the middle) and red (mature MERL Tech ninja!) and to strike up a conversation with other people in the breaks on why they chose that color.
In our third session, “Something old, something new”, we consolidated and synthesized the various concepts participants had engaged with throughout the conference. Everyone was encouraged to reflect on their own learning, lessons for their work, and what new ideas or techniques they may have picked up on and might use in the future.
The Maturity Models
As can be expected, when over 300 people take marker and crayons to paper, many a creative model emerges. We asked the participants to gallery walk the models over the next day during the breaks and vote on their favorite models.
We won’t go into detail of what all the 24 the models showed, but there were some common themes that emerged from the ones that got the most votes – almost all maturity models include dimensions (elements, components) and stages, and a depiction of potential progression from early stages to later stages across each dimension. They all also showed who the key stakeholders or players were, and some had some details on what might be expected of them at different stages of maturity.
Two of the models (MERLvana and the Data Appreciation Maturity Model – DAMM) depicted the notion that reaching maturity was never really possible and the process was an almost infinite loop. As the presenters explained MERLvana “it’s an impossible to reach the ideal state, but one must keep striving for it, in ever closer and tighter loops with fewer and fewer gains!”
“MERL-tropolis” had clearly defined categories (universal understanding, learning culture and awareness, common principles, and programmatic strategy) and the structures/ buildings needed for those (staff, funding, tools, standard operating procedures, skills).
The most popular was “The Data Turnpike” which showed the route from the start of “Implementation with no data” to the finish line of “Technology, capacity and interest in data and adaptive management” with all the pitfalls along the way (misuse, not timely, low ethics etc) marked to warn of the dangers.
As organizers of the session, we found the exercises both interesting and enlightening, and we hope they helped participants to begin thinking about their own MERL Tech practice in a more structured way. Participant feedback on the session was on polar extremes. Some people loved the exercise and felt that it allowed them to step back and think about how they and their organization were approaching MERL Tech and how they could move forward more systematically with building greater capacities and higher quality work. Some told us that they left with clear ideas on how they would work within their organizations to improve and enhance their MERL Tech practice, and that they had a better understanding of how to go about that. A few did not like that we had asked them to “sit around drawing pictures” and some others felt that the exercise was unclear and that we should have provided a model instead of asking people to create one. [Note: This is an ongoing challenge when bringing together so many types of participants from such diverse backgrounds and varied ways of thinking and approaching things!]
We’re curious if others have worked with “maturity models” and if they’ve been applied in this way or to the area of MERL Tech before. What do you think about the models we’ve shared? What is missing? How can we continue to think about this field and strengthen our theory and practice? What should we do at MERL Tech London in March 2018 and beyond to continue these conversations?
by Amanda Makulec, Data Visualization Lead, Excella Consulting and Barb Knittel, Research, Monitoring & Evaluation Advisor, John Snow Inc. Amanda and Barb led “How the Simpsons Make Data Use Happen” at MERL Tech DC.
Workshopping ways to make data use happen.
Human centered design isn’t a new concept. We’ve heard engineers, from aerospace to software, quietly snicker as they’ve seen the enthusiasm for design thinking explode within the social good space in recent years. “To start with the end user in mind? Of course! How else would you create a product someone wants to use?”
However, in our work designing complex health information systems, dashboards, and other tools and strategies to improve data use, the idea of starting with the end user does feel relatively new.
Thinking back to graduate school nearly ten years ago, dashboard design classes focused on the functional skills, like how to use a pivot table in Excel, not on the complex processes of gathering user requirements to design something that could not only delight the end user, but be co-designed with them.
As part of designing for data use and data visualization design workshops, we’ve collaborated with design firms to find new ways to crack the nut of developing products and processes that help decisionmakers use information. Using design thinking tools like ranking exercises, journey maps, and personas has helped users identify and find innovative ways to address critical barriers to data use.
If you’re thinking about integrating design thinking approaches into data-centered projects, here are our five key considerations to take into account before you begin:
Design thinking is a mindset, not a workshop agenda. When you’re setting out to incorporate design thinking into your work, consider what that means throughout the project lifecycle. From continuous engagement and touchpoints with your data users to
Engage the right people – you need a diverse range of perspectives and experiences to uncover problems and co-create solutions. This means thinking of the usual stakeholders using the data at hand, but also engaging those adjacent to the data. In health information systems, this could be the clinicians reporting on the registers, the mid-level managers at the district health office, and even the printer responsible for distributing paper registers.
Plan for the long haul. Don’t limit your planning and projections of time, resources, and end user engagement to initial workshops. Coming out of your initial design workshops, you’ll likely have prototypes that require continued attention to functionally build and implement.
Focus on identifying and understanding the problem you’ll be solving. You’ll never be able to solve every problem and overcome every data use barrier in one workshop (or even in one project). Work with your users to develop a specific focus and thoroughly understand the barriers and challenges from their perspectives so you can tackle the most pressing issues (or choose deliberately to work on longer term solutions to the largest impediments).
The journey matters as much as the destination. One of the greatest ah-ha moments coming out of these workshops has been from participants who see opportunities to change how they facilitate meetings or manage teams by adopting some of the activities and facilitation approaches in their own work. Adoption of the prototypes shouldn’t be your only metric of success.
The Designing for Data Use workshops were funded by (1) USAID and implemented by the MEASURE Evaluation project and (2) the Global Fund through the Data Use Innovations Fund. Matchboxology was the design partner for both sets of workshops, and John Snow Inc. was the technical partner for the Data Use Innovations sessions. Learn more about the process and learning from the MEASURE Evaluation workshops in Applying User Centered Design to Data Use Challenges: What we Learned and see our slides from our MERL Tech session “The Simpsons, Design, and Data Use” to learn more.
We’ll be experimenting with a monthly round-up of MERL Tech related content (bi-weekly if there’s enough to fill a post). Let us know if it’s useful! We aim to keep it manageable and varied, rather than a laundry list of every possible thing. The format, categories, and topics will evolve as we see how it goes and what the appetite is.
If you have anything you’d like to share or see featured, feel free to send it on over or post on Twitter using the #MERLTech hashtag.
Series on data management (DAI) covering 1) planning and collecting data; 2) managing and storing data; and 3) getting value and use out of the data that’s collected through analysis and visualization.
By Ambika Samarthya-Howard, Head of Communications at Praekelt.org. This post also appears on the Praekelt.org blog.
Attending conferences often reminds me of dating: you put your best foot forward and do yourself up, and hide the rest for a later time. I always found it refreshing when people openly put their cards on the table.
I think that’s why I especially respected and enjoyed my experience at MERL Tech in DC last week. One of the first sessions I went to was a World Cafe style break out exploring how to be inclusive in M&E tech in the field. The organisations, like Global Giving and Keystone, posed hard questions about community involvement in data collection at scale, and how to get people less familiar or with less access to technology involved in the process. They didn’t have any of the answers. They wanted to learn from us.
This was followed by lightning talks after lunch where organisations gave short talks. One organisation spoke very openly about how much money and time they were wasting on their data collection technologies. Another organisation confessed their lack of structure and organisation, and talked about collaborating with organisations like DataKind to make sense of their data. Anahi Ayala Iacucci from Internews did a presentation on the pitfalls and realities of M&E: “we all skew the results in order to get the next round of funding.” She fondly asked us to work together so we could “stop farting in the wind”. D-Tree International spoke about a trial around transport vouchers for pregnant women in Zanzibar, and how the control group that did not receive any funding actually did better. They had to stop funding the vouchers.
The second day I attended an entire session where we looked at existing M&E reports available online to critique their deficiencies and identify where the field was lacking in knowledge dissemination. As a Communications person, looking at the write-ups of the data ironically gave me instant insight into ways forward and where gaps could be filled — which I believe is exactly what the speakers of the session intended. When you can so clearly see why and how things aren’t working, it actually inspires a different approach and way of working.
I was thoroughly impressed with the way people shared at MERL Tech. When you see an organisation able to talk so boldly about its learning curves or gaps, you respect their work, growth, and learnings. And that is essentially the entire purpose of a conference.
Back to dating… and partnerships. Sooner or later, if the relationship works out, your partner is going to see you in the a.m. for who you really are. Why not cut to the chase, and go in with your natural look? Then you can take the time to really do any of the hard work together, on the same footing.
In their Your Word Counts project, Oxfam is collaborating with local and global partners to capture, analyze, and respond to community feedback data using a mobile case management tool. The goal is to inform Oxfam’s Middle East humanitarian response and give those affected by crisis a voice for improved support and services. This project is a scale-up of an earlier pilot project, and both the pilot and the scale-up have been supported by the Humanitarian Innovation Fund.
Oxfam’s use of SurveyCTO’s case-management features has been innovative, and they have been helping to support improvements in the core technology. In this session, we discussed both the core technology and the broader organizational and logistical challenges that Oxfam has encountered in the field.
Mobile case management: an introduction
In standard applications of mobile data collection, enumerators, inspectors, program officers, or others use a mobile phone or tablet to collect data. Whether they quietly observe things, interview people, or inspect facilities, they ultimately enter some kind of data into a mobile device. In systems like SurveyCTO, data-collection officially begins when they click a Fill Blank Formbutton and choose a digital form to fill out.
Mobile case management is much the same, but the process begins with cases and then proceeds to forms. As far as the core technology is concerned, a case might be a clinic, a school, a water point, a household – pretty much any unit that’s meaningful in the given context. Instead of choosing Fill Blank Form and choosing a form, users in the field choose Manage Cases and then choose a particular case from a list that’s filtered specifically for that user (e.g., to include only schools in their area); once they select a case, they then select one of the forms that is outstanding for that case.
Behind the scenes, the case list is really just a spreadsheet. It includes columns for the unique case ID, the label that should be used to identify the case to users, the list of forms that should be filled for the case, and the users and/or user roles that should see the case listed in their case list. Importantly, the case list is not static: any form can update or add a case, and thus as users fill forms the case list can be dynamically revised and extended. (In SurveyCTO, the case list is simply a server dataset: it can be manually uploaded as a .csv, attached to forms, and updated just like any other dataset.)
In Oxfam’s Your Word Counts project, cases represent any kind of feedback from the community. Volunteers and program staff carry mobile phones and log feedback as new cases whenever they interact with community members; technical teams then work to resolve feedback within a week, filling out new forms to update cases as their status changes; and program staff then close the loop with the original community members when possible, before closing the case. Because the data is all available in a single electronic system, in-country, regional, and even global teams can then report on and analyze both the community feedback and the responses over time.
There have been some definite successes in piloting and early scale-up:
By listening to community members, recording their feedback, and following up, the community feedback system has helped to build trust.
The digital process of recording referrals, updates, and eventually responses has been rapid, speeding responsiveness to feedback overall.
Since all digital forms can be updated easily, the system is dynamic and flexible enough to adapt as programs or needs change.
The solution appears to be low-cost, scalable, and sustainable.
There have been both organizational and logistical challenges, however. For example:
For a system like this to truly be effective, fundamental responsibility for accountability must be shared organization-wide. While MEAL officers (monitoring, evaluation, accountability, and learning officers) can help to set up and manage accountability systems, technical teams, program teams, and senior leadership ultimately have to share ownership and responsibility in order for the system to function and sustain.
Globally-predefined feedback categories turned out not to fit well with early deployment contexts, and so the program team needed to re-think how to most effectively categorize feedback. (See Oxfam’s blog post on the subject.)
In dynamic in-country settings, staff turnover can be high, posing major logistical and sustainability challenges for systems of all kinds.
While community members can add and update cases offline, ultimately an Internet connection is required to synchronize case lists with a central server. In some settings, access to office Internet has been a challenge.
Ideally, cases would be easily referred across agencies working in a particular setting, but some agencies have been reluctant to buy into shared digital systems.
Oxfam’s MEAL team is exploring ways to facilitate a broader accountability culture throughout the organization. In country programs, for example, MEAL coordinators are looking to use office whiteboards to track key indicators of feedback performance and engage staff in discussions of what those indicators mean for them. More broadly, Oxfam is looking to highlight best practices in responding and acting on feedback and seeking other ways to incentivize teams in this area.
While Oxfam works to build and support both systems and culture for accountability in their humanitarian response programs, we at Dobility are working to improve the core technology. With Oxfam’s feedback and support, we are currently working to improve the user interface used to filter and browse case lists, both on devices (in the field) and on the web (in the office). We are also working to improve the user interface for those setting up and managing these kinds of case-management system. If you have specific ideas, please share them by commenting below!
How can we assess progress on a second-generation way of working in the transparency, accountability and participation (TAP) field? Monitoring, evaluation, research, and learning (MERL) maturity models can provide some inspiration. The 2017 MERL Tech conference in Washington, DC was a two-day bonanza of lightening talks, plenary sessions, and hands-on workshops among participants who use technology for MERL.
Here are key conference takeaways from two MEL practitioners in the TAP field.
1. Making open data useful
Several MERL Tech sessions resonated deeply with the TAP field’s efforts to transition from fighting for transparent and open data towards linking open data to accountability and governance outcomes. Development Gateway and InterAction drew on findings from “Avoiding Data Graveyards” as we explored progress and challenges for International Aid Transparency Initiative (IATI) data use cases. While transparency is an important value, what is gained (or lost) in data use for collaboration when there are many different potential data consumers?
And finally, as TAP practitioners are keenly aware, power and politics can overshadow evidence in decision making. Another Development Gateway presentation reminded us that it is important to work with data producers and users to identify decisions that are (or can be) data-driven, and to recognize when other factors are driving decisions. (The incentives to supply open data is whole other can of worms!)
Drawing on our second-generation TAP approach, more work is needed for the TAP and MERL fields to move from “open data everywhere, all of the time” to planning for, and encouraging more effective data use.
2. Tech for MERL for improved policy, practice, and outcomes
Among our favorite moments at MERL Tech was when Dobility Founder and CEO Christopher Robert remarked that “the most interesting innovations at MERL Tech aren’t the new, cutting-edge technology developments, but generic technology applied in innovative ways.” Unsurprising for a tech company focused on using affordable technology to enable quality data collection for social impact, but a refreshing reminder amidst the talk of ‘AI’, ‘chatbots’, and ‘blockchains’ for development coursing through the conference.
We are undoubtedly on the precipice of revolutionary technological advancements that can be readily (and maybe even affordably) deployed to solve complex global challenges, but they will still be tools and not solutions.
3. Balancing inclusion and participation with efficiency and accuracy
We explored a constant conundrum for MERL: how to balance inclusion and participation with efficiency and accuracy. Girl Effect and Praekelt Foundation took “mixed methods” research to another level, combining online and offline efforts to understand user needs of adolescent girls and to support user-developed digital media content. Their iterative process showcased an effective way to integrate tech into the balancing act of inclusive – and holistic – design, paired with real-time data use.
This session on technology in citizen generated data brought to light two case studies of how tech can both help and hinder this balancing act. The World Café discussions underscored the importance of planning for – and recognizing the constraints on – feedback loops. And provided us a helpful reminder that MERL and tech professionals are often considering different “end users” in their design work!
So, which is it – balancing act or zero-sum game between inclusion and efficiency? The MERL community has long applied participatory methods. And tech solutions abound that can help with efficiency, accuracy, and inclusion. Indeed, the second-generation TAP focus on learning and collaboration is grounded in effective data use – but there are many potential “end users” to consider. These principles and practices can force uncomfortable compromises – particularly in the face of finite resources and limited data availability – but they are not at odds with each other. Perhaps the MERL and TAP communities can draw lessons from each other in striking the right balance.
4. Tech sees no development sector silos
One of the things that makes MERL Tech such an exciting conference, is the deliberate mixing of tech nerds with MERL nerds. It’s pretty unique in its dual targeting of both types of professionals who share a common purpose of social impact (where as conferences like ICT4D cast a wider net looking at application of technology to broader development issues). And, though we MERL professionals like to think of design and iteration as squarely within our wheelhouse, being in a room full of tech experts can quickly remind you that our adaptation game has a lot of room to grow. We talk about user-centered design in TAP, but when the tech crowd was asked in plenary “would you even think of designing software or an app without knowing who was going to use it?” they responded with a loud and exuberant laugh.
Tech has long employed systematic approaches to user-centered design, prototyping, iteration, and adaptation, all of which can offer compelling lessons to guide MERL practices and methods. Though we know Context is King, it is exhilarating to know that the tech specialists assembled at the conference work across traditional silos of development work (from health to corruption, and everything in between). End users are, of course, crucial to the final product but the life cycle process and steps follow a regular pattern, regardless of the topic area or users.
The second-generation wave in TAP similarly moved away from project-specific, fragmented, or siloed planning and learning towards a focus on collective approaches and long-term, more organic engagement.
American Evaluation Association President, Kathy Newcomer, quipped that maybe an ‘Academy Awards for Adaptation’ could inspire better informed and more adept evolutions to strategy as circumstances and context shift around us. Adding to this, and borrowing from the tech community, we wonder where we can build more room to beta test, follow real demand, and fail fast. Are we looking towards other sectors and industries enough or continuing to reinvent the wheel?
Alison left thinking:
Concepts and practices are colliding across the overlapping MERL, tech, and TAP worlds! In leading the Transparency and Accountability Initiative’s learning strategy, and supporting our work on data use for accountability, I often find myself toggling between different meanings of ‘data’, ‘data users’, and tech applications that can enable both of these themes in our strategy. These worlds don’t have to be compatible all the time, and concepts don’t have to compute immediately (I am personally still working out hypothetical blockchain applications for my MERL work!). But this collision of worlds is a helpful reminder that there are many perspectives to draw from in tackling accountable governance outcomes.
Maturity models come in all shapes and sizes, as we saw in the creative depictions created at MERL Tech that included, steps, arrows, paths, circles, cycles, and carrots! And the transparency and accountability field is collectively pursuing a next generation of more effective practice that will take unique turns for different accountability actors and outcomes. Regardless of what our organizational or programmatic models look like, MERL Tech reminded me that champions of continuous improvement are needed at all stages of the model – in MERL, in tech for development, and in the TAP field.
Megan left thinking:
That I am beginning to feel like I’m a Dr. Seuss book. We talked ‘big data’, ‘small data’, ‘lean data’, and ‘thick data’. Such jargon-filled conversations can be useful for communicating complex concepts simply with others. Ah, but this is also the problem. This shorthand glosses over the nuances that explain what we actually mean. Jargon is also exclusive—it clearly defines the limits of your community and makes it difficult for newcomers. In TAP, I can’t help but see missed opportunities for connecting our work to other development sectors. How can health outcomes improve without holding governments and service providers accountable for delivering quality healthcare? How can smallholder farmers expect better prices without governments budgeting for and building better roads? Jargon is helpful until it divides us up. We have collective, global problems and we need to figure out how to talk to each other if we’re going to solve them.
In general, I’m observing a trend towards organic, participatory, and inclusive processes—in MERL, in TAP, and across the board in development and governance work. This is, almost universally speaking, a good thing. In MERL, a lot of this movement is a backlash to randomistas and imposing The RCT Gold Standard to social impact work. And, while I confess to being overjoyed that the “RCT-or-bust” mindset is fading out, I can’t help but think we’re on a slippery slope. We need scientific rigor, validation, and objective evidence. There has to be a line between ‘asking some good questions’ and ‘conducting an evaluation’. Collectively, we are working to eradicate unjust systems and eliminate poverty, and these issues require not just our best efforts and intentions, but workable solutions. Listen to Freakonomic’s recent podcast When Helping Hurts and commit with me to find ways to keep participatory and inclusive evaluation techniques rigorous and scientific, too.
In Part 1 of this series we argued that, while applications of big data and data analytics are expanding rapidly in many areas of development programs, evaluators have been slow to adopt these applications. We predicted that one possible future scenario could be that evaluation may no longer be considered as a separate function, and that it may be treated as one of the outputs of the integrated information systems that will gradually be adopted by many development agencies. Furthermore, many evaluations will use data analytics approaches, rather than conventional evaluation designs. (Image: Big Data session notes from USAIDLearning’s Katherine Haugh [@katherine_haugh}. MERL Tech DC 2016).
Here, in Part 2 we identify some of the reasons why development evaluators have been slow to adopt big data analytics and we propose some promising approaches for building bridges between evaluators and data analysts.
Why have evaluators been slow to adopt big data analytics?
1. Weak institutional linkages. Over the past few years some development agencies have created data centers to explore ways to exploit new information technologies. These centers are mainly staffed by people with a background in data science or statistics and the institutional links to the agency’s evaluation office are often weak.
2. Many evaluators have limited familiarity with big data/analytics. Evaluation training programs tend to only present conventional experimental, quasi-experimental and mixed-methods/qualitative designs. They usually do not cover smart data analytics (see Part 1 of this blog). Similarly, many data scientists do not have a background in conventional evaluation methodology (though there are of course exceptions).
3. Methodological differences. Many big data approaches do not conform to the basic principles that underpin conventional program evaluation, for example:
Data quality: real-time big data provides one of the potentially most powerful sources of data for development programs. Among other things, real-time data can provide early warning signals of potential diseases (e.g. Google Flu), ethnic tension, drought and poverty (Meier 2015). However, when an evaluator asks if the data is biased or of poor quality, the data analyst may respond “Sure the data is biased (e.g. only captured from mobile phone users or twitter feeds) and it may be of poor quality. All data is biased and usually of poor quality, but it does not matter because tomorrow we will have new data.” This reflects the very different kinds of data that evaluators and data analysts typically work with, and the difference can be explained, but a statement such as the above can create the impression that data analysts do not take issues of bias and data quality very seriously.
Data mining: Many data analytics methods are based on the mining of large data sets to identify patterns of correlation, which are then built into predictive models, normally using Bayesian statistics. Many evaluators frown on data mining due to its potentially identifying spurious associations.
The role of theory: Most (but not all) evaluators believe that an evaluation design should be based on a theoretical framework (theory of change or program theory) that hypothesizes the processes through which the intended outcomes will be achieved. In contrast, there is plenty of debate among data analysts concerning the role of theory, and whether it is necessary at all. Some even go as far as to claim that data analytics means “the end of theory”(Anderson 2008). This, combined with data mining, creates the impression among some evaluators that data analytics uses whatever data is easily accessible with no theoretical framework to guide the selection of evaluation questions as to assess the adequacy of available data.
Experimental designs versus predictive analytics: Most quantitative evaluation designs are based on an experimental or quasi-experimental design using a pretest/posttest comparison group design. Given the high cost of data collection, statistical power calculations are frequently used to estimate the minimum size of sample required to ensure a certain level of statistical significance. Usually this means that analysis can only be conducted on the total sample, as sample size does not permit statistical significance testing for sub-samples. In contrast, predictive analytics usually employ Bayesian probability models. Due to the low cost of data collection and analysis, it is usually possible to conduct the analysis on the total population (rather than a sample), so that disaggregated analysis can be conducted to compare sub-populations, and often (particularly when also using machine learning) to compute outcome probabilities for individual subjects. There continues to be heated debates concerning the merits of each approach, and there has been much less discussion of how experimental and predictive analytics approaches could complement each other.
As Pete York at CommunityScience.com observes: “Herein lies the opportunity – we evaluators can’t battle the wave of big data and data science that will transform the way we do research. However, we can force it to have to succumb to the rules of objective rigor via the scientific method. Evaluators/researchers train people how to do it, they can train machines. We are already doing so.” (Personal communication 8/7/17)
4. Ethical and political concerns: Many evaluators also have concerns about who designs and markets big data apps and who benefits financially. Many commercial agencies collect data on low income populations (for example their consumption patterns) which may then be sold to consumer products companies with little or no benefit going to the populations from which the information is collected. Some of the algorithms may also include a bias against poor and vulnerable groups (O’Neil 2016) that are difficult to detect given the proprietary nature of the algorithms.
Another set of issues concern whether the ways in which big data are collected and used (for making decisions affecting poor and vulnerable groups) tends to be exclusive (governments and donors use big data to make decisions about programs affecting the poor without consulting them), or whether big data is used to promote inclusion (giving voice to vulnerable groups). These issues are discussed in a recent Rockefeller Foundation blog. There are also many issues around privacy and data security. There is of course no simple answer to these questions, but many of these concerns are often lurking in the background when evaluators are considering the possibility of incorporating big data into their evaluations.
Table 1. Reasons evaluators have been slow to adopt big data and opportunities for bridge building between evaluators and data analysts
Reason for slow adoption
Opportunities for bridge building
1. Weak institutional linkages
Strengthening formal and informal links between data centers and evaluators
2. Evaluators have limited knowledge about big data and data analytics
Capacity development programs covering both big data and conventional evaluation
Collaborative pilot evaluation projects
3. Methodological differences
Creating opportunities for dialogue to explore differences and to determine how they can be reconciled
Viewing data analytics and evaluation as being complementary rather than competing
4. Ethical and political concerns about big data
Greater focus on ethical codes of conduct, privacy and data security
Focusing on making approaches to big data and evaluation inclusive and avoiding exclusive/extractive approaches
Building bridges between evaluators and big data/analytics
There are a number of possible steps that could be taken to build bridges between evaluators and big data analysts, and thus to promote the integration of big data into development evaluation. Catherine Cheney (2016) presents interviews with a number of data scientists and development practitioners stressing that data driven development needs both social and computer scientists. No single approach is likely to be successful, and the best approach(es) will depend on each specific context, but we could consider:
Strengthening the formal and informal linkages between data centers and evaluation offices. It may be possible to achieve this within the existing organizational structure, but it will often require some formal organizational changes in terms of lines of communication. Linda Raftree provides a useful framework for understanding how different “buckets” of data (including among others, traditional data and big data) can be brought together, which suggests one pathway to collaboration between data centers and evaluation offices.
Identifying opportunities for collaborative pilot projects. A useful starting point may be to identify opportunities for collaboration on pilot projects in order to test/demonstrate the value-added of cooperation between the data analysts and evaluators. The pilots should be carefully selected to ensure that both groups are involved equally in the design of the initiative. Time should be budgeted to promote team-building so that each team can understand the other’s approach.
Promoting dialogue to explore ways to reconcile differences of approach and methodology between big data and evaluation. While many of these differences may at first appear to be based on fundamental differences of approach, at least some differences result at least in part from questions of terminology and in other cases it may be that different approaches can be applied at different stages of the evaluation process. For example:
Many evaluators are suspicious of real-time data from sources such as twitter, or analysis of phone records due to selection bias and issues of data quality. However, evaluators are familiar with exploratory data (collected, for example, during project visits, or feedback from staff), which is then checked more systematically in a follow-up study. When presented in this way, the two teams would be able to discuss in a non-confrontational way, how many kinds of real-time data could be built into evaluation designs.
When using Bayesian probability analysis it is necessary to begin with a prior distribution. The probabilities are then updated as more data becomes available. The results of a conventional experimental design can often be used as an input to the definition of the prior distribution. Consequently, it may be possible to consider experimental designs and Bayesian probability analysis as sequential stages of an evaluation rather than as competing approaches.
Integrated capacity development programs for data analysts and evaluators. These activities would both help develop a broader common methodological framework and serve as an opportunity for team building.
There are a number of factors that together explain the slow take-up of big data and data analytics by development evaluators. A number of promising approaches are proposed for building bridges to overcoming these barriers and to promote the integration of big data into development evaluation.
We are living in an increasingly quantified world.
There are multiple sources of data that can be generated and analyzed in real-time. They can be synthesized to capture complex interactions among data streams and to identify previously unsuspected linkages among seemingly unrelated factors [such as the purchase of diapers and increased sales of beer]. We can now quantify and monitor ourselves, our houses (even the contents of our refrigerator!), our communities, our cities, our purchases and preferences, our ecosystem, and multiple dimensions of the state of the world.
These rich sources of data are becoming increasingly accessible to individuals, researchers and businesses through huge numbers of mobile phone and tablet apps and user-friendly data analysis programs.
The influence of digital technology on international development is growing.
Many of these apps and other big data/data analytics tools are now being adopted by international development agencies. Due to their relatively low cost, ease of application, and accessibility in remote rural areas, the approaches are proving particularly attractive to non-profit organizations; and the majority of NGOs probably now use some kind of mobile phone apps.
Apps are widely-used for early warning systems, emergency relief, dissemination of information (to farmers, mothers, fishermen and other groups with limited access to markets), identifying and collecting feedback from marginal and vulnerable groups, and permitting rapid analysis of poverty. Data analytics are also used to create integrated data bases that synthesize all of the information on topics as diverse as national water resources, human trafficking, updates on conflict zones, climate change and many other development topics.
Table 1: Widely used big data/data analytics applications in international development
Big data/data analytics tools
Early warning systems for natural and man-made disasters
Analysis of Twitter, Facebook and other social media
Analysis of radio call-in programs
Satellite images and remote sensors
Electronic transaction records [ATM, on-line purchases]
GPS mapping and tracking
Dissemination of information to small farmers, mothers, fishermen and other traders
Feedback from marginal and vulnerable groups and on sensitive topics
Rapid analysis of poverty and identification of low-income groups
Analysis of phone records
Social media analysis
Satellite images [e.g. using thatched roofs as a proxy indicator of low-income households]
Electronic transaction records
Creation of an integrated data base synthesizing all the multiples sources of data on a development topic
National water resources
Agricultural conditions in a particular region
Evaluation is lagging behind.
Surprisingly, program evaluation is the area that is lagging behind in terms of the adoption of big data/analytics. The few available studies report that a high proportion of evaluators are not very familiar with big data/analytics and significantly fewer report having used big data in their professional evaluation work. Furthermore, while many international development agencies have created data development centers within the past few years, many of these are staffed by data scientists (many with limited familiarity with conventional evaluation methods) and there are weak institutional links to agency evaluation offices.
A recent study on the current status of the integration of big data into the monitoring and evaluation of development programs identified a number of reasons for the slow adoption of big data/analytics by evaluation offices:
Weak institutional links between data development centers and evaluation offices
Differences of methodology and the approach to data generation and analysis
Issues concerning data quality
Concerns by evaluators about the commercial, political and ethical nature of how big data is generated, controlled and used.
Key questions for the future of evaluation in international development…
The above gives rise to two sets of questions concerning the future role of evaluation in international development:
The future direction of development evaluation. Given the rapid expansion of big data in international development, it is likely there will be a move towards integrated program information systems. These will begin to generate, analyze and synthesize data for program selection, design, management, monitoring, evaluation and dissemination. A possible scenario is that program evaluation will no longer be considered a specialized function that is the responsibility of a separate evaluation office, rather it will become one of the outputs generated from the program data base. If this happens, evaluation may be designed and implemented not by evaluation specialists using conventional evaluation methods (experimental and quasi-experimental designs, theory-based evaluation) but by data analysts using methods such as predictive analytics and machine learning.
Key Question: Is this scenario credible? If so how widespread will it become and over what time horizon? Is it likely that evaluation will become one of the outputs of an integrated management information system? And if so is it likely that many of the evaluation functions will be taken over by big data analysts?
The changing role of development evaluators and the evaluation office. We argued that currently many or perhaps most development evaluators are not very familiar with big data/analytics, and even fewer apply these approaches. There are both professional reasons (how evaluators and data scientists are trained) and organizational reasons (the limited formal links between evaluation offices and data centers in many organizations) that explain the limited adoption of big data approaches by evaluators. So, assuming the above scenario proves to be at least partially true, what will be required for evaluators to become sufficiently conversant with these new approaches to be able to contribute to how big data/focused evaluation approaches are designed and implemented? According to Pete York at Communityscience.com, the big challenge and opportunity for evaluators is to ensure that the scientific method becomes an essential part of the data analytics toolkit. Recent studies by the Global Environmental Faciity (GEF) illustrate some of the ways that big data from sources such as satellite images and remote sensors can be used to strengthen conventional quasi-experimental evaluation designs. In a number of evaluations these data sources used propensity score matching to select matched samples for pretest-posttest comparison group designs to evaluate the effectiveness of programs to protect forest cover or reserves for mangrove swamps.
Key Question: Assuming there will be a significant change in how the evaluation function is organized and managed, what will be required to bridge the gap between evaluators and data analysts? How likely is it that the evaluators will be able to assume this new role and how likely is it that organizations will make the necessary adjustments to facilitate these transformations?
What do you think? How will these scenarios play out?
Note: Stay tuned for Michael’s next post focusing on how to build bridges between evaluators and big data analysts.
Below are some useful references if you’d like to read more on this topic:
Bamberger, M., Raftree, L and Olazabal, V (2016) The role of new information and communication technologies in equity–focused evaluation: opportunities and challenges. Evaluation. Vol 22(2) 228–244 . A discussion of the ethical issues and challenges with new information technology
Meier , P (2015) Digital Humanitarians: How big data is changing the face of humanitarian response. CRC Press. A review, with detailed case studies, of how digital technology is being used by NGOs and civil society.
O’Neill, C (2016) The weapons of math destruction: How big data increases inequality and threatens democracy. How widely-used digital algorithms negatively affect the poor and marginalized sectors of society. Crown books.
Petersson, G.K and Breul, J.D (editors) (2017) Cyber society, big data and evaluation. Comparative policy evaluation. Volume 24. Transaction Publications. The evolving role of evaluation in cyber society.