Big data, big problems, big solutions
by Alvaro Cobo-Santillan, Catholic Relief Services (CRS); Jeff Lundberg, CRS; Paul Perrin, University of Notre Dame; and Gillian Kerr, LogicalOutcomes Canada.
In the year 2017, with all of us holding a mini-computer at all hours of the day and night, it’s probably not too hard to imagine that “A teenager in Africa today has access to more information than the President of United States had 15 years ago”. So it also stands to reason that the ability to appropriately and ethically grapple with the use of that immense amount of information has grown proportionately.
At the September MERL Tech event in Washington, D.C., a panel that included folks from the University of Notre Dame, Catholic Relief Services, and LogicalOutcomes spoke at length about three angles of this opportunity involving big data.
The Murky Waters of Development Data
What do we mean when we say that the world of development data, particularly evaluation data, is murky? A major factor in this sentiment is the ambiguous boundary between research and evaluation data.
- “Research seeks to prove; evaluation seeks to improve.” – CDC
- “Research studies involving human subjects require IRB review. Evaluative studies and activities do not.”
This has led to debates as to the actual relationship between research and evaluation. Some see them as related but separate activities; others see evaluation as a subset of research; and still others might posit that research is a specific case of evaluation.
Regardless, though the motivations of the two may differ, research and evaluation look the same in practice because they share stakeholders, participants, and methods.
If that statement is true, then we must hold both to similar protections!
What are some ways to make the waters less murky?
- Deeper commitment to informed consent
- Reasoned use of identifiers
- Need to know vs. nice to know
- Data security and privacy protocols
- Data use agreements and protocols for outside parties
- Revisit NGO primary and secondary data IRB requirements
Alright then, what can we practically do within our individual agencies to move the needle on data protection?
- In short, governance. Responsible data is absolutely a crosscutting responsibility, but it can be primarily championed through close partnerships between the M&E and IT departments.
- Think about ways to increase usage of digital M&E – this can ease the implementation of responsible data practices
- Can existing agency processes and resources be leveraged?
- Plan for and expect gradual behavior change and capacity building as a prerequisite for sustainable implementation of responsible data protections
- Take an iterative approach: gradually introduce guidelines, tools, and training materials
- Plan for business and technical support structures to support protections
Is anyone doing any of the practical things you’ve mentioned?
Yes! Gillian Kerr from LogicalOutcomes shared highlights from an M&E system her company is launching, as examples of the kinds of privacy and security protections they are putting into practice.
As a basis for the mindset behind their work, she presented a simple and striking comparison of high-risk vs. low-risk personal information: the combination of year of birth, gender, and 3-digit zip code is unique for 0.04% of US residents, but if we instead use a 5-digit zip code, over 50% of US residents could be uniquely identified. Yikes.
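To get a feel for that comparison on your own data, here is a minimal Python sketch that measures what share of records are unique on a given combination of fields. The function name `fraction_unique`, the field names, and the toy records are all made up for illustration; this is not the method behind the figures quoted above.

```python
from collections import Counter


def fraction_unique(records, fields):
    """Share of records whose combination of `fields` appears exactly once."""
    if not records:
        return 0.0
    keys = [tuple(record[f] for f in fields) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)


# Made-up toy records (hypothetical, not real people).
records = [
    {"birth_year": 1980, "gender": "F", "zip5": "20001"},
    {"birth_year": 1980, "gender": "F", "zip5": "20002"},
    {"birth_year": 1980, "gender": "M", "zip5": "20001"},
    {"birth_year": 1980, "gender": "M", "zip5": "20001"},
]

# Derive the coarser 3-digit zip code from the 5-digit one.
for record in records:
    record["zip3"] = record["zip5"][:3]

coarse = fraction_unique(records, ["birth_year", "gender", "zip3"])
fine = fraction_unique(records, ["birth_year", "gender", "zip5"])
print(f"unique with 3-digit zip: {coarse:.0%}")
print(f"unique with 5-digit zip: {fine:.0%}")
```

In this toy set nobody is unique with the 3-digit zip code, while half of the records become unique once the 5-digit zip code is included. Real results depend entirely on the population and fields involved.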
In that vein, they do not collect names or identification numbers, and they record only year of birth (not month or day). They also aim to collect as little sensitive data as possible, classifying each data element by its level of risk to the client (e.g. city of residence – low, glucose level – medium, and HIV status – high).
In addition, they ask for permission not only in the original agency permission form, but also in each survey. Their technical system maintains two instances – one containing individual level personal information with tight permissions even for administrators, and another with aggregated data with small cell sizes. Other security measures, such as multi-factor authentication and encryption, and critical governance practices, such as regular audits, are also in place.
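The data-minimization idea described above (no names, year of birth only, each element tagged by risk) could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the LogicalOutcomes system: the field names, the `FIELD_RISK` table, and the `minimize` function are hypothetical, and the three risk levels simply reuse the examples from the session.

```python
from datetime import date

# Hypothetical risk table, reusing the three examples from the session.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}
FIELD_RISK = {
    "city_of_residence": "low",
    "glucose_level": "medium",
    "hiv_status": "high",
}


def minimize(raw, max_risk):
    """Keep the birth year plus classified fields at or below `max_risk`.

    Anything not listed in FIELD_RISK (such as a name) is dropped by default.
    """
    minimized = {"birth_year": raw["date_of_birth"].year}  # no month or day
    for field, risk in FIELD_RISK.items():
        if field in raw and RISK_ORDER[risk] <= RISK_ORDER[max_risk]:
            minimized[field] = raw[field]
    return minimized


# Made-up example record.
raw = {
    "name": "Example Person",
    "date_of_birth": date(1990, 5, 17),
    "city_of_residence": "Nairobi",
    "glucose_level": 5.4,
    "hiv_status": "negative",
}

stored = minimize(raw, max_risk="medium")
print(stored)
```

Dropping unclassified fields by default is a deliberate choice in this sketch: a new field has to be assigned a risk level before it can be stored at all.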
It goes without saying that we collectively have ethical responsibilities to protect personal information about vulnerable people – here are final takeaways:
- If you can’t protect sensitive information, don’t collect it.
- If you can’t keep up with current security practices, outsource your M&E systems to someone who can.
- Your technology roadmap should aspire to give control of personal information to the people who provide it (a substantial undertaking).
- In the meantime, be more transparent about how data is being stored and shared.
- Continue the conversation by visiting https://responsibledata.io/blog
Register for MERL Tech London, March 19-20th 2018! Session ideas due November 10th.