
Improve Data Literacy at All Levels within Your Humanitarian Programme

This post is by Janna Rous at Humanitarian Data. The original was published here on April 29, 2018.

Imagine this picture of data literacy at all levels of a programme:

You’ve got a “donor visit” to your programme. The country director and a project officer accompany the donor on a field trip, and they all visit a household within one of the project communities.  Sitting around a cup of tea, they start a discussion about data.  The household members explain what data has been collected and why. The country director explains what surprised him or her in the data.  And the donor discusses how they decided to fund the programme based on the data.  What if no one was surprised by the discussion, or by how the data was used, because they’d ALL seen and understood the data process?

Data literacy can mean lots of different things depending on who you are.  It could mean knowing how to:

  • collect, analyze and use data;
  • make sense of data and use it for management;
  • validate data and be critical of it;
  • tell good data from bad and know how credible it is;
  • ensure everyone is confident talking about data.


“YES” data literacy is a priority!  Poor data literacy is still a huge stumbling block for many people in the sector and needs to be improved at ALL levels – from community households to field workers to senior management to donors.  However, there are a few challenges in how this priority is worded.


Suggesting someone is “illiterate” when it comes to data – that doesn’t sit well with most people.  Many aid workers – from senior HQ staff right down to beneficiaries of a humanitarian programme – are well-educated and successful. Not only are they literate, but most speak 2 or more languages!  So to insinuate “illiteracy” doesn’t feel right.

Illiteracy is insulting…

Many of these same people are not super-comfortable with “data”,  but to ask them if they “struggle” with data, or to suggest they “don’t understand” by claiming they are “data illiterate” is insulting (even if you think it’s true!).

Leadership is enticing…

The language you use is extremely important here.  Instead of “literacy”, should you be talking about “leadership”?  What if you framed it as:  Improving data leadership.  Could you harness the desirability of that skill – leadership – so that workshop and training titles played into people’s egos, instead of attacking their egos?


You might be directly involved with helping to improve data literacy within your own organization.  Here are a few ideas on how to improve general data literacy/leadership:

  • Training and courses around data literacy.

While courses that focus on data analysis using programming languages such as R or Python exist, it might be better to focus on developing skills in more widely used software (such as Excel), which is more sustainable. Because of the high staff turnover within your sector, complex data analysis cannot normally be sustained once an advanced analyst leaves the field.

  • Donor funding to promote data use and the use of technology.

While the sector should not only rely on donors for pushing the agenda of data literacy forward, money is powerful.  If NGOs and agencies are required to show data literacy in order to receive funding, this will drive a paradigm shift in becoming more data-driven as a sector.  There are still big questions on how to fund interoperable tech systems in the sector to maximize the value of that funding in collaboration between multiple agencies.  However, donors who can provide structures and settings for collaboration will be able to promote data literacy across the sector.

  • Capitalize on “trendy” knowledge – what do people want to know about because it makes them look intelligent?

In 2015/16, everyone wanted to know “how to collect digital data”.  A couple of years later, most people had shifted – they wanted to know “how to analyze data” and “make a dashboard”.  Now in 2018, GDPR, “Responsible Data” and “Blockchain” are trending – people want to know about them so they can talk about them.  While “trends” aren’t all we should be focusing on, they can often be the hook that gets people at all levels of our sector interested in taking their first steps forward in data literacy.


Data literacy means something completely different depending on who you are, your perspective within a programme, and what you use data for.

To the beneficiary of a programme…

data literacy might just mean understanding why data is being collected and what it is being used for.  It means having the knowledge and power to give and withhold consent appropriately.

To a project manager…

data literacy might mean understanding indicator targets, progress, and the calculations behind those numbers, in addition to how different datasets relate to one another in a complex setting.  Managers need to understand how data is coming together so that they can ask intelligent questions about their programme dashboards.

To an M&E officer…

data literacy might mean an understanding of statistical methods, random selection methodologies, how significant a result may be, and how to interpret results of indicator calculations.  They may need to understand uncertainty within their data and be able to explain this easily to others.

To the Information Management team…

data literacy might mean understanding how to translate programme calculations into computer code.  They may need to create data collection or data analysis or data visualization tools with an easy-to-understand user-interface.  They may ultimately be relied upon to ensure the correctness of the final “number” or the final “product”.

To the data scientist…

data literacy might mean understanding some very complex statistical calculations, using computer languages and statistical packages to find trends, insights, and predictive capabilities within datasets.

To the management team…

data literacy might mean being able to use data results (graphs, charts, dashboards) to explain needs, results, and impact in order to convince and persuade. That could mean using data in proposals to justify why a programme should exist, using data to explain progress to the board of directors, or using it as a basis for why a new programme should start up... or close down.

To the donor…

data literacy might mean an understanding of a “good” needs assessment vs. a “poor” one in evaluating a project proposal, how to prioritize areas and amounts of funding, how to ask tough questions of an individual partner, how to be suspicious of numbers that may be too good to be true, how to evaluate quality vs. quantity, or how to see areas of collaboration between multiple partners.  They need to use data to communicate international priorities to their own wider government, board, or citizens.

Use more precise wording

Data literacy means something different to everyone, so this priority can be interpreted in many different ways depending on who you are.  Within your organization, frame this priority with more precise wording.  Here are some examples:

  • Improve everyone’s ability to raise important questions based on data.
  • Let’s get better at discussing our data results.
  • Improve our leadership in communicating the meaning behind data.
  • Develop our skills in analyzing and using data to create an impact.
  • Improve our use of data to inform our decisions.

This blog article was based on a recent session at MERL Tech UK 2018.  Thanks to the many voices who contributed ideas.  I’ve put my own spin on them to create this article – so if you disagree, the ideas are mine.  And if you agree – kudos to the brilliant people at the conference!


Register now for MERL Tech Jozi, August 1-2 or MERL Tech DC, September 6-7, 2018 if you’d like to join the discussions in person!


MERL Tech 101: Google forms

by Daniel Ramirez-Raftree, MERL Tech volunteer

In his MERL Tech DC session on Google Forms, Samhir Vesdev from IREX led a hands-on workshop on Google Forms and laid out some of the software’s capabilities and limitations. Much of the session focused on Google Forms’ central concepts and the practicality of building a form.

At its most fundamental level, a form is made up of several sections, and each section is designed to contain a question or prompt. The centerpiece of a section is the question cell, which is, as one would imagine, the cell dedicated to the question. Next to the question cell there is a drop down menu that allows one to select the format of the question, which ranges from multiple-choice to short answer.

At the bottom right-hand corner of the section you will find three dots arranged vertically. When you click this toggle, a drop-down menu will appear. The options in this menu vary depending on the format of the question. One common option is to include a few lines of description, which is useful in case the question needs further elaboration or instruction. Another is the data validation option, which restricts the kinds of text that a respondent can input. This is useful when, for example, the question is in a short answer format but the form administrators need the responses to be limited to numerals for the sake of analysis.
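To make the “numerals only” idea concrete, here is a minimal Python sketch of the same kind of rule applied to responses you have already collected. It is not how Google Forms implements validation; the function name `is_numeric_response` and the sample answers are made up for illustration.

```python
import re

# Hypothetical rule: accept whole numbers only, similar in spirit to a
# "numbers only" validation on a short answer question.
NUMERALS_ONLY = re.compile(r"[0-9]+")

def is_numeric_response(text: str) -> bool:
    """Return True if the response consists of digits only (ignoring outer spaces)."""
    return NUMERALS_ONLY.fullmatch(text.strip()) is not None

# Made-up answers to a short answer question.
responses = ["12", " 7 ", "seven", "3.5", ""]
valid = [r for r in responses if is_numeric_response(r)]
print(valid)
```

A check like this can be handy for cleaning data that was collected before validation was switched on.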

The session also covered functions available in the “response” tab, which sits at the top of the page. Here one can find a toggle labeled “accepting responses” that can be turned off or on depending on the needs for the form.

Additionally, in the top right corner of this tab, there are three dots arranged vertically, and this is the options menu for this tab. Here you will find options such as enabling email notifications for each new response, which can be used in case you want to be alerted when someone responds to the form. Also in this drop-down, you can click “select response destination” to link the Google Form with Google Sheets, which simplifies later analysis. The green Sheets icon next to the options drop-down will take you to the sheet that contains the collected data.
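As a rough illustration of the “later analysis” step, the sketch below tallies answers from a sheet that has been downloaded as a CSV file. The column names and rows are invented for the example, and the CSV is embedded as a string so the snippet runs on its own; with a real export you would read the file instead.

```python
import csv
import io
from collections import Counter

# Made-up export for illustration; a real sheet would be downloaded as CSV.
SAMPLE_CSV = """Timestamp,Did you like the workshop?
2018-01-01 10:00,yes
2018-01-01 10:05,no
2018-01-01 10:09,Yes
"""

def tally(csv_text: str, column: str) -> Counter:
    """Count how often each answer appears in the given column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Normalize spacing and capitalization so "Yes" and "yes" count together.
    return Counter(row[column].strip().lower() for row in reader)

counts = tally(SAMPLE_CSV, "Did you like the workshop?")
print(counts)
```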

Other capabilities in Google Forms include the option for changing the color scheme, which you can access by clicking the palette icon at the top of the screen. Also, by clicking the settings button at the top of the screen you can limit each person to one response, so that no one can skew the data by submitting multiple responses, or you can enable response editing to allow respondents to go back in and correct their response after submitting it.

Branching is another important tool in Google Forms. It can be used in the case that you want a particular response to a question (say, a multiple choice question) to lead the respondent to another related question only if they respond in a certain way.

For example, suppose that in one section you ask “did you like the workshop?” with the answer options “yes” and “no,” and you want to know what respondents didn’t like only if they answer “no.” You can design the form to take the respondent to a section with the question “what didn’t you like about the workshop?” only when they answer “no,” and then bring the respondent back to the main workflow after they’ve answered this additional question.

To do this, create at least two new sections (by clicking “add section” in the small menu to the right of the sections), one for each path that a person’s response will lead them down. Then, in the options menu on the lower right hand side select “go to section based on answer” and using the menu that appears, set the path that you desire.
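The routing logic behind branching can be sketched in a few lines of Python. This is only a model of the idea, not anything Google Forms exposes: the section names, the `FORM` dictionary, and the `walk_form` helper are all hypothetical.

```python
# Hypothetical model of a form: each section has a question and a routing
# table mapping an answer to the next section ("*" matches any answer,
# and None marks the end of the form).
FORM = {
    "liked": {
        "question": "Did you like the workshop?",
        "routes": {"yes": "wrap_up", "no": "dislikes"},
    },
    "dislikes": {
        "question": "What didn't you like about the workshop?",
        "routes": {"*": "wrap_up"},
    },
    "wrap_up": {
        "question": "Any other comments?",
        "routes": {"*": None},
    },
}

def walk_form(form, start, answers):
    """Return the section ids a respondent passes through, in order."""
    path = []
    current = start
    while current is not None:
        path.append(current)
        routes = form[current]["routes"]
        answer = answers.get(current, "")
        # Use the route for this exact answer, else fall back to "*".
        current = routes.get(answer, routes.get("*"))
    return path

print(walk_form(FORM, "liked", {"liked": "yes"}))
print(walk_form(FORM, "liked", {"liked": "no", "dislikes": "too long"}))
```

Only the respondent who answers “no” passes through the extra section, and both paths rejoin the main workflow at the same place.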

These are just some of the tools that Google Forms offers, but with just these it is possible to build an effective form to collect the data you need. Samhir ended with a word of caution that Google has been known to shut down popular apps, so you should be wary about building an organization strategy around Google Forms.

Qualitative Coding: From Low Tech to High Tech Options

by Daniel Ramirez-Raftree, MERL Tech volunteer

In their MERL Tech DC session on qualitative coding, Charles Guedenet and Anne Laesecke from IREX together with Danielle de Garcia of Social Impact offered an introduction to the qualitative coding process followed by a hands-on demonstration on using Excel and Dedoose for coding and analyzing text.

They began by defining content analysis as any effort to make sense of qualitative data that takes a volume of qualitative material and attempts to identify core consistencies and meanings. More concretely, it is a research method that uses a set of procedures to make valid inferences from text. They also shared their thoughts on what makes for a good qualitative coding method.

They believe a good method should:

  • consider what is already known about the topic being explored
  • be logically grounded in this existing knowledge
  • use existing knowledge as a basis for looking for evidence in the text being analyzed

With this definition laid out, they moved to a discussion about the coding process where they elaborated on four general steps:

  1. develop codes and a codebook
  2. decide on a sampling plan
  3. code your data (then go back and do it again!)
  4. test for reliability

Developing codes and a codebook is important for establishing consistency in the coding process, especially if there will be multiple coders working on the data. A good way to start developing these codes is to consider what is already known. For example, you can think about literature that exists on the subject you’re studying. Alternatively, you can simply turn to the research questions the project seeks to answer and use them as a guide for creating your codes. Beyond this, it is also useful to go through the content and think about what you notice as you read. Once a codebook is created, it will lend stability and some measure of objectivity to the project.

The next important issue is the question of sampling. When determining sample size, though a larger sample will yield more robust results, one must of course consider the practical constraints of time, cost and effort. Does the benefit of higher quality results justify the additional investment? Fortunately, the type of data will often inform sampling. For example, if there is a huge volume of data, it may be impossible to analyze it all, but it would be prudent to sample at least 30% of it. On the other hand, usually interview and focus group data will all be analyzed, because otherwise the effort of obtaining the data would have gone to waste.

Regarding sampling method, the session leads highlighted two strategies that produce sound results. One is systematic random sampling and the other is quota sampling, a method used to ensure that demographic groups are represented in fair proportion.
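Here is a minimal Python sketch of both strategies. The function names and sample data are hypothetical. Note one assumption in particular: the quota example fills each group's quota by random draw in proportion to the group's size, which is only one way to fill a quota.

```python
import random

def systematic_sample(items, fraction, seed=None):
    """Pick every k-th item after a random start, where k is about 1/fraction."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    step = max(1, round(1 / fraction))
    start = random.Random(seed).randrange(step)
    return items[start::step]

def quota_sample(records, group_of, total, seed=None):
    """Draw from each group in proportion to its share of all records.

    Assumption: quotas are filled by random draw within each group.
    """
    rng = random.Random(seed)
    groups = {}
    for rec in records:
        groups.setdefault(group_of(rec), []).append(rec)
    sample = []
    for members in groups.values():
        quota = round(total * len(members) / len(records))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample

# Made-up data: 100 documents, and 100 respondents (60 women, 40 men).
documents = list(range(100))
picked = systematic_sample(documents, 0.3, seed=1)

people = [("woman", i) for i in range(60)] + [("man", i) for i in range(40)]
panel = quota_sample(people, lambda p: p[0], total=10, seed=1)
print(len(picked), len(panel))
```

With a fraction of 0.3 the step rounds to every third item, so the systematic sample covers roughly a third of the documents.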

Once these key decisions have been made, the actual coding can begin. Here, all coders should work from the same codebook and apply the codes to the same unit of analysis. Typical units of analysis are: single words, themes, sentences, paragraphs, and items (such as articles, images, books, or programs). Consistency is essential. A way to test the level of consistency is to have a 10% overlap in the content each coder analyzes and aim for 80% agreement between their coding of that content. If the coders are not applying the same codes to the same units, this could mean either that they are not trained properly or that the codebook needs to be altered.
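The agreement check can be worked through in a short Python sketch. It computes simple percent agreement on the units both coders analyzed; the unit ids, code labels, and the `percent_agreement` helper are invented for illustration.

```python
def percent_agreement(codes_a, codes_b):
    """Share of overlapping units to which both coders applied the same code."""
    shared = codes_a.keys() & codes_b.keys()
    if not shared:
        raise ValueError("the coders have no units in common")
    matches = sum(1 for unit in shared if codes_a[unit] == codes_b[unit])
    return matches / len(shared)

# Made-up codes for the overlapping content (unit id -> code applied).
coder_a = {"u1": "access", "u2": "cost", "u3": "trust", "u4": "cost", "u5": "access"}
coder_b = {"u1": "access", "u2": "cost", "u3": "cost", "u4": "cost", "u5": "access",
           "u6": "trust"}  # u6 was coded by coder B only, so it is ignored

agreement = percent_agreement(coder_a, coder_b)
print(f"{agreement:.0%} agreement; target met: {agreement >= 0.8}")
```

Here the coders agree on four of the five shared units, which just meets the 80% target.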

In a similar vein, the fourth step in the coding process is to test for reliability. Challenges in producing stable and consistent results in coding could include: using a unit of analysis that is too large for a simple code to be reliably applied, coding themes or concepts that are ambiguous, and coding nonverbal items. For each of these, the central problem is that the units of analysis leave too much room for subjective interpretation that can introduce bias. Having a detailed codebook can help to mitigate this.

After giving an overview of the coding process, the session leads suggested a few possible strategies for data visualization. One is to use a word tree, which helps one look at the context in which a word appears. Another is a bubble chart, which is useful if one has descriptive data and demographic information. Thirdly, correlation maps are good for showing what sorts of relationships exist among the data. The leads suggested visiting the website stephanieevergreen.com/blog for more ideas about data visualization.
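A word tree is built from the contexts in which a word appears. The Python sketch below only gathers that raw material, the words to the left and right of each occurrence, and does not draw the tree itself. The function name and sample text are hypothetical.

```python
import re

def keyword_in_context(text, keyword, window=3):
    """Return (left words, keyword, right words) for each occurrence of keyword."""
    words = re.findall(r"[\w']+", text.lower())
    target = keyword.lower()
    hits = []
    for i, word in enumerate(words):
        if word == target:
            left = words[max(0, i - window):i]
            right = words[i + 1:i + 1 + window]
            hits.append((" ".join(left), word, " ".join(right)))
    return hits

# Made-up interview excerpt.
excerpt = "The training was useful. More training would help the team."
for left, word, right in keyword_in_context(excerpt, "training", window=2):
    print(f"{left:>12} | {word} | {right}")
```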

Finally, the leads covered low-tech and high-tech options for coding. On the low-tech end of the spectrum, paper and pen get the job done. They are useful when there are few data sources to analyze, when the coding is simple, and when there is limited tech literacy among the coders. Next up the scale is Excel, which works when there are few data sources and when the coders are familiar with Excel. Then the session leads closed their presentation with a demonstration of Dedoose, which is a qualitative coding tool with advanced capabilities like the capacity to code audio and video files and specialized visualization tools. In addition to Dedoose, the presenters mentioned NVivo and ATLAS.ti as other available qualitative coding software.

Despite the range of qualitative content available for analysis, there are a few core principles that can help ensure that it is analyzed well. These include consistency and disciplined methodology. And if qualitative coding will be an ongoing part of your organization’s operations, there are several options for specialized software that are available for you to explore. [Click here for links and additional resources from the session.]