January 20, 2023

Natural Language Processing Working Group kicks off!

*Natural Language Processing according to ChatGPT.*

On January 19th we kicked off the first meeting of the Natural Language Processing Working Group (NLPWG) (formerly the Text Analytics Working Group).

What is Natural Language Processing (NLP)?

Natural Language Processing, or “NLP,” is a branch of AI where the aim is for computers to understand and draw insights from human language, text, and speech data. It has come into focus in the public eye most recently with the release of “ChatGPT.” Many NLP tools are built with machine learning, which involves teaching a computer using a training data set and then asking it to make predictions for new data. (For a fairly accessible overview of how these models work, their potential uses, and their considerable societal impacts, see this paper.)

Or simply ask ChatGPT!
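
For readers who like to see things in code, the "teach with a training data set, then predict for new data" idea can be sketched in a few lines of Python. This is only a toy illustration, not how ChatGPT or any production NLP system works: it uses a simple word-counting technique (Naive Bayes), and the example sentences, labels, and function names are all made up for this sketch.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (real NLP pipelines do far more)."""
    return text.lower().split()


def train(examples):
    """Learn from labeled examples by counting words under each label."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of examples
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts


def predict(model, text):
    """Return the label whose training words best match the new text."""
    word_counts, label_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from how common the label was in the training data.
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, hand-written feedback comments used as training data.
training_data = [
    ("the training was helpful and well organized", "positive"),
    ("staff were friendly and the clinic was clean", "positive"),
    ("the water point is broken and nobody came", "negative"),
    ("long delays and poor communication", "negative"),
]

model = train(training_data)
print(predict(model, "the staff were helpful"))  # prints "positive"
```

With only four training sentences the model is easy to fool, which is itself a useful reminder: what a model "learns" depends entirely on the data it was trained on.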

How can NLP support the MERL sector?

NLP opens up a number of exciting possibilities for monitoring, evaluation, research and learning (MERL). As models become more advanced and better trained, they might support MERL practitioners with a whole range of tasks.

Though NLP is advancing quickly, we still need humans. We are certainly not advocating for humans to be cut out of the MERL equation! One question that needs answering is which MERL-related tasks are best suited to the ‘skill set’ of NLP. As we’ve seen in the past, new technologies don’t always realize their promise of reducing our workload (hello, email!), and there is a tendency for cool new technologies to be something of a “solution looking for a problem.” We’ll need to be aware of these tendencies and of other, wider concerns too.

Examples of how NLP could support MERL (from a presentation by Matt McConnachie at the NLPWG meeting).

Concerns with the use of NLP

Multiple concerns have been raised over the past several years about the use of Artificial Intelligence (AI), Machine Learning and predictive analytics. These issues apply to NLP as well. Some key areas that have been flagged include the prominence of racial and gender bias; opaque or “black box” models; a bias towards the English language; and limited pre-trained models for specific nuanced contexts (such as for MERL and for the countries where development and humanitarian programs operate). In addition, fine-tuning models can be costly and out of reach for many non-profit organizations, leading to exclusion of smaller, less-resourced organizations and to consolidation and inequitable ‘capture’ of the profession (and data) by larger agencies and the private sector.

Adverse uses of AI (including NLP) include surveillance. The massive amounts of data captured and used, and the insights gleaned through use of AI and NLP, can violate privacy rights or be used in harmful ways. The production of fake news and rapidly proliferating misinformation are also a concern. ChatGPT, for example, easily produces “fake evidence.” Property rights over the data sets being used to feed NLP models are being challenged. Environmental impacts and energy use for developing and running models can be high, and human rights and labor issues for the (often low-paid) humans who train NLP models have come into greater focus recently. Finally, the wider impacts of NLP and AI on society need more scrutiny. In short, our societies have not yet learned how to effectively and fairly govern this technology to reduce the risks it poses to our lives.

Why a working group on NLP and MERL?

As we’ve outlined above, NLP offers a host of possible benefits to the MERL sector, yet concerns need to be addressed if NLP is to be used safely, ethically, and responsibly in our work. MERL practitioners can play a role in assessing and evaluating the impact of NLP and related tools on individuals, groups and society. In order to do this, however, we need to better understand how AI and NLP function and where to look for risks, harms, biases and other types of rights violations.

The hope for the NLP Working Group is that together, MERL practitioners and data scientists can:

Raise awareness and understanding about what NLP technologies can do for MERL and how to use these technologies ethically and responsibly, especially in Global South contexts.
Strengthen networks between MERL practitioners (including data scientists) to assist with the uptake and scaling of NLP technologies through the co-development of tools and systems and the sharing of open data and code.

Over 100 people [update: 170 people] from around the world have signed up for the NLP Working Group. This diverse group brings a wide range of perspectives, skills, knowledge, and lived experiences that can help to shape the MERL sector’s work in this arena.

Join the NLP Working Group

The NLPWG is still open to new members. To join the group, send an email to Linda with “join NLPWG” as the subject line and we’ll add you to the list! We welcome all levels of MERL and NLP experience and skills and are especially interested in augmenting representation from majority world countries. New members will have access to recordings and other resource materials from past meetings.

We’ll share more information soon, as we develop topics, plans and other support for the group’s work, and we hope you’ll join us in exploring this area! If you have questions or ideas for the NLPWG, please get in touch.
