What’s happening with GenAI Ethics and Governance?
On April 18th, the NLP CoP Ethics & Governance Working Group hosted a virtual meeting, bringing together over 50 global development professionals, including digital strategists, MERL practitioners, program leads, consultants, academics and founders. As this was our first session, we wanted to give participants an overview of the work already underway within the sector to grapple with the new and constantly shifting ethical and regulatory landscape transformed by the advent of more complex forms of AI.
Our speakers were:
- Elizabeth Shaughnessy, Digital Programmes Lead at Oxfam GB, discussing the Humanitarian AI Code of Conduct developed by members of the NetHope AI Working Group;
- Natasha Beale, Associate Director for the Evaluation and Learning Unit at The Asia Foundation, sharing her organization’s approach to developing internal guidance for staff;
- And myself, sharing the MERL Tech Initiative’s work creating ethical guidelines to support Girl Effect’s GenAI-powered sexual health chatbots.
By moving from overarching principles to more practical use cases for the safe and ethical use of AI, we hoped to give participants a foundational understanding of the emerging considerations and advice. We also wanted to open the floor to participants, inviting them to share the questions and tasks they were confronting in their professional (and personal) lives, so that we could map the needs of the community and plan future activities and resources.
Co-developing a Humanitarian AI Code of Conduct
Kicking off our presentations, Elizabeth shared the core tenets of the Humanitarian AI Code of Conduct, a set of guiding principles developed by NetHope members in the past year. The Code is intended as a framework that reflects leading concerns for organizations, while remaining a flexible tool that can adapt to the pace of AI development. Interestingly, Elizabeth’s leadership in this initiative is rooted in her background working in biometrics, the use (and misuse) of which dates back 20 years, to before the responsible data movement had taken shape. The parallels with AI are clear: both involve the harvesting, storage, manipulation, and often monetization, of extremely sensitive data. Both also bring the realization that while we are unlikely to stop the proliferation of this new form of technology, we can come together as a sector to anticipate and regulate some of the major risks for ourselves and the community members we aim to support.
The Code of Conduct itself is firmly rooted in humanitarian principles, including the principle of Do No Harm and the need to have a net positive impact on both the organizations deploying AI tools and the communities they serve – “not just non-maleficence, but beneficence too”. At the same time, it recommends that organizations’ use of AI and related data practices must be fair, inclusive, accessible, and feminist. Additionally, it must mitigate, not exacerbate, bias, and ensure transparency and explainability (meaning that a machine learning model and its output can be explained in a way that “makes sense” to a human being).
The Code of Conduct also provides explicit commitments linked to the most high-risk concerns, including refraining from using GenAI to generate photo-realistic images of children or programme participants for purposes of publication, and making sure to review, contextualize and attribute content generated by or with the help of AI for publication. Finally, it underlines the importance of prioritizing safeguarding and child protection, where necessary establishing additional guardrails to protect the most vulnerable community members. Elizabeth explained that the final section of the Code of Conduct emphasizes the importance of close and continued collaboration across the sector in order to build capacity and skills equitably and to ensure alignment on technical activities. This includes areas of due diligence, procurement of tools and vendors, and the responsible use of data.
An additional point, which I was particularly interested in given my work on User Experience Design, was a commitment to understand and address the continued relevance of ‘informed consent’ in the context of Large Language Models, and whether it can still be meaningfully used as a basis for collecting personal data. Put more simply: when the developers of LLMs themselves sometimes barely understand how their tools use the data they process to generate outputs, how can we expect a user to understand it, let alone give consent? The work of organizations such as Here I Am, whose platform Fatima explicitly breaks the informed consent process down into manageable bites, goes some way to addressing this issue. Clearly, more needs to be done to address this paradigm shift, for the benefit of all of us, not just the community members we support.
What do internal guidelines for using GenAI responsibly look like?
As well as seeking guidance to inform our overarching approaches as a sector, many of us are concerned with how AI, specifically GenAI, is already affecting our day-to-day practices. Many organizations have already started issuing guidelines, including the BBC, whose recently released guidance was shared by a colleague from BBC Media Action during the meeting. Our next presenter, Natasha Beale, shared The Asia Foundation’s Internal Considerations for GenAI Use, which had been developed with a mandate to approach GenAI through the lens of data responsibility. It aims to advise staff on how to use AI ethically and effectively at work. This first draft set of Considerations will soon expand to cover the use of enterprise-level tools such as Microsoft Copilot and procurement, as well as MERL processes and considerations for grantee and vendor selection.
The Asia Foundation’s guidance includes an overview of the risks of using GenAI: the reinforcement of bias and perpetuation of discrimination, the harm caused by inaccurate or harmful responses, the reputational and legal risks of data privacy infringements, and less tangible harms related to wider ethical and societal concerns.
Natasha emphasized that their guidance had taken into consideration different types of ‘Users’, i.e. the persons or groups digesting and attempting to implement its advice. This is an important and much-neglected factor when developing guidance, as it ensures that specific needs and use cases are catered for in the guidance itself. It’s an approach I recommend too, as I’ll go on to explain.
The Asia Foundation’s guidance provides staff with general considerations for using tools such as ChatGPT as part of their work, including advice on effective prompt writing and on checking the answers generated for plagiarism, bias and other potential harms. It also draws very clear red lines on what not to do, including inputting personal or sensitive program-related information (including from program participants!) into a GenAI tool.
At the same time, Natasha pointed out that there are real challenges with implementing these considerations, not least the pace of change. It takes time and capacity to monitor developments and vet new tools and vendors in an NGO environment that already moves slowly as a result of uneven capacity and data resources. She stressed the importance of asking whether your organization has the capacity to implement a policy. One reason that the Asia Foundation developed “Considerations” rather than a “Policy” on GenAI is that Policies are auditable, and it’s important to ensure organizational capacity is developed before moving from Considerations and Guidance to Policy.
Getting specific: ethical guidelines for the use of AI powered chatbots for Comprehensive Sexuality Education
In the final presentation, I shared The MERL Tech Initiative’s own work to develop guidelines for Girl Effect, an INGO which builds media and digital tools to support girls in Africa and Asia, including chatbots for sexual and reproductive health. (See their fantastic AI/ML Vision Document.) Like Natasha, I felt it was important to recognize that in our (understandable) rush to produce guidelines, we can sometimes neglect to consider how they can be designed to be usable by various stakeholders. A data scientist with specialist technical understanding and a program manager with little hands-on experience but significant upstream responsibilities for grant-making and procurement will have very different perspectives, for example. In developing these guidelines, we considered these various roles and responsibilities, spelling out the relevance of the guidance for specific workstreams and breaking down the guidelines in line with project stages (for example discovery, design, build, evaluate).
I reiterated the various ethical risks associated with using GenAI to deliver sexual health support to girls and women, the most salient of which are the potential for contextual irrelevance, unreliability and unreplicability in the answers provided by a GenAI-powered chatbot. I also provided an overview of the top-level recommendations that should act as lodestars for all implementers, whatever their specific role in the development of a sexual health chatbot. These mirrored much of the guidance shared by both Natasha and Elizabeth (which is not a coincidence!): privacy by design, the need to monitor and evaluate more regularly and build in additional safeguarding mechanisms, and the need to prioritize maintenance, not just innovation, when seeking funding.
I also highlighted that many of the top-level recommendations are in no way unique to the GenAI ‘moment’ we are currently reacting to. The need to be collaborative and participatory, to minimize data collection, and to be transparent, accessible and accountable are principles we have been striving to adhere to for more than a decade of digital development practice. We should not lose sight of existing best practices and lessons learned in our attempts to grapple with new forms of communication with our users.
How can the Ethics & Governance Working Group help you?
We wrapped up the meeting by giving participants the chance to ask questions of the presenters. We also asked them to share broader questions or tasks that they were grappling with, so that we can understand better what future events and resources would be most helpful.
The needs that emerged from this consultation are broad. They range from seeking guidance on operational activities and internal capacity building (“How can I develop my own internal AI policy? How can I adapt an existing policy?”), to support for using AI ethically as part of MERL activities (“How can we tell if the AI model/practices of one of our grantees is any good? What questions should we be asking? How can we verify its quality? What is best practice when it comes to using AI tools to analyze quantitative and qualitative program data?”).
We’ll be using these insights to help us organize future events and develop resources and an overall work plan for the Ethics & Governance Working Group. We’ll keep you updated on this as it emerges. In the meantime, I hope you’ll find this detailed recap, and the resources linked throughout and below, useful as you continue your own journey engaging with the ethical implications of AI.
We need funding to enable the Working Group and its members to respond to the various questions raised (both practical and theoretical) and to implement a few specific joint projects. Learn more about how you can support the NLP-CoP and its working groups. If you are interested in sponsoring or supporting this work, please get in touch with Isabelle or Linda – we’d love to discuss this!
- If you would like to add your own questions, concerns, and project ideas to the list of things for the Working Group to tackle, you can still do so here.
- You can access the slides and recording from this event here.
- If you’re not already a Community of Practice member, join here – you’ll also be able to sign up to one or more of our many Working Groups, including GenAI Ethics & Governance, GenAI for Social and Behavior Change, and Humanitarian AI.