Universal and Transferable Adversarial Attacks on Aligned Language Models
Large language models (LLMs) are typically trained on massive text corpora scraped from the internet, which are known to contain a substantial amount of objectionable content. In an attempt to make AI systems better aligned with human values, …