Cataloguing LLM evaluations
The paper proposes a taxonomy of the LLM evaluation landscape comprising five categories: General Capabilities, Domain Specific Capabilities, Safety and Trustworthiness, Extreme Risks, and Undesirable Use Cases.
