Author: Matthew Renze
Published: 2023-04-30

Summary
LLMs can be used to create human-readable explanations for decisions made by AI systems.

Abstract
This paper explores generating Natural Language Explanations (NLEs) for eXplainable AI (XAI) using self-correcting Large Language Models (LLMs). A dataset combining COMPAS data with SHAP feature-attribution values was used to compare a rule-based NLE generator against an LLM pipeline employing few-shot learning, verification, and correction tasks. GPT-4 achieved the highest factual accuracy but was slower and more costly at runtime than GPT-3.5. These findings demonstrate LLMs’ potential for trustworthy AI while highlighting areas for further improvement.
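The generate-verify-correct loop described in the abstract can be sketched roughly as follows. This is a minimal, hypothetical illustration, not the paper's actual implementation: the `llm` callable stands in for a real model call (e.g., GPT-3.5 or GPT-4) and is mocked so the example runs offline, the SHAP values are invented, and the verification check (that the explanation mentions the most influential feature) is a simplified stand-in for the paper's verification task.

```python
def generate_nle(shap_values: dict, llm) -> str:
    """Draft a natural language explanation from feature attributions
    (the few-shot examples that would precede the prompt are omitted)."""
    top = max(shap_values, key=lambda f: abs(shap_values[f]))
    return llm(f"Explain: the strongest factor in this decision was {top}.")

def verify_nle(nle: str, shap_values: dict) -> bool:
    """Simplified verification: does the draft mention the top feature?"""
    top = max(shap_values, key=lambda f: abs(shap_values[f]))
    return top in nle

def self_correcting_nle(shap_values: dict, llm, max_rounds: int = 3) -> str:
    """Generate an NLE, then verify and correct it until it passes."""
    nle = generate_nle(shap_values, llm)
    for _ in range(max_rounds):
        if verify_nle(nle, shap_values):
            break
        nle = llm(f"Correct: this explanation omits the key factor: {nle}")
    return nle

# Mock LLM for demonstration: echoes the body of the prompt.
mock_llm = lambda prompt: prompt.split(": ", 1)[1]

# Invented SHAP values for two illustrative COMPAS-style features.
shap = {"priors_count": 0.42, "age": -0.10}
print(self_correcting_nle(shap, mock_llm))
```

In this sketch the verifier and corrector reuse the same LLM interface, which mirrors the self-correcting setup the abstract describes; a real pipeline would use separate verification and correction prompts with few-shot examples.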

Resources

My final project for Values and Ethics in Artificial Intelligence at Johns Hopkins University.