Evaluating and Mitigating Factual Inconsistencies in Language Generation

Abstract

Despite the large scale of computation and data used to train them, current large language models (LLMs) still generate hallucinations and factual inconsistencies in unpredictable ways. In this talk, I will present work on understanding the types of factual errors in generated text and on evaluating and mitigating them. I will first discuss the problem of hallucinations in language generation, cover a couple of works that quantify and categorize these errors, and present a method for post-editing model-generated text to correct diverse types of factual errors. I will then briefly present an approach to pretraining models with fact-oriented synthetic data to improve factual error detection across domains. I will conclude with some open questions and directions for future work in this space.

Location: Remote
Vidhisha Balachandran
Graduate Student at Language Technologies Institute