Keyphrase extraction aims to automatically extract a list of "important" phrases representing the key concepts in a document. Prior approaches to unsupervised keyphrase extraction resort to heuristic notions of phrase importance based on embedding similarities or graph centrality, which require extensive domain expertise to develop. Our work presents an alternative operational definition: phrases that are most useful for predicting the topic of a text are keyphrases. To this end, we propose INSPECT, a self-explaining neural framework for identifying influential keyphrases by measuring the predictive impact of input phrases on the downstream task of topic classification. We show that this novel approach not only alleviates the need for ad-hoc heuristics but also achieves state-of-the-art results in unsupervised keyphrase extraction on 3 out of 4 diverse datasets across two domains: scientific publications and news articles. Ultimately, our study suggests a new usage of interpretable neural networks as an intrinsic component in NLP systems, and not only as a tool for explaining model predictions to humans.
Unsupervised Keyphrase Extraction via Interpretable Neural Networks
Rishabh Joshi*, Vidhisha Balachandran*, Emily Saldanha, Maria Glenski, Svitlana Volkova, Yulia Tsvetkov