3 NLP Interpretability Tools For Debugging Language Models

As language models keep setting new performance records on many NLP tasks, they have also become complex and hard to debug. Researchers and engineers often can’t easily answer questions like: 
Why did your model make that prediction?  
Does your model have any algorithmic biases?
What kind of data samples does your model perform poorly on?
NLP interpretability tools help researchers and practitioners make more effective fine-tuning decisions on language models while saving time and resources. With the right toolkit, researchers can spend less time experimenting with different techniques and input data, and come away with a better understanding of model behavior, strengths, and limitations.
In this article, we feature three interpretability tools that can assist NLP researchers and engineers in debugging their models.
Language Interpretability Tool (LIT) 
The Google Research team, headed by Ian Tenney and James Wexler, recently introduced a new tool for visualization and analysis of NLP models. Their Language Interpretability Tool (LIT) has a browser-based user interface, making it accessible to practitioners of different skill levels. 

With its beginner-friendly interface that requires only a few lines of code to add models and data, LIT provides users with the ability to:

Explore the dataset interactively using different criteria and projections.
Find and explore interesting data points, such as outliers or off-diagonal groups from a confusion matrix.
Explain local behavior by deep-diving into model behavior on individual data points.
Generate new data points on the fly, either manually via edits or automatically using a range of counterfactual generators (e.g., back-translation, adversarial attacks).
Compare two or more models side by side on the same data, or the same model on two data points.
Compute metrics for the whole dataset, current selection, or some specific slices.
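
To make two of these capabilities concrete, here is a minimal sketch in plain Python of what "off-diagonal groups from a confusion matrix" and "metrics for specific slices" mean in practice. It does not use LIT's API. The toy examples, the negation slice, and all function names are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical toy predictions: (text, gold label, predicted label).
EXAMPLES = [
    ("great movie", "pos", "pos"),
    ("not bad at all", "pos", "neg"),
    ("terrible plot", "neg", "neg"),
    ("I did not hate it", "pos", "neg"),
    ("boring and long", "neg", "neg"),
    ("what a waste of time", "neg", "pos"),
]


def confusion_groups(examples):
    """Group example texts by their (gold, predicted) label pair."""
    groups = defaultdict(list)
    for text, gold, pred in examples:
        groups[(gold, pred)].append(text)
    return dict(groups)


def off_diagonal(groups):
    """Keep only the cells where the prediction disagrees with the gold label."""
    return {cell: texts for cell, texts in groups.items() if cell[0] != cell[1]}


def accuracy(examples):
    """Fraction of examples whose prediction matches the gold label."""
    if not examples:
        return 0.0
    return sum(gold == pred for _, gold, pred in examples) / len(examples)


def slice_by(examples, predicate):
    """Select the examples whose text satisfies the predicate."""
    return [ex for ex in examples if predicate(ex[0])]


if __name__ == "__main__":
    # Misclassified examples, grouped by the kind of mistake.
    errors = off_diagonal(confusion_groups(EXAMPLES))
    for (gold, pred), texts in sorted(errors.items()):
        print(f"gold={gold} pred={pred}: {texts}")

    # Compare accuracy on the whole dataset with accuracy on one slice.
    negated = slice_by(EXAMPLES, lambda text: "not" in text.lower().split())
    print(f"overall accuracy: {accuracy(EXAMPLES):.2f}")
    print(f"accuracy on negation slice: {accuracy(negated):.2f}")
```

In this toy data, every example containing "not" is misclassified, so the slice metric exposes a weakness that the overall accuracy hides. An interactive tool lets you find this kind of pattern without writing the grouping code yourself.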

The tool is under active development, and its capabilities will likely expand as the NLP research community provides feedback.
Research paper. Google’s research paper, “The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models,” will be presented at EMNLP 2020.
Code. Implementation code and full documentation are available on GitHub.
AllenNLP Interpret
Researchers from the Allen Institute for Artificial Intelligence and the University of California have introduced AllenNLP Interpret, a framework for interpreting NLP models. The tool focuses on instance-level interpretations of two types: gradient-based saliency maps and adversarial attacks. Besides a suite of interpretation techniques implemented for broad classes of models, AllenNLP Interpret includes model- and task-agnostic APIs for developing new interpretation...