Overview
The paper proposes to teach encoder-decoder language models (a Transformer model for the translation task and the PEGASUS model for the summarization task) to abstain when the input is substantially different from the training distribution. Abstaining from generating content (more metaphorically, saying “I don’t know” when the model really does not know) indicates that the system is trustworthy, and this practice improves system safety.
Method
Given a domain-specific dataset \mathcal{D}_1 = \{(x_1, y_1), \cdots, (x_N, y_N)\} and a general-domain dataset \mathcal{D}_0 (for example, C4), the authors fit four Mahalanobis distance metrics on the representations produced by a model f(\cdot). The Mahalanobis distance is defined as \mathrm{MD}(\mathbf{x}; \mu, \Sigma) = (\mathbf{x} - \mu)^T \Sigma^{-1} (\mathbf{x} - \mu); it is equivalent to -\log \mathcal{N}(\mathbf{x}; \mu, \Sigma) up to an additive constant and a multiplicative factor. The four metrics are summarized in the table below, followed by a minimal fitting sketch.
| Notation | Fitting the Distance Metric On |
|---|---|
| $\mathrm{MD}_0(\cdot)$ | $\mathcal{D}_0$ |
| $\mathrm{MD}_\text{input}(\cdot)$ | $\{x_1, \cdots, x_N\}$ |
| $\mathrm{MD}_\text{output}(\cdot)$ | $\{y_1, \cdots, y_N\}$ |
| $\mathrm{MD}_\delta(\cdot)$ | $\{f(x_1), \cdots, f(x_N)\}$ |
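As a rough illustration, here is a minimal numpy sketch of how one such metric could be fitted and queried. `embed` is a hypothetical function mapping a sentence to a fixed-size model representation; the choice of layer and pooling is an assumption, not something specified in these notes.

```python
import numpy as np

def fit_md(reprs: np.ndarray):
    """Fit a Mahalanobis distance metric: estimate mu and Sigma from an
    (N, d) matrix of representations and return a scoring function."""
    mu = reprs.mean(axis=0)
    # A small ridge term keeps Sigma invertible when N is small relative to d.
    sigma = np.cov(reprs, rowvar=False) + 1e-6 * np.eye(reprs.shape[1])
    sigma_inv = np.linalg.inv(sigma)

    def md(x: np.ndarray) -> float:
        diff = x - mu
        return float(diff @ sigma_inv @ diff)

    return md

# Hypothetical usage, mirroring the four metrics in the table above:
# md_0      = fit_md(np.stack([embed(x) for x in c4_sample]))
# md_input  = fit_md(np.stack([embed(x) for x in in_domain_inputs]))
# md_output = fit_md(np.stack([embed(y) for y in in_domain_outputs]))
# md_delta  = fit_md(np.stack([embed(model_decode(x)) for x in in_domain_inputs]))
```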
Then the authors use either of the following two metrics to compute the OOD score of a test sample z, where w is the decoded output for z. The idea of using a relative distance comes from the authors’ previous work on selective classification (see [3]).
- Input Relative MD (RMD) Score: \mathrm{RMD}_\text{input}(z) = \mathrm{MD}_\text{input}(z) - \mathrm{MD}_0(z).
- Output Relative MD (RMD) Score: \mathrm{RMD}_\text{output}(w) = \mathrm{MD}_\text{output}(w) - \mathrm{MD}_\delta(w).
If the scores indicate that the test sample z is an anomaly, the language model abstains from generating the actual output w; instead, it returns preset content such as “I don’t know.”
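Under the same assumptions (a hypothetical `embed` function and a `model.generate` call standing in for the actual decoder), selective generation with the input RMD score could look roughly like this; the threshold `tau` would be chosen on held-out data and is not a value from the paper.

```python
def rmd_input_score(z_text, embed, md_input, md_0):
    """Input relative MD: in-domain distance minus background distance."""
    z_repr = embed(z_text)
    return md_input(z_repr) - md_0(z_repr)

def selective_generate(z_text, model, embed, md_input, md_0, tau):
    """Decode only when the input looks in-domain; otherwise abstain."""
    if rmd_input_score(z_text, embed, md_input, md_0) > tau:  # high score => likely OOD
        return "I don't know."          # preset abstention content
    return model.generate(z_text)       # trusted in-domain input: produce the actual output w
```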
Experiments
- Perplexity alone should not be used for OOD detection because
  - the fitted PDFs of perplexity on different datasets (i.e., domains) mostly overlap (Figure 1), and
  - as the averaged OOD score increases, the Kendall’s \tau between perplexity and the quality measure is (1) low and (2) decreasing (Figure 4); if perplexity were a good measure, the curve would be mostly flat.
  - Nevertheless, perplexity can be combined with the proposed metrics (Sections 4.3 and 4.4).
- The proposed metrics perform differently on different types of tasks: \mathrm{RMD}_\text{input}(\cdot) is more suitable for the translation task, and \mathrm{RMD}_\text{output}(\cdot) is more suitable for the summarization task. This may be because summarization is more “open-ended” than translation.
- The distance between domains can be quantitatively measured with the Jaccard similarity of n-grams (n = 1 through 4 in the paper) (Table A.10). This is used to quantify task difficulty when the authors define “near OOD” and “far OOD” domains (Table 1); a sketch of this similarity is given below.
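A simple sketch of such a similarity measure, assuming whitespace tokenization and a plain average over n = 1..4 (the paper’s exact tokenization and aggregation may differ):

```python
def ngrams(tokens, n):
    """Set of n-grams of a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def jaccard_ngram_similarity(corpus_a, corpus_b, max_n=4):
    """Average Jaccard similarity between the n-gram sets (n = 1..max_n)
    of two corpora; a lower value suggests the domains are farther apart."""
    sims = []
    for n in range(1, max_n + 1):
        a = set().union(*(ngrams(s.split(), n) for s in corpus_a))
        b = set().union(*(ngrams(s.split(), n) for s in corpus_b))
        sims.append(len(a & b) / max(len(a | b), 1))
    return sum(sims) / len(sims)
```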
References
- [1705.08500] Selective Classification for Deep Neural Networks
- [1612.01474] Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles: This paper works on the predictive uncertainty of deep classification models. The proposed approach tries to approximate state-of-the-art Bayesian NNs while being easy to implement and parallelize.
- [2106.09022] A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection: For a classification problem with K classes, we can fit K class-dependent Gaussians and one background Gaussian. These (K+1) Gaussians can then be used to detect anomalies: a negative score for class k indicates that the sample belongs to domain k, while a positive score means it is OOD; the more positive the score, the more the sample deviates from that domain.
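A rough numpy sketch of this (K+1)-Gaussian relative Mahalanobis distance for classification, assuming a shared covariance across the class-conditional Gaussians (a common choice; the paper’s exact fitting details may differ):

```python
import numpy as np

def fit_class_rmd(features: np.ndarray, labels: np.ndarray, num_classes: int, ridge=1e-6):
    """Fit K class-conditional Gaussians (shared covariance) plus one
    background Gaussian on all features; return a per-class RMD scorer."""
    d = features.shape[1]
    mus = np.stack([features[labels == k].mean(axis=0) for k in range(num_classes)])
    pooled = features - mus[labels]                  # center each point by its class mean
    sigma_inv = np.linalg.inv(np.cov(pooled, rowvar=False) + ridge * np.eye(d))
    mu_0 = features.mean(axis=0)
    sigma0_inv = np.linalg.inv(np.cov(features, rowvar=False) + ridge * np.eye(d))

    def rmd(x: np.ndarray) -> np.ndarray:
        md_k = np.array([(x - mus[k]) @ sigma_inv @ (x - mus[k]) for k in range(num_classes)])
        md_0 = (x - mu_0) @ sigma0_inv @ (x - mu_0)
        return md_k - md_0   # negative: in-domain for class k; larger positive: farther OOD

    return rmd
```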