Miscellaneous 2020

Learning explanations that are hard to vary

In this paper, we investigate the principle that `good explanations are hard to vary' in the context of deep learning. We show that averaging gradients across examples -- akin to a logical OR of patterns -- can favor memorization and `patchwork' solutions that sew together different strategies, instead of identifying invariances. To inspect this, we first formalize a notion of consistency for minima of the loss surface, which measures to what extent a minimum appears only when examples are pooled. We then propose and experimentally validate a simple alternative algorithm based on a logical AND, that focuses on invariances and prevents memorization in a set of real-world tasks. Finally, using a synthetic dataset with a clear distinction between invariant and spurious mechanisms, we dissect learning signals and compare this approach to well-established regularizers.
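For context, the "logical AND" idea described in the abstract amounts to updating only those parameter components on which per-example (or per-environment) gradients agree in sign, instead of plainly averaging all gradients. Below is a minimal NumPy sketch of that sign-agreement masking; the function name, array shapes, and `agreement_threshold` parameter are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def and_mask_step(per_example_grads, params, lr=0.01, agreement_threshold=1.0):
    """One gradient step with a sign-agreement ('logical AND') mask.

    per_example_grads: array of shape (n_examples, n_params), one gradient per example.
    agreement_threshold: minimum |mean of gradient signs| for a component to be
        updated (1.0 means all examples must agree on the sign).
    Illustrative sketch of the idea in the abstract, not the paper's exact algorithm.
    """
    signs = np.sign(per_example_grads)                 # (n_examples, n_params)
    mean_sign = signs.mean(axis=0)                     # per-component sign agreement
    mask = np.abs(mean_sign) >= agreement_threshold    # keep only consistent components
    avg_grad = per_example_grads.mean(axis=0)          # ordinary averaged gradient
    return params - lr * mask * avg_grad               # update only where examples agree

# Example: two examples disagree on the second component, so it is not updated.
params = np.array([0.5, -0.3])
grads = np.array([[0.2, 0.4],
                  [0.1, -0.5]])
print(and_mask_step(grads, params))   # only the first component moves
```

With `agreement_threshold=0.0` this reduces to standard gradient averaging, which is the "logical OR" baseline the abstract contrasts against.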

Author(s): Parascandolo, G and Neitz, A and Orvieto, A and Gresele, L and Schölkopf, B
Year: 2020
Bibtex Type: Miscellaneous (misc)
Electronic Archiving: grant_archive

BibTeX

@misc{item_3251193,
  title = {{Learning explanations that are hard to vary}},
  abstract = {{In this paper, we investigate the principle that `good explanations are hard to vary' in the context of deep learning. We show that averaging gradients across examples -- akin to a logical OR of patterns -- can favor memorization and `patchwork' solutions that sew together different strategies, instead of identifying invariances. To inspect this, we first formalize a notion of consistency for minima of the loss surface, which measures to what extent a minimum appears only when examples are pooled. We then propose and experimentally validate a simple alternative algorithm based on a logical AND, that focuses on invariances and prevents memorization in a set of real-world tasks. Finally, using a synthetic dataset with a clear distinction between invariant and spurious mechanisms, we dissect learning signals and compare this approach to well-established regularizers.}},
  year = {2020},
  slug = {item_3251193},
  author = {Parascandolo, G and Neitz, A and Orvieto, A and Gresele, L and Sch\"olkopf, B}
}