Learning explanations that are hard to vary

Miscellaneous 2020

In this paper, we investigate the principle that 'good explanations are hard to vary' in the context of deep learning. We show that averaging gradients across examples -- akin to a logical OR of patterns -- can favor memorization and 'patchwork' solutions that sew together different strategies, instead of identifying invariances. To inspect this, we first formalize a notion of consistency for minima of the loss surface, which measures to what extent a minimum appears only when examples are pooled. We then propose and experimentally validate a simple alternative algorithm based on a logical AND, that focuses on invariances and prevents memorization in a set of real-world tasks. Finally, using a synthetic dataset with a clear distinction between invariant and spurious mechanisms, we dissect learning signals and compare this approach to well-established regularizers.
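
The contrast between averaging gradients and a logical AND can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the exact rule, so the code below assumes that "logical AND" means keeping a gradient component only when every environment agrees on its sign. The function names and the toy gradients are hypothetical.

```python
from typing import List, Sequence


def mean_gradient(grads: Sequence[Sequence[float]]) -> List[float]:
    """Plain average of per-environment gradients (the OR-like baseline)."""
    n = len(grads)
    return [sum(component) / n for component in zip(*grads)]


def and_masked_gradient(grads: Sequence[Sequence[float]]) -> List[float]:
    """Average gradient, zeroed where environments disagree on the sign.

    Assumption: 'logical AND' is read as unanimous sign agreement per
    component; the abstract does not spell out the exact rule.
    """
    averaged = mean_gradient(grads)
    masked = []
    for avg, component in zip(averaged, zip(*grads)):
        all_positive = all(g > 0 for g in component)
        all_negative = all(g < 0 for g in component)
        masked.append(avg if (all_positive or all_negative) else 0.0)
    return masked


# Two hypothetical environments, three parameters each.
per_env_grads = [
    [0.5, -1.0, 2.0],  # environment A
    [1.5, 3.0, 2.0],   # environment B
]
plain = mean_gradient(per_env_grads)
masked = and_masked_gradient(per_env_grads)
print(plain)   # [1.0, 1.0, 2.0]
print(masked)  # [1.0, 0.0, 2.0]
```

In this toy case the second parameter receives a nonzero averaged update even though the two environments pull in opposite directions; under the assumed sign-agreement rule that component is dropped, while the components on which both environments agree are left unchanged.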

Author(s):	Parascandolo, G and Neitz, A and Orvieto, A and Gresele, L and Schölkopf, B
Year:	2020

Bibtex Type:	Miscellaneous (misc)

Electronic Archiving:	grant_archive

BibTeX

@misc{item_3251193,
  title = {{Learning explanations that are hard to vary}},
  abstract = {{In this paper, we investigate the principle that `good explanations are hard to vary' in the context of deep learning. We show that averaging gradients across examples -- akin to a logical OR of patterns -- can favor memorization and `patchwork' solutions that sew together different strategies, instead of identifying invariances. To inspect this, we first formalize a notion of consistency for minima of the loss surface, which measures to what extent a minimum appears only when examples are pooled. We then propose and experimentally validate a simple alternative algorithm based on a logical AND, that focuses on invariances and prevents memorization in a set of real-world tasks. Finally, using a synthetic dataset with a clear distinction between invariant and spurious mechanisms, we dissect learning signals and compare this approach to well-established regularizers.}},
  year = {2020},
  slug = {item_3251193},
  author = {Parascandolo, G and Neitz, A and Orvieto, A and Gresele, L and Sch\"olkopf, B}
}
