A geometric perspective on information plane analysis

Mina Basirat, Bernhard C. Geiger, Peter M. Roth*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Information plane analysis, which tracks the mutual information between the input and a hidden layer and between that hidden layer and the target over the course of training, has recently been proposed as a tool for analyzing the training of neural networks. Because the activations of a hidden layer are typically continuous-valued, this mutual information cannot be computed analytically and must instead be estimated, which has led to apparently inconsistent or even contradictory results in the literature. The goal of this paper is to demonstrate how information plane analysis can still be a valuable tool for analyzing neural network training. To this end, we complement the prevailing binning estimator for mutual information with a geometric interpretation. With this geometric interpretation in mind, we evaluate the impact of regularization and interpret phenomena such as underfitting and overfitting. In addition, we investigate neural network learning in the presence of noisy data and noisy labels.
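To make the binning estimator mentioned in the abstract concrete, the following is a minimal sketch of a fixed-binning estimate of the mutual information between a hidden layer's activations and the class labels. It is an illustration under stated assumptions, not the authors' implementation: the function names, the default of 30 equal-width bins, and the activation range [-1, 1] (as for a tanh layer) are all hypothetical choices made here.

```python
import numpy as np


def discrete_entropy(symbols):
    """Entropy (in bits) of the empirical distribution over the rows of `symbols`."""
    _, counts = np.unique(symbols, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def bin_activations(activations, n_bins, low, high):
    """Fixed binning: replace each activation by the index of its equal-width bin.

    Values outside [low, high] fall into the first or last bin.
    """
    edges = np.linspace(low, high, n_bins + 1)
    return np.digitize(activations, edges[1:-1])


def binned_mutual_information(activations, labels, n_bins=30, low=-1.0, high=1.0):
    """Estimate I(T;Y) = H(T) + H(Y) - H(T,Y), where T is the binned layer output.

    `activations` has shape (n_samples, n_units); `labels` holds one integer
    class label per sample. The bin count and range are assumed defaults.
    """
    t = bin_activations(np.asarray(activations, dtype=float), n_bins, low, high)
    y = np.asarray(labels).reshape(-1, 1)
    joint = np.hstack([t, y])
    return discrete_entropy(t) + discrete_entropy(y) - discrete_entropy(joint)
```

Calling `binned_mutual_information` on the activations recorded after each epoch would give one coordinate of an information plane trajectory. The estimate depends strongly on the chosen bin width, which is one reason such estimates can disagree across studies.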

Original language: English
Article number: 711
Journal: Entropy
Volume: 23
Issue number: 6
Publication status: Published - Jun 2021

Keywords

  • Adaptive and fixed binning
  • Image classification
  • Information plane analysis
  • Neural networks

ASJC Scopus subject areas

  • Information Systems
  • Mathematical Physics
  • Physics and Astronomy (miscellaneous)
  • Electrical and Electronic Engineering
