Stochastic mutual information gradient estimation for dimensionality reduction networks

Ozan Özdenizci, Deniz Erdogmus

Research output: Contribution to journal › Article › peer-review

Abstract

Feature ranking and selection is a widely used approach in various applications of supervised dimensionality reduction in discriminative machine learning. Nevertheless, there is significant evidence that feature ranking and selection algorithms based on any single criterion can lead to sub-optimal solutions for class separability. In that regard, we introduce emerging information theoretic feature transformation protocols as an end-to-end neural network training approach. We present a dimensionality reduction network (MMINet) training procedure based on a stochastic estimate of the mutual information gradient. The network projects high-dimensional features onto an output feature space where the lower dimensional representations carry maximum mutual information with their associated class labels. Furthermore, we formulate the training objective so that it is estimated non-parametrically, without distributional assumptions. We experimentally evaluate our method on high-dimensional biological data sets and relate it to conventional feature selection algorithms, which arise as a special case of our approach.
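
The abstract outlines the core idea: learn a projection of high-dimensional features that maximizes a non-parametric estimate of the mutual information between the projected features and the class labels, with minibatch gradients serving as stochastic estimates of the mutual information gradient. Below is a minimal, hypothetical sketch of that idea in PyTorch; it is not the authors' MMINet implementation. The ProjectionNet module, the Parzen-window entropy estimator, the kernel bandwidth sigma, and the synthetic data are all illustrative assumptions.

import math
import torch
import torch.nn as nn

class ProjectionNet(nn.Module):
    """Maps D-dimensional inputs to a lower d-dimensional feature space."""
    def __init__(self, in_dim, out_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)

def parzen_entropy(z, sigma=1.0):
    """Differentiable Parzen-window (Gaussian kernel) entropy estimate for a minibatch z of shape (n, d).
    Additive normalization constants are dropped; they cancel in the mutual information difference below."""
    n = z.shape[0]
    sq_dists = (z.unsqueeze(1) - z.unsqueeze(0)).pow(2).sum(dim=-1)  # (n, n) pairwise squared distances
    log_kernel = -sq_dists / (2.0 * sigma ** 2)                      # Gaussian kernel in the log domain
    log_density = torch.logsumexp(log_kernel, dim=1) - math.log(n)   # log of the average kernel density at each sample
    return -log_density.mean()                                       # sample estimate of -E[log p(z)]

def mutual_information(z, y, sigma=1.0):
    """I(Z; Y) = H(Z) - sum_c p(c) H(Z | Y = c) for discrete labels y, with plug-in class priors."""
    mi = parzen_entropy(z, sigma)
    for c in torch.unique(y):
        mask = (y == c)
        mi = mi - mask.float().mean() * parzen_entropy(z[mask], sigma)
    return mi

# Usage on synthetic data: the gradient of the minibatch MI estimate acts as a
# stochastic estimate of the mutual information gradient for the projection weights.
torch.manual_seed(0)
X = torch.randn(512, 100)                              # high-dimensional features
y = (X[:, 0] + 0.1 * torch.randn(512) > 0).long()      # labels tied to one informative direction
net = ProjectionNet(in_dim=100, out_dim=2)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(200):
    idx = torch.randint(0, X.shape[0], (128,))         # random minibatch
    loss = -mutual_information(net(X[idx]), y[idx])    # maximize MI = minimize its negative
    opt.zero_grad()
    loss.backward()
    opt.step()

Because the labels are discrete, the conditional entropy term decomposes over classes; in practice the kernel bandwidth sigma would need to be tuned or annealed with the data scale.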
Original language: English
Pages (from-to): 298-305
Number of pages: 8
Journal: Information Sciences
Volume: 570
Publication status: Published - September 2021

Keywords

  • Dimensionality reduction
  • Feature projection
  • Information theoretic learning
  • MMINet
  • Mutual information
  • Neural networks
  • Stochastic gradient estimation

ASJC Scopus subject areas

  • Software
  • Information Systems and Management
  • Artificial Intelligence
  • Theoretical Computer Science
  • Control and Systems Engineering
  • Computer Science Applications

Fields of Expertise

  • Information, Communication & Computing
