k* Distribution: Evaluating the Latent Space of Deep Neural Networks using Local Neighborhood Analysis

Figure: Overview of the three basic patterns of the k* distribution.

Key Contributions:

Identification of various distribution patterns of samples in the latent space based on neighborhood characteristics
A model-agnostic latent space analysis of neural networks, focusing on samples from a single class
A method for straightforwardly comparing different classes and understanding how samples from various classes are distributed in the learned latent space

Abstract

Examinations of the latent spaces learned by neural networks typically rely on dimensionality reduction techniques such as t-SNE or UMAP. While these methods capture the overall distribution of samples across the entire learned latent space, they tend to distort the structure of the sample distributions of specific classes within a subset of that space. This distortion makes it difficult to distinguish the classes that the neural network itself can identify. In response, we introduce the k* Distribution methodology, which uses local neighborhood analysis to capture the characteristics and structure of the sample distribution of an individual class within its subset of the learned latent space. The key idea is to make different k* distributions easy to compare, so that one can analyze how the same neural network processes different classes. This provides a deeper understanding of existing visualizations. Our study reveals three distinct distributions of samples within the learned latent space subset: a) Fractured, b) Overlapped, and c) Clustered. We demonstrate that the distribution of samples within the network's learned latent space varies significantly with the class. Furthermore, we show that our analysis can be applied to explore the latent space of diverse neural network architectures, different layers within a network, transformations applied to input samples, and the distributions of training and testing data. We anticipate that our approach will facilitate more targeted investigations into neural networks by collectively examining the distribution of different samples within the learned latent space.

Method

Figure: Overview of the framework used to create the k* distribution. We use the learned features of a neural network to compute the k* value of each evaluated sample, and then compute the k* distribution for a particular class.

Figure: Illustration of calculating the k* value of a sample and, from these values, the k* distribution of a class.
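
The two steps described in the captions above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation. This page does not spell out the definition of k*, so the sketch assumes that the k* value of a sample is the number of same-class samples that lie closer to it (in Euclidean distance over the learned features) than its nearest sample from any other class. The function names `k_star_values` and `k_star_distribution`, the normalization by class size, and the histogram binning are all hypothetical choices made for this example.

```python
import numpy as np


def k_star_values(features, labels):
    """Assumed definition: for each sample, count the same-class samples
    that are strictly closer than the nearest sample of another class."""
    features = np.asarray(features, dtype=float)  # shape (n_samples, n_dims)
    labels = np.asarray(labels)

    # Pairwise Euclidean distances between all samples.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # same[i, j] is True when samples i and j share a label.
    same = labels[:, None] == labels[None, :]

    # Distance from each sample to its nearest other-class sample.
    other_dist = np.where(same, np.inf, dist).min(axis=1)

    # Same-class neighbors (excluding the sample itself) inside that radius.
    not_self = ~np.eye(len(labels), dtype=bool)
    closer = (dist < other_dist[:, None]) & same & not_self
    return closer.sum(axis=1)


def k_star_distribution(k_star, labels, target_class, bins=10):
    """Histogram of k* values for one class, normalized by class size
    (an assumption) so that classes of different sizes are comparable."""
    labels = np.asarray(labels)
    mask = labels == target_class
    n_class = int(mask.sum())
    normalized = np.asarray(k_star)[mask] / max(n_class - 1, 1)
    hist, edges = np.histogram(normalized, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum(), edges


# Toy usage: two well-separated classes on a line.
toy_features = np.array([[0.0], [1.0], [2.0], [10.0], [11.0]])
toy_labels = np.array([0, 0, 0, 1, 1])
toy_k_star = k_star_values(toy_features, toy_labels)
toy_hist, toy_edges = k_star_distribution(toy_k_star, toy_labels, target_class=0)
```

Under this assumed definition, a class whose normalized k* values concentrate near 1 is tightly grouped away from other classes, while values concentrated near 0 indicate that samples of other classes sit inside its neighborhood. In practice, `features` would be the activations of the layer under study (for example, the logit layer), and the dense pairwise distance matrix would need to be replaced by a nearest-neighbor index for large datasets.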

Visualizations of Latent Space

ResNet-50

Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality reduction techniques (t-SNE, Isomap, PCA, and UMAP) for all classes of 16-class-ImageNet at the logit layer of the ResNet-50 architecture.

ResNeXt-101

EfficientNet-B0

ViT-Base

Logit Layer of ResNet-50

Average Pooling Layer of ResNet-50

ResNet-50 Trained with ImageNet-1k

ResNet-50 Trained with Stylized ImageNet-1k

ResNet-50 Trained with ImageNet-1k + Stylized ImageNet-1k

ResNet-50 without Adversarial Training

ResNet-50 with Adversarial Training

Bibtex

@article{kotyan2024kdistribution,
  title={{{k* Distribution}}: Evaluating the {{Latent Space}} of {{Deep Neural Networks}} Using {{Local Neighborhood Analysis}}},
  shorttitle={{{k* Distribution}}},
  author={Kotyan, Shashank and Ueda, Tatsuya and Vargas, Danilo Vasconcellos},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2024},
  publisher={IEEE}
}