Figure: Overview of the three distinct patterns of the k* distribution.
Key Contributions:
Identification of various distribution patterns of samples in the latent space based on neighborhood characteristics
A model-agnostic latent space analysis of neural networks, focusing on samples from a single class
A method for straightforwardly comparing different classes and understanding how samples from various classes are distributed in the learned latent space
Abstract
Examinations of the learned latent spaces of neural networks typically employ dimensionality-reduction techniques such as t-SNE or UMAP.
While these methods effectively capture the overall sample distribution across the entire latent space, they tend to distort the structure of sample distributions within specific classes, i.e., within subsets of the latent space.
This distortion makes it difficult to distinguish the classes that a neural network can identify.
In response to this challenge, we introduce the k* Distribution methodology.
This approach focuses on capturing the characteristics and structure of sample distributions for individual classes within the subset of the learned latent space using local neighborhood analysis.
The key concept is to facilitate easy comparison of different k* distributions, enabling analysis of how various classes are processed by the same neural network.
This adds a deeper level of understanding to existing visualizations.
Our study reveals three distinct distributions of samples within the learned latent space subset:
a) Fractured, b) Overlapped, and c) Clustered.
We observe and demonstrate that the distribution of samples within the network's learned latent space varies significantly across classes.
Furthermore, we illustrate that our analysis can be applied to explore the latent spaces of diverse neural network architectures, of various layers within a neural network, of transformations applied to input samples, and of the distribution of training and testing data for neural networks.
We anticipate that our approach will facilitate more targeted investigations into neural networks by collectively examining the distribution of different samples within the learned latent space.
Method
Figure: Overview of the framework for creating the k* distribution. We use the learned features of a neural network to compute the k* value of each evaluated sample and then compute the k* distribution for a particular class.
Figure: Illustration of calculating the k* value of a sample and, correspondingly, the k* distribution of a class.
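To make the computation concrete, the following is a minimal sketch of how k* values and a class's k* distribution could be derived from learned features. It assumes, per our reading, that the k* value of a sample is the number of its nearest neighbors (here measured by Euclidean distance in feature space) that share its class before the first differently-labelled neighbor appears; the function names and the brute-force distance computation are illustrative choices, not the authors' reference implementation.

import numpy as np

def k_star_values(features, labels):
    """Return the k* value of every sample.

    Assumption: k*(x) counts the nearest neighbors of x that share
    x's class before the first differently-labelled neighbor.
    """
    n = len(features)
    # Brute-force pairwise Euclidean distances; fine for moderate N,
    # switch to a KD-tree or an approximate index for large datasets.
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbor

    k_star = np.empty(n, dtype=int)
    for i in range(n):
        order = np.argsort(dists[i])  # neighbor indices, nearest first
        mismatch = np.flatnonzero(labels[order] != labels[i])
        # The position of the first differently-labelled neighbor equals
        # the count of same-class neighbors that precede it.
        k_star[i] = mismatch[0] if mismatch.size else n - 1
    return k_star

def k_star_distribution(k_star, labels, cls):
    """The k* distribution of one class: sorted k* values of its samples."""
    return np.sort(k_star[labels == cls])

Comparing classes then reduces to comparing one-dimensional distributions of k* values: intuitively, a class whose samples mostly have large k* values occupies a single well-separated cluster, while a preponderance of small k* values points to fractured or overlapped neighborhoods.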
Visualizations of Latent Space
ResNet-50
Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality-reduction techniques (t-SNE, Isomap, PCA, and UMAP) for all classes of 16-class-ImageNet, at the Logit Layer of ResNet-50.
ResNeXt-101
Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality-reduction techniques (t-SNE, Isomap, PCA, and UMAP) for all classes of 16-class-ImageNet, at the Logit Layer of ResNeXt-101.
EfficientNet-B0
Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality-reduction techniques (t-SNE, Isomap, PCA, and UMAP) for all classes of 16-class-ImageNet, at the Logit Layer of EfficientNet-B0.
ViT-Base
Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality-reduction techniques (t-SNE, Isomap, PCA, and UMAP) for all classes of 16-class-ImageNet, at the Logit Layer of ViT-Base.
Logit Layer of ResNet-50
Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality-reduction techniques (t-SNE, Isomap, PCA, and UMAP) for all classes of 16-class-ImageNet, at the Logit Layer of ResNet-50.
Average Pooling Layer of ResNet-50
Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality-reduction techniques (t-SNE, Isomap, PCA, and UMAP) for all classes of 16-class-ImageNet, at the Average Pooling Layer of ResNet-50.
ResNet-50 Trained with ImageNet-1k
Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality-reduction techniques (t-SNE, Isomap, PCA, and UMAP) for all classes of 16-class-ImageNet, at the Logit Layer of ResNet-50 trained with ImageNet-1k.
ResNet-50 Trained with Stylized ImageNet-1k
Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality-reduction techniques (t-SNE, Isomap, PCA, and UMAP) for all classes of 16-class-ImageNet, at the Logit Layer of ResNet-50 trained with Stylized ImageNet-1k.
ResNet-50 Trained with ImageNet-1k + Stylized ImageNet-1k
Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality-reduction techniques (t-SNE, Isomap, PCA, and UMAP) for all classes of 16-class-ImageNet, at the Logit Layer of ResNet-50 trained with ImageNet-1k + Stylized ImageNet-1k.
ResNet-50 without Adversarial Training
Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality-reduction techniques (t-SNE, Isomap, PCA, and UMAP) for all classes of 16-class-ImageNet, at the Logit Layer of ResNet-50 without adversarial training.
ResNet-50 with Adversarial Training
Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality-reduction techniques (t-SNE, Isomap, PCA, and UMAP) for all classes of 16-class-ImageNet, at the Logit Layer of ResNet-50 with adversarial training.
Bibtex
@article{kotyan2024kdistribution,
title = {k* {{Distribution}}: {{Evaluating}} the {{Latent Space}} of {{Deep Neural Networks}} Using {{Local Neighborhood Analysis}}},
shorttitle = {k* {{Distribution}}},
author = {Kotyan, Shashank and Ueda, Tatsuya and Vargas, Danilo Vasconcellos},
year = {2023},
month = dec,
number = {arXiv:2312.04024},
eprint = {2312.04024},
publisher = {{arXiv}},
doi = {10.48550/arXiv.2312.04024},
}