k* Distribution: Evaluating the Latent Space of Deep Neural Networks using Local Neighborhood Analysis

Shashank Kotyan * Ueda Tatsuya * Danilo Vasconcellos Vargas

Kyushu University
* Equal Contribution



Figure: Overview of three distinct basic patterns of k* Distribution.

Key Contributions:


Abstract

Examinations of the latent spaces learned by neural networks typically employ dimensionality reduction techniques such as t-SNE or UMAP. While these methods effectively capture the overall sample distribution across the entire learned latent space, they tend to distort the structure of sample distributions within specific classes, each of which occupies a subset of that space. This distortion makes it difficult to distinguish the classes that a neural network can identify. In response to this challenge, we introduce the k* Distribution methodology. This approach uses local neighborhood analysis to capture the characteristics and structure of the sample distribution of each individual class within its subset of the learned latent space. The key concept is to make different k* distributions easy to compare, enabling analysis of how various classes are processed by the same neural network. This offers a deeper understanding that complements existing visualizations. Our study reveals three distinct distributions of samples within the learned latent space subset: a) Fractured, b) Overlapped, and c) Clustered. We note and demonstrate that the distribution of samples within the network's learned latent space varies significantly depending on the class. Furthermore, we illustrate that our analysis can be applied to explore the latent space of diverse neural network architectures, various layers within neural networks, transformations applied to input samples, and the distribution of training and testing data for neural networks. We anticipate that our approach will facilitate more targeted investigations into neural networks by collectively examining the distribution of different samples within the learned latent space.




Method

Figure: Overview of the framework to create the k* Distribution. We use the learned features of a neural network to compute the k* value of each evaluated sample, and then compute the k* distribution for a particular class.

Figure: Illustration of calculating the k* value of a sample and, correspondingly, the k* distribution of a class.
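The captions above describe the pipeline only at a high level, so the following is a minimal sketch rather than the authors' implementation. It assumes that the k* value of a sample is the 1-based rank of its nearest neighbor belonging to a different class, that neighbors are ranked by Euclidean distance in feature space, and that the k* distribution of a class is simply the collection of k* values of its samples. The function names `k_star_values` and `k_star_distribution` are hypothetical.

```python
import numpy as np


def k_star_values(features, labels):
    """Return one k* value per sample.

    Assumed definition (not confirmed by the page): k* is the 1-based rank
    of the nearest neighbor whose class differs from the sample's class.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n = features.shape[0]

    # Pairwise squared Euclidean distances (the metric is an assumption).
    sq_norms = (features ** 2).sum(axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbor

    # Neighbor indices for every sample, nearest first.
    order = np.argsort(dists, axis=1, kind="stable")

    # True wherever the ranked neighbor has a different class label.
    different = labels[order] != labels[:, None]

    # argmax gives the position of the first True in each row.
    first_different = different.argmax(axis=1) + 1

    # If no neighbor of another class exists, cap k* at the sample count.
    has_other_class = different.any(axis=1)
    return np.where(has_other_class, first_different, n)


def k_star_distribution(k_star, labels, target_class):
    """Sorted k* values of all samples that belong to target_class."""
    k_star = np.asarray(k_star)
    labels = np.asarray(labels)
    return np.sort(k_star[labels == target_class])
```

For example, with one-dimensional features `[[0.0], [1.0], [2.0], [10.0], [11.0]]` and labels `[0, 0, 0, 1, 1]`, every class-0 sample has two same-class neighbors before its first class-1 neighbor, so its k* value is 3 under this definition, while each class-1 sample has k* equal to 2. A histogram of the values returned by `k_star_distribution` for one class can then be compared across classes.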



Visualizations of Latent Space



ResNet-50

Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality reduction techniques such as t-SNE, Isomap, PCA, and UMAP, for all classes of 16-class-ImageNet, for the Logit Layer of the ResNet-50 architecture.


ResNeXt-101

Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality reduction techniques such as t-SNE, Isomap, PCA, and UMAP, for all classes of 16-class-ImageNet, for the Logit Layer of the ResNeXt-101 architecture.


EfficientNet-B0

Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality reduction techniques such as t-SNE, Isomap, PCA, and UMAP, for all classes of 16-class-ImageNet, for the Logit Layer of the EfficientNet-B0 architecture.


ViT-Base

Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality reduction techniques such as t-SNE, Isomap, PCA, and UMAP, for all classes of 16-class-ImageNet, for the Logit Layer of the ViT-Base architecture.


Logit Layer of ResNet-50

Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality reduction techniques such as t-SNE, Isomap, PCA, and UMAP, for all classes of 16-class-ImageNet, for the Logit Layer of the ResNet-50 architecture.


Average Pooling Layer of ResNet-50

Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality reduction techniques such as t-SNE, Isomap, PCA, and UMAP, for all classes of 16-class-ImageNet, for the Average Pooling Layer of the ResNet-50 architecture.


ResNet-50 Trained with ImageNet-1k

Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality reduction techniques such as t-SNE, Isomap, PCA, and UMAP, for all classes of 16-class-ImageNet, for the Logit Layer of the ResNet-50 architecture trained with ImageNet-1k.


ResNet-50 Trained with Stylized ImageNet-1k

Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality reduction techniques such as t-SNE, Isomap, PCA, and UMAP, for all classes of 16-class-ImageNet, for the Logit Layer of the ResNet-50 architecture trained with Stylized ImageNet-1k.


ResNet-50 Trained with ImageNet-1k + Stylized ImageNet-1k

Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality reduction techniques such as t-SNE, Isomap, PCA, and UMAP, for all classes of 16-class-ImageNet, for the Logit Layer of the ResNet-50 architecture trained with ImageNet-1k + Stylized ImageNet-1k.


ResNet-50 without Adversarial Training

Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality reduction techniques such as t-SNE, Isomap, PCA, and UMAP, for all classes of 16-class-ImageNet, for the Logit Layer of the ResNet-50 architecture without adversarial training.


ResNet-50 with Adversarial Training

Figure: Visualization of the distribution of samples in latent space using the k* distribution and dimensionality reduction techniques such as t-SNE, Isomap, PCA, and UMAP, for all classes of 16-class-ImageNet, for the Logit Layer of the ResNet-50 architecture with adversarial training.




Bibtex

@article{kotyan2024kdistribution,
    title = {k* {{Distribution}}: {{Evaluating}} the {{Latent Space}} of {{Deep Neural Networks}} Using {{Local Neighborhood Analysis}}},
    shorttitle = {k* {{Distribution}}},
    author = {Kotyan, Shashank and Tatsuya, Ueda and Vargas, Danilo Vasconcellos},
    year = {2023},
    month = dec,
    number = {arXiv:2312.04024},
    eprint = {2312.04024},
    publisher = {{arXiv}},
    doi = {10.48550/arXiv.2312.04024},
}