Computation and Data-Driven Discovery (C3D) Projects
Streaming Visualization for Performance Anomaly Detection
Machine learning (ML), especially deep learning (DL), has inspired a technical revolution with profound impacts on almost every application and research domain. However, wider adoption of these techniques is hindered by the lack of model interpretability and transparency. Explainable artificial intelligence (xAI) has therefore become an important research topic, and a variety of methods and tools have been proposed to “open the black box” and make AI models more explainable, trustworthy, and controllable. These efforts range from complete interactive visualization systems to individual explanation building blocks. In this project, we are pursuing a number of innovative approaches that address both perspectives.
VisMAL: This work presents a visualization system designed to help domain scientists visually understand their deep learning models (convolutional neural networks, or CNNs) for extracting multiple attributes from x-ray scattering images. The system focuses on studying model behaviors related to multiple structural attributes. It allows users to explore the images in the feature space and to compare the classification outputs for the different attributes against the actual attributes labeled by domain scientists. Rich interactions let users flexibly select individual images or clusters of images and compare them visually in detail.
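The feature-space view described above can be sketched as a 2-D projection of CNN feature vectors that a scatter plot can then display. This is a minimal illustration using PCA via SVD, not VisMAL's actual projection method; the feature matrix here is a hypothetical stand-in for features extracted from x-ray scattering images.

```python
import numpy as np

def project_features_2d(features):
    """Project CNN feature vectors (n_samples x n_dims) onto their first
    two principal components, a stand-in for a feature-space scatter view."""
    # Center the features, then take principal directions from the SVD.
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    # Rows of vt are ordered by decreasing singular value.
    return centered @ vt[:2].T

# Hypothetical feature vectors for 10 images with 5 feature dimensions.
rng = np.random.default_rng(1)
feats = rng.normal(size=(10, 5))
coords = project_features_2d(feats)  # one (x, y) point per image
```

Each row of `coords` would become one point in the scatter plot, where users could then lasso-select clusters for detailed comparison.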
VisLRP: Layer-wise Relevance Propagation (LRP) methods are widely used to explain deep neural networks (DNNs), especially for interpreting the predictions of CNNs in computer vision. The LRP variations use a set of relevance backpropagation rules with various parameters, and composite LRPs apply different rules to different segments of a CNN's layers. These options pose a significant challenge for users trying to design, explore, and find suitable LRP models. In this work, we are developing a visual model designer, named VisLRP, that helps LRP designers and students perform these tasks efficiently.
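To make the rule-based backpropagation concrete, here is a minimal numpy sketch of one common rule, LRP-epsilon, on a tiny two-layer ReLU network with hypothetical random weights. It illustrates the kind of per-layer rule a VisLRP user would choose and parameterize (here, `eps`); it is not VisLRP itself, and a composite LRP would apply different rules at different layers.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical weights for a tiny 2-layer network: 4 inputs -> 6 hidden -> 3 outputs.
W1 = rng.normal(size=(4, 6)); b1 = np.zeros(6)
W2 = rng.normal(size=(6, 3)); b2 = np.zeros(3)

def forward(x):
    a1 = np.maximum(0.0, x @ W1 + b1)   # hidden ReLU activations
    z2 = a1 @ W2 + b2                   # output logits
    return a1, z2

def stabilize(z, eps):
    # Push denominators away from zero in the direction of their sign.
    return z + eps * np.where(z >= 0, 1.0, -1.0)

def lrp_epsilon(x, target, eps=1e-6):
    """Backpropagate relevance for one output with the LRP-epsilon rule."""
    a1, z2 = forward(x)
    # Relevance starts at the chosen output logit only.
    R2 = np.zeros_like(z2); R2[target] = z2[target]
    # Output layer -> hidden: R_j = a_j * sum_k w_jk * R_k / stabilized(z_k)
    s = R2 / stabilize(a1 @ W2 + b2, eps)
    R1 = a1 * (W2 @ s)
    # Hidden layer -> input, same rule one layer down.
    s = R1 / stabilize(x @ W1 + b1, eps)
    return x * (W1 @ s)

x = np.array([1.0, -0.5, 2.0, 0.3])
relevance = lrp_epsilon(x, target=0)  # per-input relevance scores
```

A useful sanity check on any LRP rule is (approximate) conservation: the input relevances should sum to the explained output logit, up to the slack introduced by `eps`.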
Polarized LRP for GAN: This preliminary research focuses on understanding the behavior of one of a GAN's major components, the Discriminator, which plays a vital role but is often overlooked. Specifically, we propose an enhanced LRP algorithm, called Polarized-LRP, that generates a heatmap-based visualization highlighting the areas of the input image that contribute to the network's decision. It consists of two parts: a positive contribution heatmap for images classified as “ground truth” and a negative contribution heatmap for those classified as “generated.” As a use case, we chose the field of astronomy, specifically the deblending of two overlapping galaxy images via a branched generative adversarial network (GAN) model. Using the Galaxy Zoo dataset, we demonstrate that our method clearly reveals the attention areas the Discriminator uses to differentiate generated galaxy images from ground truth images, and that it outperforms the original LRP method. To connect the Discriminator's impact to the Generator, we also visualize how the Generator's attention shifts across the training process.
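The two-part split can be sketched as follows: given a signed relevance map for a Discriminator decision, positive values form the “ground truth” evidence heatmap and negative values form the “generated” evidence heatmap. This is a minimal illustration of the polarization idea on a toy array, not the paper's exact Polarized-LRP algorithm.

```python
import numpy as np

def polarized_heatmaps(relevance):
    """Split a signed relevance map into a positive-contribution heatmap
    ('ground truth' evidence) and a negative-contribution heatmap
    ('generated' evidence), each independently scaled to [0, 1]."""
    pos = np.clip(relevance, 0.0, None)    # keep positive relevance only
    neg = np.clip(-relevance, 0.0, None)   # flip sign of negative relevance

    def normalize(h):
        m = h.max()
        return h / m if m > 0 else h

    return normalize(pos), normalize(neg)

# Toy 2x2 signed relevance map standing in for a Discriminator explanation.
relevance = np.array([[1.0, -2.0],
                      [0.0,  4.0]])
pos_map, neg_map = polarized_heatmaps(relevance)
```

Rendering `pos_map` and `neg_map` with separate color scales (e.g., red vs. blue overlays) gives the two complementary views described above.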
Feature Importance Methods for a Climate Surrogate Model: This study uses a class of post hoc local explanation methods, namely feature importance methods, to “understand” a deep learning emulator of climate. Specifically, we consider a multiple-input/single-output emulator that uses a DenseNet encoder-decoder architecture and is trained to predict interannual variations of sea surface temperature (SST) at one-, six-, and nine-month lead times using the preceding 36 months of (appropriately filtered) SST data. First, feature importance methods are employed on individual predictions to spatiotemporally identify the input features that are important to the model's prediction at chosen geographical regions and lead times. Second, the behavior of feature importance is examined in a generalized way by aggregating the importance heatmaps over the training samples. The analysis showed that: 1) the climate emulator's prediction at any geographical location depends dominantly on a small neighborhood around it; 2) the longer the prediction lead time, the further back in time the “importance” extends; and 3) to leading order, the temporal decay of “importance” is independent of geographical location. An ablation experiment was conducted to verify these findings. From the perspective of climate dynamics, these results suggest a dominant role for local processes and a negligible one for remote teleconnections at the spatial and temporal scales considered. From the perspective of network architecture, the spatiotemporal relations between the inputs and outputs suggest potential refinements of the model.
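The two-step procedure above can be sketched with one simple member of the feature-importance family, occlusion: perturb each input feature toward a baseline, record the change in the single output, then average the per-sample importance maps. This is a minimal illustration under stated assumptions; the `model` here is a hypothetical scalar-output stand-in for the SST emulator, and the study is not limited to occlusion.

```python
import numpy as np

def occlusion_importance(model, x, baseline=0.0):
    """Step 1: importance map for one prediction. Each input feature is
    replaced by a baseline value and the absolute change in the model's
    single output is recorded as that feature's importance."""
    y0 = model(x)
    importance = np.empty_like(x)
    for i in range(x.size):
        perturbed = x.copy()
        perturbed.flat[i] = baseline
        importance.flat[i] = abs(y0 - model(perturbed))
    return importance

def aggregate_importance(model, samples):
    """Step 2: generalized behavior via the mean importance heatmap
    over a collection of samples."""
    return np.mean([occlusion_importance(model, x) for x in samples], axis=0)

# Hypothetical linear "emulator" and two input samples, for illustration.
model = lambda x: 3.0 * x[0] + x[1]
samples = [np.array([1.0, 2.0]), np.array([2.0, 1.0])]
mean_map = aggregate_importance(model, samples)
```

For the real emulator, `x` would be the 36-month filtered SST history and the aggregated map would be inspected per region and lead time, as in the findings listed above.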