AI RESEARCH ANALYSIS
DXAI: explaining classification by image decomposition
We propose a new way to explain and visualize neural network classification through decomposition-based explainable AI (DXAI). Instead of providing an explanation heatmap, our method yields a decomposition of the image into class-agnostic and class-distinct parts, with respect to the data and chosen classifier. Following the fundamental signal-processing paradigm of analysis and synthesis, the original image is the sum of the decomposed parts. We thus obtain a radically different way of explaining classification. The class-agnostic part is ideally composed of all image features which do not carry class information, while the class-distinct part is its complement. This new visualization can be more helpful and informative in certain scenarios, especially when the attributes are dense, global, and additive in nature, for instance, when colors or textures are essential for class distinction. Code is available at https://github.com/dxai2024/dxai.
Executive Impact
Our decomposition-based explainable AI (DXAI) offers unparalleled insights into neural network classification, moving beyond traditional heatmaps to deliver dense, high-resolution, and semantic explanations. Key metrics highlight its superior performance and adaptability across diverse datasets.
Deep Analysis & Enterprise Applications
Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.
Understanding DXAI's Core Concept
Understanding the classification reasoning of neural networks is of paramount importance. It can lead to a better understanding of the classification process, help in debugging and validation, and serve as an additional informative output for the user at inference. Hence, research on explainable artificial intelligence (XAI) is extensive [2, 11, 13, 17]. The most common way to show the explanation of a network in image classification is by producing an explanation heatmap. This map is a per-pixel indication of the relevance of that pixel to the final classification decision of the network (see Figure 2). In this visualization, the larger the value, the more relevant the pixel is to the classification. These maps may be at a lower resolution, in which case visualization is done by upsampling, or by using superpixels. In addition, some methods also give negative values, indicating pixels which reduce the confidence of the final classification. The heatmaps themselves do not resemble actual images, so to understand the role of the pixels in a heatmap, common practices are to show the input image and the heatmap side by side, to overlay the heatmap on the image, or (less commonly) to show an image where the brightness of the pixels is weighted by the (normalized) heatmap, as shown in Figure 2; a minimal sketch of these display conventions follows this overview.

This visual explanation is mostly adequate when the explanation is spatially sparse, that is, when there are just a few small regions in the image which contribute most to the classification. However, there are many classification problems in which the explanations are dense in the image domain. This can happen in several scenarios, for instance:

1. The object to be classified spans a large portion of the image domain and contains many diverse features, all contributing to the final classification.
2. A main feature contributing to the classification is a color change, which appears throughout the image.
3. The class distinction is based on some global disturbance or statistical change, which spans the entire image domain.

In such scenarios, a heatmap is much less informative. Depending on the algorithm, it is either too focused on a small portion of highly dominant features, or it shows large uniform areas spanning most of the image (see examples in Figure 1). In both cases, we lack a clear constructive explanation for the network's decision. Furthermore, explaining through heatmaps implicitly assumes that some pixels are crucial for the decision while others are unimportant or only partially important. We argue that there are scenarios where each pixel contains both class-identity information and neutral data, necessitating a different method.
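The snippet below is a minimal sketch, not from the paper's code, of the three heatmap display conventions mentioned above. It assumes a hypothetical `image` (H x W x 3, values in [0, 1]) and non-negative `heatmap` (H x W); normalizing by the maximum is one common choice.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_heatmap_conventions(image: np.ndarray, heatmap: np.ndarray) -> None:
    """image: H x W x 3 in [0, 1]; heatmap: H x W, non-negative relevance values."""
    w = heatmap / (heatmap.max() + 1e-8)                  # normalize relevance to [0, 1]
    fig, axes = plt.subplots(1, 4, figsize=(16, 4))

    axes[0].imshow(image)                                 # 1) input and heatmap side by side
    axes[0].set_title("input")
    axes[1].imshow(w, cmap="jet")
    axes[1].set_title("heatmap")

    axes[2].imshow(image)                                 # 2) heatmap overlaid on the image
    axes[2].imshow(w, cmap="jet", alpha=0.5)
    axes[2].set_title("overlay")

    axes[3].imshow(image * w[..., None])                  # 3) brightness weighted by relevance
    axes[3].set_title("weighted")

    for ax in axes:
        ax.axis("off")
    plt.show()
```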
The DXAI Image Decomposition Process
In this work, we propose to express XAI by the following image decomposition: x = ψAgnostic + ψDistinct, where ψAgnostic is the class-agnostic part of the image, which ideally contains no information about the class, and ψDistinct is the class-distinct part, which holds the discriminative information allowing the classifier to distinguish the class from the others. We utilize a novel generative AI architecture, inspired by style-transfer principles (detailed in appendix B), to accomplish this; a minimal inference-time sketch is given after the list below. We show that this way of explaining classification brings new computational and visualization tools, which, in some cases, are much more natural and informative. Our main contributions are as follows:

1. We present a detailed computational framework to estimate Eq. (1) for a given classifier, training set and classification task. The decomposition is of high resolution, allowing fine and delicate details to be portrayed well.
2. We show, for the first time, class-agnostic images based on decomposition. This provides new information and insights on the classification problem.
3. The method is fast, since results are produced at inference time of generative models (no gradients are computed).
4. We provide extensive examples and experimental data showing the advantages of the method compared to heatmaps. We also discuss its limitations in detail, including its suitability for specific classification tasks.
5. We present an approach to assessing decomposition quality, addressing concerns related to generative model biases and demonstrating consistent convergence through stability experiments.
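As a minimal sketch, under assumptions, the decomposition of Eq. (1) could be assembled at inference time from trained branch generators as follows. Here `branches` is a hypothetical list of networks (already conditioned on the predicted class) whose outputs sum to an approximation of the input, with the first branch carrying the class-distinct part; names and shapes are illustrative only.

```python
import torch

@torch.no_grad()
def decompose(x: torch.Tensor, branches) -> tuple[torch.Tensor, torch.Tensor]:
    """Return (psi_distinct, psi_agnostic) with x approximately equal to their sum."""
    parts = [g(x) for g in branches]              # one output per branch generator
    psi_distinct = parts[0]                       # first branch: class-distinct component
    psi_agnostic = torch.stack(parts[1:]).sum(0)  # remaining branches: class-agnostic component
    return psi_distinct, psi_agnostic
```

In this reading, psi_distinct + psi_agnostic reconstructs the input up to the generators' approximation error, and visualizing the two components separately yields the explanation.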
Advanced Training & Loss Functions
We list below the essential training procedure and losses; a condensed code sketch follows. See Figure 4 for an overall diagram of the general architecture, consisting of several style-transfer generators, our proposed α-blending procedure, a style and quality discriminator, and the classifier.

α-blending: In order for the first branch to contain the class-distinct (CD) information, we propose a novel α-blending mechanism. For each batch, a random vector α of length n − 1 is drawn, where each element is uniformly distributed in the range [0, 1]. Two images are then generated during training as follows:

x^y = ψ_1^y + Σ_{i=2}^{n} [ α_{i−1} ψ_i^y + (1 − α_{i−1}) ψ_i^{~y} ],
x^{~y} = ψ_1^{~y} + Σ_{i=2}^{n} [ α_{i−1} ψ_i^y + (1 − α_{i−1}) ψ_i^{~y} ],

where y is the class of the input image and ~y represents a random alternative class, ~y ≠ y. The proposed method encourages the generators to produce identical components for both classes in the blended sum, ψ_i^y ≈ ψ_i^{~y}, i = 2, ..., n, and thus to isolate the distinction between the classes in ψ_1. In the ideal case, where these components are identical and the distinction is only in ψ_1, we converge to Equation (3). The proposed α-blending method allows stable and effective training. We note that other alternatives, such as norm-based losses, e.g., ||ψ^y − ψ^{~y}||, often yield degenerate solutions with ψ^y ≈ 0.

Classification loss: Since a pre-trained classifier is integrated into our system, there is no need to further train it on authentic images. Instead, we leverage its classification and attempt to explain it. We encourage the generators to produce images that correspond to the classifier's predictions through the following loss function: L_class-fake = CrossEntropy(C(x^y), S_y^target).

Reconstruction losses: Our generated image x^y is only an approximation of x, see Eqs. (3), (5). To obtain a good approximation, x^y ≈ x, we use a fidelity measure based on the L1 and L2 norms, d(u, v) = ||u − v||_L1 + ||u − v||_L2. We would also like the style-transferred class to be similar to the input image. Thus, the reconstruction loss is taken with respect to the generated images of both classes, L_rec = d(x, x^y) + d(x, x^{~y}), where x^y, x^{~y} are given in (5). In addition to this reconstruction loss, one can impose further constraints on the reconstruction to enhance results. Specifically, we observed challenges in reproducing areas with significant differences between classes. To address this, we incorporated an additional reconstruction constraint on pixels with high amplitude in the distinct branch; high amplitude signifies differences between the classes, due to the additive nature of the model. The proposed loss function is L_dis-rec = d(x ⊙ I, x^y ⊙ I), where I = 1 where |ψ_1^y| > mean(|ψ_1^y|) and 0 elsewhere, and ⊙ denotes the element-wise product.

Adversarial loss: L_adv = E_{x,y}[log D_y(x)] + E_{x,y}[log(1 − D_y(x^y))].
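The sketch below condenses the α-blending mechanism and the non-adversarial losses described above, under assumptions noted in the comments. `parts_y` and `parts_ny` are hypothetical lists of the n branch outputs ψ_i for the true class y and the random alternative class ~y, `classifier` is the frozen pre-trained classifier C (assumed to output logits), and `y` is a tensor of class indices; the adversarial term and discriminator are omitted for brevity.

```python
import torch
import torch.nn.functional as F

def alpha_blend(parts_y, parts_ny):
    """Blend branches 2..n of the two classes with a shared random alpha (one draw per batch)."""
    n = len(parts_y)
    alpha = torch.rand(n - 1, device=parts_y[0].device)       # one weight per agnostic branch
    mix = sum(alpha[i - 1] * parts_y[i] + (1 - alpha[i - 1]) * parts_ny[i]
              for i in range(1, n))
    x_hat_y = parts_y[0] + mix        # class-distinct branch of y plus the blended agnostic part
    x_hat_ny = parts_ny[0] + mix      # same blended agnostic part, class-distinct branch of ~y
    return x_hat_y, x_hat_ny

def d(u, v):
    """Fidelity measure combining L1 and L2 terms."""
    return F.l1_loss(u, v) + F.mse_loss(u, v)

def dxai_losses(x, y, parts_y, parts_ny, classifier):
    x_hat_y, x_hat_ny = alpha_blend(parts_y, parts_ny)

    # Classification loss: the generated image of class y should be classified as y.
    loss_class_fake = F.cross_entropy(classifier(x_hat_y), y)

    # Reconstruction loss with respect to the generated images of both classes.
    loss_rec = d(x, x_hat_y) + d(x, x_hat_ny)

    # Extra reconstruction constraint on pixels with high class-distinct amplitude.
    amp = parts_y[0].abs()
    mask = (amp > amp.mean()).float()
    loss_dis_rec = d(x * mask, x_hat_y * mask)

    return loss_class_fake, loss_rec, loss_dis_rec
```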
DXAI's Explanation Generation Workflow
Qualitative and Quantitative Advantages
Qualitative results: In Figures 7, 8, 9, 10, and 11, we show examples for various datasets (see also Figure 12). In general, our method is especially effective in cases where the differences between the classes have an additive nature. This occurs when the differences involve textures (head and facial hair in Figure 7), colors (Figure 9), or the presence or absence of details, such as cars (Figure 10), red lips, or heavy eyebrows (Figure 7).

Quantitative results: We present various experiments examining the validity of our algorithm and its applicability for inferring useful classification explanations on diverse data sets; a short code sketch of the evaluation tools follows below. In Figure 14, we show a linear progression from the original image to the agnostic part by generating the images x_β = x − β · ψDistinct, where β ∈ {0, 0.25, 0.5, 0.75, 1}. For β = 0 we have the original image, and for β = 1 the agnostic part. The average probability vector p(x_β) is depicted, averaged for each class, and a single image per class illustrates the progression. We see that class distinction diminishes as β increases. Note that we do not obtain precise agnostic images for β = 1, only an approximation.

In Table 1, we show the results of a quantitative experiment comparing our method to other possible XAI decompositions. In this experiment, we follow Equation (13) for β ranging from 0 to 1 in increments of 0.1. As β grows, accuracy should drop. We compute the area under the curve (AUC) of accuracy vs. β, where lower AUC is better. Since, as far as we know, we are the first to propose DXAI, we obtain comparison decompositions from established XAI algorithms, implemented using the Captum library [18]. One can produce a decomposition from a heatmap H ≥ 0 by normalization, yielding a weight for each pixel, w = H / max(H) ∈ [0, 1], and defining, for an image x, ψDistinct := w · x and ψAgnostic := (1 − w) · x. This gives a decomposition in the form of Equation (1), such that x = ψDistinct + ψAgnostic; see examples in Figure 13. Table 1 demonstrates that for all data sets our decomposition outperforms all other methods by a considerable margin. This indicates that our proposed method has evidently different qualitative properties, such that trivial manipulations of the heatmap cannot generate high-quality class-distinct components.

Our algorithm does not provide an importance ranking of each pixel with respect to its contribution to the classification; this is one of its limitations. Thus, we cannot use the standard way of evaluating XAI accuracy by gradually removing pixels in order of importance, as done, e.g., in [27]. In some problems, however, a simple ranking can be inferred by our algorithm. In a binary classification problem, when one class is predominantly decided based on the existence of certain lighter pixels, we can use the amplitude of pixels in ψDistinct as a reasonable importance ranking, and standard XAI evaluation can then be performed. The BraTS dataset [24] is such a case. It contains MRI scans of the human brain used for brain tumor segmentation research, including scans from patients with brain tumors along with expert annotations of tumor regions. In many cases, bright regions indicate evidence of tumors. We divided the dataset into two classes: images containing tumors and images that do not. Some example results are shown in Figure 11. Table 2 shows the standard AUC evaluation on this set (AUC of accuracy vs. number of pixels removed, ordered by importance). We see that our algorithm behaves favorably both qualitatively and quantitatively.
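The sketch below illustrates, under stated assumptions, the two quantitative tools used above: converting a non-negative heatmap H into a baseline decomposition via per-pixel weights w = H / max(H), and the accuracy-vs-β curve for x_β = x − β · ψDistinct summarized by its AUC (lower is better). `classifier`, `images`, `labels`, and `psi_distinct` are assumed, illustrative inputs.

```python
import torch

def heatmap_to_decomposition(x, heatmap):
    """Baseline decomposition from a non-negative heatmap: psi_distinct = w*x, psi_agnostic = (1-w)*x."""
    w = heatmap / (heatmap.max() + 1e-8)
    return w * x, (1.0 - w) * x

@torch.no_grad()
def accuracy_vs_beta_auc(classifier, images, labels, psi_distinct, betas=None):
    """AUC of classification accuracy as a growing fraction of psi_distinct is removed."""
    betas = betas if betas is not None else torch.linspace(0.0, 1.0, 11)
    accs = []
    for beta in betas:
        x_beta = images - beta * psi_distinct          # remove a fraction of the distinct part
        preds = classifier(x_beta).argmax(dim=1)
        accs.append((preds == labels).float().mean())
    accs = torch.stack(accs)
    return torch.trapz(accs, betas)                    # area under the accuracy-vs-beta curve
```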
An XAI algorithm should naturally depend on the specific classifier C at hand. Different classifiers may yield different class-distinct and class-agnostic parts. We verify that this is indeed the case in Table 3. We compare our results when C is ResNet18 and when C is a simpler classifier with fewer layers, referred to as "Simple" (yielding less accurate results; details in Table 8). We compare AUC as done for Table 1; here, however, the accuracy graph is computed twice, once with each of the above classifiers. We show that accuracy drops more sharply (i.e., the AUC is lower) when the C used for obtaining ψDistinct in Equation (13) matches the classifier used to compute the AUC. The experiment demonstrates that our map effectively captures the characteristics of the target classifier.
Impact of Design Choices on DXAI Performance
Direct optimization: In the ablation study, we tried a naive method to obtain the CD and CA components. As briefly mentioned above, one can seemingly obtain them by direct optimization: estimate the agnostic component and subtract it from the original image (a code sketch of this procedure appears after this subsection). The optimization proceeds as follows. Initialize ψAgnostic = x, where x is an image from our data. Then obtain the predicted probability of each class by feeding the image to the classifier, p = C(ψAgnostic). Next, calculate the KL divergence between the obtained distribution and the uniform distribution, D_KL = Σ_i (1/c)(log(1/c) − log p_i). Finally, iterate until convergence, ψAgnostic ← ψAgnostic − dt · ∇_ψAgnostic D_KL, and take the distinct component as ψDistinct = x − ψAgnostic. We found that the process converges quickly to images for which the output is approximately a uniform distribution (with D_KL < 10^-6). However, the results are not very informative, as shown in Figure 15. Note that this is a highly nonconvex problem where many local minima are possible. In the optimization case, we get a result with very little semantic meaning; it is closer, in some sense, to an adversarial attack. This is in contrast to our method, where the results are based on the entire training set. The generative process learns how to produce features that, on average, confuse the classifier and are therefore semantic in nature. We believe this distinction may lead to fruitful future research, also in the context of robust network analysis and defense.

Multiple branches: We examined the impact of the number of branches. While we decompose the classified image into several images, we are primarily interested in only two: the distinct and the agnostic parts. One might ask why not use only two branches, since two branches can be trained to achieve a similar solution. We show that when several branches are used for the agnostic part, results are better both in terms of reconstruction and of the generators' ability to produce images that explain the classifier. For instance, as demonstrated in Figure 16, PSNR decreases when using only two branches. Additionally, the loss L_class-fake, representing the generators' ability to produce meaningful images of a specific class according to the given classifier, remains higher throughout training with two branches. In other words, the classifier interprets the images less reliably as the desired class. The additional generators provide better generation capacity and can be trained in an easier and more stable manner.

L_dis-rec contribution: We evaluated the impact of the loss L_dis-rec, described in Equation (9), on the reconstruction quality. As explained earlier, we employ it because we observed that reconstruction, especially in areas with differences between the classes, was challenging. We conducted experiments both with and without it, and show that it indeed contributes to the quality of the reconstruction in terms of PSNR, as illustrated in Figure 17.

Stability experiments: As described above, we utilize image decomposition to address issues in existing XAI methods. To ensure adaptability across classifiers and data types, we employ generative models for the image decompositions. However, training with gradient descent does not guarantee convergence, raising concerns about solution consistency across different model initializations (see Figure 18).
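The following is a minimal sketch of the naive direct-optimization baseline described above: starting from ψAgnostic = x, descend the KL divergence between the uniform distribution and the classifier's output. The step size, iteration count, and the assumption that the frozen `classifier` outputs logits are illustrative choices, not taken from the paper.

```python
import torch

def direct_optimization(x, classifier, steps=200, dt=0.1):
    """Naive baseline: descend KL(uniform || C(psi_agnostic)) starting from psi_agnostic = x."""
    psi_agnostic = x.clone().requires_grad_(True)
    num_classes = classifier(x).shape[-1]
    uniform = torch.full((num_classes,), 1.0 / num_classes, device=x.device)
    for _ in range(steps):
        log_p = classifier(psi_agnostic).log_softmax(dim=-1)
        # D_KL = sum_i (1/c) * (log(1/c) - log p_i), averaged over the batch
        d_kl = (uniform * (uniform.log() - log_p)).sum(dim=-1).mean()
        grad, = torch.autograd.grad(d_kl, psi_agnostic)
        with torch.no_grad():
            psi_agnostic -= dt * grad          # plain gradient-descent step
    psi_agnostic = psi_agnostic.detach()
    return psi_agnostic, x - psi_agnostic      # (class-agnostic, class-distinct)
```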
To address these concerns, we conducted experiments (Table 4) to examine whether the explanations for classifications (specifically of ResNet18) become more consistent throughout training, as indicated by the standard deviation of the explanations across different initializations and datasets. We compared this standard deviation with that of two other algorithms (Grad-CAM [28] and Internal Influence [20]), which can produce different solutions depending on hyperparameter choices (e.g., the activation layer of the classifier). We used three different initializations and three sequential layers to calculate the standard deviation. Since pixel values vary between methods, we normalized the standard deviation of each image by the dynamic range of the solutions, allowing for a fair evaluation; a small sketch of this normalization is given below. Our experiments showed that the standard deviation decreases with training, indicating convergence toward a consistent solution. Additionally, our standard deviation is relatively small compared to the alternatives.
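A small sketch of the normalized-deviation score, assuming `explanations` stacks the explanation maps of the same image obtained from different initializations (or, for the baselines, different activation layers); the exact aggregation is an illustrative assumption.

```python
import torch

def normalized_explanation_std(explanations: torch.Tensor) -> torch.Tensor:
    """explanations: (K, H, W) maps of the same image from K runs; returns a scalar score."""
    std_map = explanations.std(dim=0)                         # per-pixel std over the K runs
    dynamic_range = explanations.max() - explanations.min()   # normalize by the solutions' range
    return (std_map / (dynamic_range + 1e-8)).mean()
```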
Challenges and Future Directions for DXAI
We propose an alternative way to analyze and visualize the reasons for classification by neural networks. It is based on decomposing the image into a part which does not contribute to the classification and one which holds the class-related cues. This approach may not be ideal for all applications and has several limitations and drawbacks compared to standard XAI methods:

1. The method requires training for a specific training set and classifier. Training is slow and inference is fast; it is therefore more suitable for cases where a large number of images need to be classified with a trained classifier.
2. There is no natural ranking of the significance of pixels in the image. The amplitude of pixels in the class-distinct part can serve as a good approximation.
3. Although we have shown examples on several data sets of diverse nature, the method is not always suitable for presenting the explanation of the network classification in an understandable way, for example, when the difference between the classes is not easily decomposed additively. We present such a case and analyze it in detail in appendix C.
4. Our implementation uses GANs. The proposed concept does not rely on a GAN architecture, and improvements may be achieved with diffusion-type generative models, such as [16, 32, 36] and [7].
Your DXAI Implementation Roadmap
A structured approach to integrating DXAI into your existing systems, ensuring a seamless transition and maximizing return on investment.
Problem Formulation & Initial Architecture
Establish the core DXAI problem definition: decomposing images into class-agnostic and class-distinct parts. Develop a novel generative AI architecture inspired by style transfer principles, integrating multiple generators and a multi-head discriminator.
Training Procedure & Loss Function Development
Design the training process, including the α-blending mechanism for isolating class-distinct information. Define essential loss functions: classification loss for aligning generator outputs with classifier predictions, reconstruction losses (L1/L2 and L_dis-rec) for fidelity, and adversarial loss for image realism.
Inference Stage Implementation & Evaluation
Implement the inference stage where the trained DXAI model generates class-distinct and class-agnostic maps. Conduct extensive qualitative and quantitative experiments across diverse datasets (AFHQ, CelebA, DOTA, BraTS, Peppers, Tomatoes, Apples) to validate performance against existing XAI methods.
Ablation Studies & Stability Analysis
Perform ablation studies to understand the impact of design choices (e.g., number of branches, L_dis-rec contribution) on reconstruction quality and classification explanation. Conduct stability experiments to assess consistency of solutions across different model initializations, demonstrating robustness.
Documentation & Future Work
Document the computational framework, architectural details, and experimental findings. Discuss limitations, particularly for non-additive classification cases, and identify future research directions including the potential for diffusion-type generative models.