Sound Reasoning: Testing How Well AI Explains Anomaly Detection

Author: Denis Avetisyan


A new framework rigorously evaluates whether explanations generated by Explainable AI methods truly reflect how machine listening models identify unusual sounds.

A novel framework facilitates rigorous evaluation of explainable artificial intelligence (XAI) methods specifically within the challenging domain of machine audio anomaly detection.

Researchers propose a quantitative approach using frequency-band perturbation to assess the faithfulness of XAI techniques for machine anomaly detection.

Despite the increasing use of Explainable AI (XAI) to interpret machine learning models for anomalous sound detection, a robust method for verifying the faithfulness of these explanations has remained elusive. This work introduces ‘A Framework for Evaluating Faithfulness in Explainable AI for Machine Anomalous Sound Detection Using Frequency-Band Perturbation’, a quantitative approach that links XAI attributions to model behavior through systematic frequency removal. Our findings reveal substantial variation in the reliability of popular XAI techniques – notably, Occlusion demonstrates stronger alignment with true model sensitivity compared to gradient-based methods. Does this suggest a need to re-evaluate the trustworthiness of current XAI practices in spectrogram-based audio analysis and prioritize perturbation-based techniques for faithful explanations?


Discerning Signal from Noise: The Foundation of Anomaly Detection

The core of anomaly detection lies in discerning unusual patterns indicative of machine malfunction or atypical behavior, a process frequently hampered by a scarcity of labeled data. Unlike supervised learning, where algorithms learn from explicitly categorized examples, anomaly detection often operates with predominantly normal operational data, requiring the system to infer what constitutes ‘deviation’. This presents a significant challenge, as defining the boundaries of normal operation can be complex and subtle; a slight variance might be acceptable within typical use, while another signals critical failure. Consequently, algorithms must be adept at identifying these deviations without prior knowledge of anomalous instances, relying instead on understanding the inherent structure and distribution of normal data to flag anything that doesn’t conform – a task demanding both statistical rigor and a nuanced understanding of the underlying machine processes.

Conventional anomaly detection methods often falter when applied to the intricate soundscapes of functioning machinery. Real-world machine sounds are rarely pristine; they are frequently a complex mixture of normal operational noises, environmental interference, and the subtle precursors to developing faults. These subtle anomalies – a slight increase in bearing friction, a minor imbalance in a rotating component – can be masked by the sheer volume of regular operational sounds, or easily mistaken for typical variations. Traditional algorithms, frequently relying on pre-defined thresholds or simplified models, struggle to discern these nuanced deviations, leading to high rates of false positives or, more critically, missed detections of impending failures. The inherent complexity and variability of these acoustic signals demand more sophisticated approaches capable of capturing the subtle features indicative of genuine anomalies, rather than being misled by the inherent noise and richness of the machine’s sonic fingerprint.

The efficacy of anomaly detection hinges critically on the quality of the features used to represent machine data; subtle deviations indicative of failure are easily obscured by irrelevant variations if the underlying representation is poor. Consequently, researchers are increasingly turning to self-supervised learning as a means of automatically discovering robust and informative features from unlabeled data. These techniques allow models to learn meaningful representations by solving pretext tasks – such as predicting masked portions of the input signal or its future evolution – without requiring manual labeling. By learning to understand the inherent structure of normal machine operation, self-supervised learning provides a powerful foundation for identifying anomalies as deviations from this learned normality, offering a promising pathway towards more reliable and adaptable anomaly detection systems in complex industrial environments.

Area under the curve (AUC) analysis reveals performance variations across machines when utilizing different frequency bands on the evaluation subset.

Unveiling Patterns in Time and Frequency: A Self-Supervised Approach

The spectrogram is a visual representation of the frequencies present in a signal as it varies with time. It’s generated by performing a Short-Time Fourier Transform (STFT) on the audio signal, effectively decomposing it into its constituent frequencies at different points in time. The resulting image displays time on the horizontal axis, frequency on the vertical axis, and signal amplitude or intensity represented by color or grayscale. For machine sound analysis, this time-frequency representation is crucial because many machine faults manifest as changes in the frequency content or temporal patterns of emitted sounds; for example, a bearing failure might introduce specific harmonic frequencies or intermittent noise bursts detectable in the spectrogram. Analyzing these patterns allows for the identification of anomalies and facilitates automated fault diagnosis.
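As a concrete illustration of the representation described above, the short Python sketch below computes a magnitude spectrogram from a synthetic signal using SciPy's STFT. The sample rate, window length, and hop size are illustrative choices, not the settings used in the paper.

```python
# Minimal sketch: magnitude spectrogram via the Short-Time Fourier Transform.
# All parameters here are illustrative assumptions.
import numpy as np
from scipy.signal import stft

fs = 16000                                  # assumed sample rate in Hz
t = np.arange(0, 1.0, 1.0 / fs)
signal = np.sin(2 * np.pi * 1000 * t)       # stand-in for a machine recording

# 64 ms Hann windows with 50% overlap
freqs, times, Z = stft(signal, fs=fs, nperseg=1024, noverlap=512)
spectrogram = np.abs(Z)                     # frequency x time magnitude matrix

print(spectrogram.shape)                    # (frequency bins, time frames)
```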

Self-supervised learning addresses the limitations of labeled datasets in anomaly detection by enabling models to learn representations from unlabeled data. Techniques such as Feature Exchange (FeatEx) operate by masking portions of the input signal and training the model to predict the missing information based on the remaining context. This process forces the model to develop a robust understanding of the underlying data distribution and learn discriminative embeddings – vector representations that capture essential features – without requiring manually annotated anomalous examples. The learned embeddings can then be used for downstream tasks like anomaly classification or localization, effectively leveraging the abundance of unlabeled data to improve performance and reduce reliance on scarce labeled data.
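To make the masked-prediction pretext task concrete, the toy PyTorch sketch below hides random time frames of a spectrogram and trains a small network to reconstruct them. This illustrates only the generic recipe described above; it is not an implementation of FeatEx or of the model evaluated in the paper, and the architecture and masking ratio are arbitrary.

```python
# Toy masked-prediction pretext task on spectrogram frames (illustrative only).
import torch
import torch.nn as nn

n_mels, n_frames = 128, 64
model = nn.Sequential(                       # tiny frame-wise encoder/decoder
    nn.Linear(n_mels, 256), nn.ReLU(),
    nn.Linear(256, n_mels),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

spec = torch.rand(32, n_frames, n_mels)      # batch of (time, mel) spectrograms
mask = torch.rand(32, n_frames, 1) < 0.3     # hide roughly 30% of the time frames
masked_input = spec.masked_fill(mask, 0.0)

pred = model(masked_input)
loss = ((pred - spec) ** 2)[mask.expand_as(spec)].mean()  # loss on masked frames only
opt.zero_grad()
loss.backward()
opt.step()
```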

The DCASE2023 Task 2 dataset serves as a standardized benchmark for evaluating anomalous sound detection methodologies. This dataset comprises diverse audio recordings, including both normal machine operation sounds and a variety of anomalous events, such as squeals, impacts, and rubs. It is structured to facilitate both supervised and self-supervised learning approaches, providing labeled data for training and testing, as well as unlabeled data for pre-training. Performance is typically measured using area under the receiver operating characteristic curve (AUC-ROC) and equal error rate (EER), allowing for quantitative comparison of different algorithms and configurations. The dataset’s complexity and realism contribute to its value in assessing the robustness and generalizability of developed systems in real-world industrial settings.
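The two evaluation metrics mentioned above can be computed as in the hedged sketch below, which uses scikit-learn on placeholder scores and labels rather than DCASE2023 data; the EER is obtained with a simple nearest-crossing approximation.

```python
# AUC-ROC and equal error rate on synthetic placeholder scores.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])                     # 0 = normal, 1 = anomalous
scores = np.array([0.1, 0.3, 0.2, 0.4, 0.35, 0.8, 0.7, 0.9])    # anomaly scores

auc = roc_auc_score(labels, scores)

# Equal error rate: operating point where FPR is closest to FNR (1 - TPR)
fpr, tpr, _ = roc_curve(labels, scores)
fnr = 1 - tpr
eer = fpr[np.nanargmin(np.abs(fpr - fnr))]

print(f"AUC = {auc:.3f}, EER = {eer:.3f}")
```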

XAI methods – including Integrated Gradients, Occlusion, Grad-CAM, and SmoothGrad – successfully highlight relevant regions (purple) within audio spectrograms, demonstrating their ability to identify key features driving model decisions.

Illuminating the ‘Why’: Dissecting Model Decisions with Explainable AI

Explainable AI (XAI) techniques address the need to understand the rationale behind machine learning model predictions, specifically by identifying which input time-frequency components most influence the output. This attribution process moves beyond simply knowing what a model predicts to understanding why it made that prediction. By decomposing the input signal into its constituent time-frequency representations – often utilizing techniques like Short-Time Fourier Transform (STFT) or wavelet transforms – XAI methods can pinpoint the specific spectral and temporal features driving the decision. This increased transparency is critical for building trust in model outputs, particularly in applications where decisions impact critical systems or human lives, and facilitates debugging and refinement of the model itself by revealing potential biases or unintended sensitivities.

Several Explainable AI (XAI) techniques offer varied approaches to determine feature importance in model decisions. Integrated Gradients calculates the integral of gradients along a path from a baseline input to the input of interest, attributing prediction difference to each feature. Occlusion methods systematically mask portions of the input and observe the resulting change in model output, identifying crucial regions. SmoothGrad averages gradients over multiple noisy samples, reducing noise and providing a clearer importance map. Grad-CAM, or Gradient-weighted Class Activation Mapping, utilizes the gradients flowing into the final convolutional layer to create a heatmap highlighting the input regions most relevant to a specific class. Each method assesses feature importance through a unique calculation, offering complementary insights into model behavior and enabling a more comprehensive understanding of decision-making processes.
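For readers who want to see how such attributions are typically obtained in practice, the sketch below runs Integrated Gradients and Occlusion from the Captum library on a toy spectrogram classifier. The use of Captum, the stand-in model architecture, and the occlusion window sizes are assumptions made for illustration; the paper's own tooling and settings may differ.

```python
# Hedged sketch: Integrated Gradients and Occlusion attributions with Captum
# on a toy spectrogram classifier (architecture and windows are illustrative).
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients, Occlusion

model = nn.Sequential(                          # stand-in spectrogram classifier
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 2),
)
model.eval()

spec = torch.rand(1, 1, 128, 64, requires_grad=True)   # (batch, channel, freq, time)

ig = IntegratedGradients(model)
ig_attr = ig.attribute(spec, baselines=torch.zeros_like(spec), target=1)

occ = Occlusion(model)
occ_attr = occ.attribute(spec, sliding_window_shapes=(1, 16, 8),
                         strides=(1, 8, 4), baselines=0.0, target=1)

print(ig_attr.shape, occ_attr.shape)            # both match the input shape
```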

Frequency-Band Removal Analysis is a sensitivity testing method used to determine the impact of specific frequency components on model performance. This technique involves systematically removing defined frequency bands from the input data – typically using band-stop filters – and then evaluating the resulting change in the model’s predictive accuracy or other relevant metrics. By quantifying the performance degradation associated with the removal of each frequency band, researchers can identify the frequency ranges most critical to the model’s decision-making process. The magnitude of performance change after band removal serves as a direct indicator of the model’s sensitivity to that specific frequency range, allowing for a granular understanding of feature importance beyond simple overall accuracy scores.
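A minimal sketch of this kind of sensitivity test is given below: each frequency band of a spectrogram is zeroed in turn and the change in the model's score is recorded. Zeroing spectrogram bins is used here as a stand-in for band-stop filtering of the waveform, and the `score_fn` callable and band count are hypothetical placeholders rather than the paper's exact protocol.

```python
# Sketch of frequency-band removal sensitivity analysis (illustrative placeholders).
import numpy as np

def band_sensitivity(spectrogram, score_fn, n_bands=8):
    """Return the absolute score change caused by removing each frequency band."""
    n_freq = spectrogram.shape[0]
    edges = np.linspace(0, n_freq, n_bands + 1, dtype=int)
    baseline = score_fn(spectrogram)
    deltas = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        perturbed = spectrogram.copy()
        perturbed[lo:hi, :] = 0.0            # remove this frequency band
        deltas.append(abs(score_fn(perturbed) - baseline))
    return np.array(deltas)

# Example with a dummy score function that only sums low-frequency energy
spec = np.random.rand(128, 64)
print(band_sensitivity(spec, lambda s: s[:32].sum()))
```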

Area under the curve (AUC) results demonstrate performance variations across machines when utilizing different frequency bands on the development dataset.

Validating Interpretability: Assessing Explanation Faithfulness and Machine Specificity

Determining the fidelity of an explanation method is paramount in understanding whether it genuinely reflects the reasoning behind a model’s decisions. Attribution methods aim to highlight the features most influential in a given prediction, but simply generating these explanations isn’t enough; it’s crucial to validate their accuracy. A faithful explanation aligns closely with the model’s internal decision-making process, meaning the features identified as important by the method are, in reality, the ones driving the model’s output. Without this validation, explanations risk being misleading artifacts, offering a false sense of understanding and potentially leading to incorrect interpretations or flawed reliance on the model’s behavior. Rigorous faithfulness evaluation, therefore, serves as a critical step in building trust and ensuring responsible application of machine learning models.

To rigorously assess the validity of model explanations, researchers employ Spearman Rank Correlation as a key metric. This statistical method quantifies the extent to which the ranking of features, as determined by an explanation technique, aligns with the actual importance of those features in driving model decisions. Unlike methods requiring a strict linear relationship, Spearman Correlation focuses on whether the order of feature importance is preserved – a crucial aspect when evaluating the faithfulness of an explanation. A high correlation indicates that the explanation accurately identifies the most influential features, even if it doesn’t perfectly quantify their precise impact. This approach provides a robust and interpretable measure of how well an explanation reflects the underlying reasoning process of the machine learning model, moving beyond simple accuracy metrics to assess the quality of the insight provided.
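In code, this faithfulness check reduces to a rank correlation between per-band attribution mass and per-band sensitivity from frequency removal, as in the sketch below; the two arrays are illustrative placeholders and `scipy.stats.spearmanr` is assumed as the implementation.

```python
# Rank-correlate attribution importance with measured frequency-band sensitivity.
# Both arrays are illustrative placeholders, not results from the paper.
import numpy as np
from scipy.stats import spearmanr

attribution_per_band = np.array([0.05, 0.40, 0.30, 0.10, 0.08, 0.04, 0.02, 0.01])
sensitivity_per_band = np.array([0.03, 0.35, 0.33, 0.12, 0.09, 0.05, 0.02, 0.01])

rho, pvalue = spearmanr(attribution_per_band, sensitivity_per_band)
print(f"Spearman rho = {rho:.3f} (p = {pvalue:.3f})")  # high rho = faithful ranking
```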

Investigations into anomaly detection across diverse machines revealed a nuanced relationship between model reliance and specific frequency bands within the analyzed data. The study demonstrated that each machine exhibited a unique sensitivity profile; some prioritized lower frequencies for identifying deviations, while others heavily weighted higher frequency components. This machine-specific frequency sensitivity suggests that anomaly detection models don’t universally interpret data in the same way, and their internal logic is shaped by the particular characteristics of the hardware and software configuration of each individual machine. Understanding these variations is crucial for building robust and generalizable anomaly detection systems, and for accurately interpreting the explanations generated by XAI techniques applied to these models.

The study rigorously evaluated the faithfulness of several explainable artificial intelligence (XAI) methods, revealing that Occlusion consistently provided the most accurate reflection of a model’s internal decision-making process. Researchers quantified this alignment by measuring the correlation between attribution scores – which indicate feature importance – and the actual sensitivity of the model to the removal of specific frequency bands. Occlusion achieved an overall correlation of 0.884, significantly surpassing Integrated Gradients (0.530) and SmoothGrad (0.400). This high correlation suggests Occlusion effectively identifies the frequency components that genuinely drive the model’s anomaly detection capabilities, offering a robust and reliable means of understanding its behavior and fostering greater trust in its predictions.

Analysis of explanation faithfulness revealed nuanced performance among different attribution methods. While Integrated Gradients demonstrated a moderate, yet discernible, correlation of 0.530 between attribution scores and actual feature importance, SmoothGrad exhibited a substantially lower correlation, registering at just 0.400. This suggests that SmoothGrad’s generated explanations are less reliable in reflecting the true reasoning process of the model under investigation, potentially highlighting its sensitivity to noise or its inability to accurately capture complex feature interactions. The considerable difference in correlation underscores the importance of rigorous evaluation when selecting an explainable AI technique, as not all methods consistently align with the model’s internal decision-making logic.

The pursuit of robust anomaly detection, as detailed in the framework presented, demands a rigorous assessment of not just performance, but also faithfulness. This echoes Alan Turing’s sentiment: “Sometimes people who are unaware of their own incompetence accomplish more than those who are fully aware of it.” The study highlights how seemingly effective XAI methods can be misleading, failing to accurately reflect the model’s reliance on specific frequency bands – a crucial element in machine listening. Evaluating faithfulness through perturbation, particularly Occlusion, offers a means to discern true algorithmic behavior, moving beyond superficial explanations and toward provable, scalable solutions that mirror mathematical purity. This commitment to verifiable results, rather than merely ‘working’ outputs, aligns perfectly with a dedication to demonstrable algorithmic correctness.

What’s Next?

The pursuit of ‘explainable’ artificial intelligence too often resembles alchemy – a conjuring of justifications after the fact. This work, by subjecting XAI techniques for machine listening to rigorous quantitative scrutiny via frequency-band perturbation, begins to dismantle that illusion. The finding that perturbation-based methods, specifically Occlusion, demonstrate greater faithfulness – that is, a closer correspondence between explanation and actual model reasoning – is not a triumph, but a necessary first step. It clarifies that many existing explanation methods are, at best, post-hoc rationalizations, and at worst, misleading artifacts.

However, faithfulness is not perfection. Establishing that an explanation reflects the model’s internal process does not guarantee that process is sound. A consistently wrong model, faithfully explained, remains consistently wrong. Future work must move beyond merely verifying correspondence, and begin to assess the mathematical validity of the model’s decision boundaries. What fundamental properties of the acoustic signal are being correctly – or incorrectly – identified?

In the chaos of data, only mathematical discipline endures. The next generation of XAI will not ask ‘what does the model say it is doing?’, but ‘can the model’s behaviour be proven correct, given a defined set of acoustic principles?’. Only then will ‘explainability’ transcend mere rhetoric and become a genuine tool for scientific understanding.


Original article: https://arxiv.org/pdf/2601.19017.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/


2026-01-28 20:05