Advances in Interpretable Machine Learning: Applications, Analytics, Robustness

Details

Resource 1 Download: thèse-AA-OK.pdf (50566.50 KB)
State: Public
Version: After imprimatur
License: Not specified
Serval ID
serval:BIB_BD485D788037
Type
PhD thesis
Collection
Publications
Institution
Université de Lausanne
Title
Advances in Interpretable Machine Learning: Applications, Analytics, Robustness
Author(s)
Ajalloeian Ahmad
Director(s)
Vlachos Michalis
Institution details
Université de Lausanne, Faculté des hautes études commerciales
Publication state
Writing in progress
Language
English
Abstract
In recent years, there has been a notable surge in the adoption of complex and inherently opaque Machine Learning (ML) models across diverse domains and industries. This trend has extended to high-stakes sectors such as healthcare, loan eligibility, hiring, criminal risk assessment, and self-driving vehicles. However, incorporating ML-driven decisions in such critical contexts has profound implications for human lives. The opacity inherent in complex ML model outputs raises substantial concerns encompassing security, ethics, robustness, and comprehensibility from a scientific perspective. As complex models such as deep neural networks (DNNs) become instrumental in automated decision-making that directly affects individuals, it becomes imperative to cultivate tools and methodologies that facilitate a comprehensive grasp of how these models function. This is essential to prevent undesirable outcomes, including the propagation of societal biases and erroneous predictions. In response to these imperatives, the domain of Interpretable Machine Learning (IML) has emerged, aiming to address the aforementioned concerns associated with ML models. It seeks to explain both the average behavior and the specific predictions of ML models, equipping researchers and practitioners with insights into the complex mechanisms governing model predictions. In this thesis, we investigate interpretability methods both in the context of classical ML and for explaining the predictions of deep neural networks.
In this thesis, we begin by focusing on data dimensionality reduction techniques, which are vital for understanding large-scale high-dimensional datasets, and introduce MoDE, a novel technique that generates interpretable, low-dimensional visualizations for big datasets. Going beyond preserving inter-point distances, MoDE also maintains correlations and objects' ordinal scores. This unique ability to preserve ordinal scores enables MoDE to offer interpretable visualizations of high-dimensional data in reduced dimensions, enhancing data understanding. Furthermore, our comprehensive analysis and empirical assessments highlight MoDE's computational efficiency, confirming its practicality.
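The record describes MoDE only at this high level. As a minimal illustrative sketch, assuming a hypothetical per-object score vector and a simple stress-plus-hinge objective (not the actual MoDE formulation), the code below trains a 2-D embedding that preserves pairwise distances while encouraging higher-scored objects to land further along the first axis, which is the kind of ordinal-aware layout described above.

```python
# Illustrative sketch only -- not the MoDE algorithm from the thesis.
# Toy 2-D embedding that preserves pairwise distances while encouraging
# objects with higher ordinal scores to land further along the first axis.
import torch

def pdist(Z):
    # pairwise Euclidean distances with a small epsilon for stable gradients
    diff = Z[:, None, :] - Z[None, :, :]
    return torch.sqrt((diff ** 2).sum(-1) + 1e-9)

torch.manual_seed(0)
n, d = 60, 10
X = torch.randn(n, d)                       # hypothetical high-dimensional data
scores = X @ torch.randn(d)                 # hypothetical per-object ordinal scores

D = pdist(X)                                # target pairwise distances
Y = torch.randn(n, 2, requires_grad=True)   # learned 2-D embedding
opt = torch.optim.Adam([Y], lr=0.05)

for _ in range(300):
    opt.zero_grad()
    stress = ((pdist(Y) - D) ** 2).mean()   # distance-preservation term
    # ordinal term: hinge penalty whenever a higher-scored object is not
    # placed to the right of a lower-scored one along the first axis
    order = scores[:, None] > scores[None, :]
    gap = Y[:, 0][:, None] - Y[:, 0][None, :]
    ordinal = torch.relu(0.1 - gap)[order].mean()
    (stress + ordinal).backward()
    opt.step()

# Plotting Y now gives a map that can be read left to right along the score,
# which illustrates why preserving ordinal scores aids interpretability.
```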
We proceed by examining explanation methods employed for interpreting deep neural networks (DNNs). Our analysis assesses how these methods align with established desired explanation properties. Through various evaluation protocols, we compare distinct explanation methods, revealing that no single method outperforms all others universally; the optimal choice depends on the task requiring the explanation. Additionally, we inspect the robustness of common explanation methods against adversarial attacks. Our findings demonstrate that explanation methods claimed to be robust can be manipulated by crafting attacks involving non-additive perturbations. Moreover, we propose a novel class of attacks using sparse perturbations to manipulate explanations, showcasing their potential to mitigate bias in models. These findings hold significant implications for enhancing the robustness of explanation methods.
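The attacks are only summarized here; as a hypothetical sketch (not the specific non-additive or sparse attacks proposed in the thesis), the code below perturbs at most k input features of a small stand-in classifier so that its gradient-based saliency map drifts away from the original explanation while the predicted class is pressured to stay the same. The model, feature count, and hard-thresholding projection are all assumptions made for illustration.

```python
# Hypothetical illustration only -- not the specific attacks proposed in the thesis.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Sequential(                  # small stand-in classifier (assumption)
    torch.nn.Linear(20, 32), torch.nn.Softplus(), torch.nn.Linear(32, 3)
)

def saliency(x):
    # "vanilla gradient" explanation: d(top logit)/d(input), kept differentiable
    logits = model(x)
    top = logits[0, logits.argmax()]
    return torch.autograd.grad(top, x, create_graph=True)[0]

x0 = torch.randn(1, 20)
base_pred = model(x0).argmax()
base_sal = saliency(x0.clone().requires_grad_(True)).detach()

k = 3                                          # sparse budget: at most 3 features
delta = torch.zeros_like(x0, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.05)

for _ in range(200):
    opt.zero_grad()
    x = x0 + delta                             # x depends on delta -> double backprop
    sal = saliency(x)
    push_explanation = -F.mse_loss(sal, base_sal)                   # move saliency away
    keep_prediction = F.cross_entropy(model(x), base_pred.view(1))  # keep the class
    (push_explanation + keep_prediction).backward()
    opt.step()
    with torch.no_grad():                      # L0-style projection: keep top-k entries
        cutoff = delta.abs().flatten().topk(k).values.min()
        delta[delta.abs() < cutoff] = 0.0

print("prediction unchanged:", bool(model(x0 + delta).argmax() == base_pred))
print("saliency shift (MSE):", F.mse_loss(saliency(x0 + delta), base_sal).item())
```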
Finally, as an application of IML, we devise a recommender system utilizing graph neural networks (GNNs). This recommender system excels in recommendation accuracy while also promoting recommendation interpretability. By leveraging user interaction subgraphs, we present in-depth interpretations for recommendations, highlighting the key users and items that influenced specific suggestions. This endeavor is significant because explainability research is scarce in the area of recommender systems. Moreover, interpretability can help business owners improve the recommender systems on their platforms.
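As a toy illustration of subgraph-based recommendation explanations (not the GNN model developed in the thesis), the sketch below works on a hand-made bipartite interaction graph: it recommends an item through a two-hop walk and reports the neighbouring users and shared items as the explanation. All user and item names are hypothetical.

```python
# Hypothetical illustration only -- not the GNN recommender from the thesis.
# Explaining a recommendation via the user's interaction subgraph: the
# neighbouring users and shared items that made the suggestion plausible.
from collections import Counter, defaultdict

interactions = [                      # toy bipartite user-item graph (assumption)
    ("alice", "book_a"), ("alice", "book_b"),
    ("bob",   "book_b"), ("bob",   "book_c"),
    ("carol", "book_b"), ("carol", "book_c"), ("carol", "book_d"),
]

items_of = defaultdict(set)
users_of = defaultdict(set)
for u, i in interactions:
    items_of[u].add(i)
    users_of[i].add(u)

def recommend_with_explanation(user):
    # 2-hop walk on the interaction graph: user -> items -> similar users -> items
    neighbours = {v for i in items_of[user] for v in users_of[i]} - {user}
    candidates = Counter(
        i for v in neighbours for i in items_of[v] if i not in items_of[user]
    )
    item, _ = candidates.most_common(1)[0]
    # evidence: which neighbours interacted with the recommended item,
    # and which items they share with the target user
    evidence = {v: items_of[user] & items_of[v] for v in users_of[item] & neighbours}
    return item, evidence

item, evidence = recommend_with_explanation("alice")
print(f"recommend {item!r} because:")
for neighbour, shared in evidence.items():
    print(f"  {neighbour} also liked it and shares {sorted(shared)} with alice")
```

In a GNN-based recommender, learned message passing would replace this simple counting step, but the user interaction subgraph remains the evidence surfaced to explain a suggestion, as described above.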
To conclude, we provide a comprehensive discussion of the significance of explainability methods and of strategies for enhancing our understanding of their mechanisms. We also discuss future directions for improving ML explanations. We believe that interpretability is a pivotal concern that will become more prominent in the years ahead. With this work, we strive to contribute to the ongoing journey towards more transparent and accountable machine learning.
Create date
07/12/2023 17:40
Last modification date
31/01/2024 8:36
Usage data