Explainability – A Must-Have Skill in a Data Scientist's Toolkit

By Data Science Salon

Many organizations have been generating and collecting data for years, only to realize that data assets quickly turn into a liability when they are not managed well.

Despite building the best infrastructure platform and possessing the technical skills to model the data, enterprises often hit snags when they cannot make sense of their data. The problem only gets worse when AI enters the picture. AI and machine learning (ML) algorithms are largely black-box in nature, i.e., they do not explain how a particular output was arrived at.

With increasing use cases that capitalize on the power of AI and ML algorithms, explainable models have become key enablers for organizations to truly become insights-driven.

Need for Explainable Models

Explainable models overcome the limitation of the black-box nature of ML models – they are designed to make machine learning models interpretable. The glass-box characteristic of explainable models helps the stakeholders understand the decision-making process (Figure 1). Linear regression, decision trees, and k-nearest neighbors are some examples of explainable models.

Explainability assists the developer as well as the user in answering questions like:

  • When to trust the model and when not to?
  • How did the model arrive at a particular prediction?
  • What change in the input vector would change the model outcome (commonly known as a flipped prediction)?
  • Why did the model not yield another possible output, for example, why did it predict class A and not class B?
  • How would a developer know that the model is making mistakes so that it can be corrected?

Figure 1. XAI Concept. Source

Importance of Explainability

Explainability undeniably helps turn predictions into insights. It also brings additional benefits, such as:

  • Trust: For people to trust a machine learning model and use it to make decisions, they need to understand its inner workings. Explainable models show how the model arrived at its predictions, which in turn builds trust in the algorithmic solution.
  • Fairness: Understanding model predictions acts as a proxy for auditing whether the model has learned the right and intended patterns. Model explainability thus helps verify that the model treats all strata of society fairly. Biased behavior can be identified and corrected sooner, which supports ethical compliance.
  • Compliance: Regulations in high-risk, high-impact industries such as finance and healthcare often require transparency around model decisions, which explainable models help provide.
  • Debugging: With complex, black-box models, it can be difficult to understand the factors contributing to predictions. Explainable models surface potential bugs during error analysis and give developers a lens through which to correct the model.
  • Human-AI Collaboration: In some tasks, humans and AI can work together more efficiently when AI models are more explainable. This extends the human-in-the-loop concept: knowing the reason for the model’s output fuels the collaboration.

Common Explainability Methods

The degree of explainability depends on two key factors: the model complexity and the assumptions about the data. There are two broad ways to explain model outcomes:

Global Explainability Frameworks

Global frameworks provide a high-level overview of a model's behavior. They are commonly used to understand how a model works as a whole and how it makes predictions across a wide range of data points. Feature importance, model accuracy, and decision boundaries are some of the popular tools for understanding the model’s behavior at an aggregate level.
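One common way to estimate feature importance at the global level is permutation importance: shuffle one feature column at a time and measure how much the model's error grows. The following is a minimal sketch under stated assumptions, not a reference implementation. The function name, the mean-squared-error metric, `toy_model`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in mean squared error when each feature column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict(X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its relationship with the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(np.mean((predict(X_perm) - y) ** 2) - baseline)
        importances[j] = np.mean(increases)
    return importances

# Hypothetical model: depends strongly on feature 0, weakly on feature 1,
# and not at all on feature 2.
def toy_model(X):
    return 3.0 * X[:, 0] + 0.5 * X[:, 1]

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = toy_model(X)

scores = permutation_importance(toy_model, X, y)
print(scores)
```

A larger score means the model relies more heavily on that feature across the whole dataset, which is what makes this a global rather than a local explanation.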

Local Explainability Frameworks

Also called instance-level or case-based explanations, local frameworks focus on explaining how the model arrived at a prediction for a single record. Frequently used local explainability frameworks include SHAP, LIME, and counterfactual analysis.
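Counterfactual analysis can be illustrated with a brute-force search for the smallest single-feature change that flips a prediction. This is only a sketch of the idea: the grid search, the `find_counterfactual` helper, and the `approve_loan` rule are hypothetical, and practical counterfactual methods search over several features at once and add plausibility constraints.

```python
def find_counterfactual(predict, x, step=0.1, max_steps=100):
    """Return (feature_index, new_row) for the smallest single-feature change,
    measured in grid steps, that flips predict(x). Returns None if none is found."""
    original = predict(x)
    best = None
    for j in range(len(x)):
        for direction in (+1, -1):
            for k in range(1, max_steps + 1):
                candidate = list(x)
                candidate[j] = x[j] + direction * k * step
                if predict(candidate) != original:
                    # Keep the flip that needs the fewest steps overall.
                    if best is None or k < best[0]:
                        best = (k, j, candidate)
                    break
    if best is None:
        return None
    _, feature, candidate = best
    return feature, candidate

# Hypothetical decision rule standing in for a trained classifier.
def approve_loan(row):
    income, debt = row
    return income - 2.0 * debt > 0.95

x = [2.0, 1.0]  # this applicant is rejected
feature, counterfactual = find_counterfactual(approve_loan, x)
print(feature, counterfactual)
```

In this toy case the search reports that lowering the second feature (debt) flips the decision with a smaller change than raising the first (income).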

In a nutshell, the global explainability frameworks provide a general overview of a model's behavior and decision-making process, whereas local explainability frameworks provide granular and detailed information on how the model arrived at a specific prediction.

Let’s discuss some of the key explainability frameworks in detail.

Local Interpretable Model-Agnostic Explanations (LIME)

LIME trains a simple interpretable model, such as a linear model or a decision tree, to approximate the behavior of the complex model, but only locally around a specific data point. This allows LIME to explain the predictions of the complex model for a specific input without needing to understand the model's decision-making process as a whole.

So how does LIME perform the approximation? It first generates a set of "perturbed" versions of the input data point, each with small, random changes to the input features. These perturbed versions are passed through the complex model to get the corresponding predictions. LIME then fits a simple interpretable model to those predictions, weighting each perturbed sample by its proximity to the original point, to approximate the complex model's decision behavior locally.
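The steps above can be sketched from scratch with NumPy. This is an illustration of the idea under simplifying assumptions, not the `lime` package's implementation: the Gaussian perturbations, the exponential proximity kernel, the `scale` and `kernel_width` parameters, and the `black_box` function are all choices made for this example.

```python
import numpy as np

def lime_sketch(predict, x, n_samples=2000, scale=0.5, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate to `predict` around the point x."""
    rng = np.random.default_rng(seed)
    # 1. Perturb: sample points around x with small random changes.
    Z = x + rng.normal(scale=scale, size=(n_samples, x.size))
    # 2. Query the complex model on the perturbed points.
    preds = predict(Z)
    # 3. Weight each sample by its proximity to x (closer means heavier).
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Weighted least squares: intercept plus one coefficient per feature.
    A = np.column_stack([np.ones(n_samples), Z - x])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], preds * sw, rcond=None)
    return coef[0], coef[1:]

# Hypothetical black-box model: nonlinear in features 0 and 1, ignores feature 2.
def black_box(Z):
    return Z[:, 0] ** 2 + np.sin(Z[:, 1])

x = np.array([1.0, 0.0, 2.0])
intercept, weights = lime_sketch(black_box, x)
print(weights)
```

Around this point the surrogate's coefficients land close to the local slopes of the black-box function, roughly 2 for the first feature and somewhat below 1 for the second, while the ignored third feature receives a coefficient near zero.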

Once the interpretable model is trained, it can be used to explain the complex model's prediction by showing the relationship between the input features and the predicted value. The importance of each feature in the explanation is based on the weights of the interpretable model. Besides providing insights into how the complex model arrived at its prediction for the specific input, such a framework also highlights the features that matter most for that prediction, as shown in the illustration below:


Figure 2. Source


Figure 2 shows an instance where the model predicted a poisonous substance based on features like odor and gill-size. The attributes shown in orange push the prediction toward poisonous, while the attributes in blue pull it toward edible. Here, a foul odor strongly indicates a poisonous substance, while a broad gill size weakly signals an edible one.

LIME is model-agnostic and can be used to interpret any machine learning model, including deep neural networks and gradient boosting models. It can be installed with the command below.

pip install lime

SHapley Additive exPlanations (SHAP)

SHAP is another local interpretability framework, based on concepts from cooperative game theory. For a given prediction, it assigns each feature a value such that the values for all features, added to a base value (the average model prediction), sum to the model's prediction. A feature's value is its contribution averaged over all possible coalitions of features: SHAP considers every subset of the other features and measures how much adding the feature to that subset changes the prediction.

For each subset, the feature's contribution is the difference between:

  • The expected prediction when the feature is added to the subset, and

  • The expected prediction for the subset without the feature.

SHAP values are unique, consistent, and model-agnostic, which makes them a good tool for understanding how different features contribute to a model's predictions. SHAP also tends to handle correlated features more gracefully than simpler importance measures, although strong correlations can still make individual values harder to interpret.
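To make the coalition averaging concrete, the sketch below computes exact Shapley values by enumerating every subset of features. It is an illustration under simplifying assumptions, not how the `shap` library works internally: the `model` function is hypothetical, and an "absent" feature is replaced by a single background value, whereas in practice the expectation is usually taken over a background dataset and approximated, because exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values for a coalition value function value_fn(frozenset)."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                S = frozenset(subset)
                # Marginal contribution of feature i when added to coalition S.
                phi[i] += weight * (value_fn(S | {i}) - value_fn(S))
    return phi

# Hypothetical model with an additive term and an interaction term.
def model(row):
    return 2.0 * row[0] + row[1] * row[2]

x = [1.0, 2.0, 3.0]           # instance to explain
background = [0.0, 0.0, 0.0]  # assumed stand-in values for absent features

def value_fn(S):
    # Features in S take their real values; the rest take background values.
    mixed = [x[j] if j in S else background[j] for j in range(len(x))]
    return model(mixed)

phi = shapley_values(value_fn, len(x))
base_value = value_fn(frozenset())
print(phi, base_value)
```

In this toy case the first feature receives its full additive effect, the two interacting features split their joint effect equally, and the base value plus the sum of the feature values equals the model's prediction for `x`.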


The Python library for SHAP can be installed with the command below.

pip install shap

A popular way to visualize SHAP values is with force plots. Feature values shown in red push the output higher, whereas those in blue pull it lower.

Figure 3. Source


Recent times have witnessed an increased need and demand for explainable frameworks in the AI and machine learning community. This post highlighted the importance of explainability in ML solutions followed by the various classes of explainable models.

Understanding the different types of explainability frameworks, along with an intuition for their internal workings, is one of the most important, must-have skills for AI practitioners. Not only do such frameworks help developers debug models effectively, but they are also crucial for explaining model behavior to technical and non-technical stakeholders alike.
