Hi,
Some months have passed since my last post. Model explainability is easy for simple models (linear regression, decision trees), and some tools exist for more complex algorithms (ensemble trees). With this post, I want to dig into the tools for interpreting more complex models. For a deeper theoretical understanding, I highly recommend the book Interpretable Machine Learning by Christoph Molnar. All of the approaches to model explainability below are demonstrated with a RandomForest model in this Kaggle notebook.
Complex models produce helpful predictions but often behave as a black box, meaning there is no straightforward explanation of the model's behavior. However, stakeholders often request an explanation of the model's predictions to check for causality and to gain trust in the model.
Feature importance
Feature importance describes how useful a feature is at predicting the target variable. It also takes into account all interactions with other features. Additionally, feature importance can be used for dimensionality reduction and feature selection. There are two ways of computing feature importance: via impurity importance (mean decrease in impurity) and via permutation importance (mean decrease in accuracy). Permutation importance is model-agnostic and should be preferred. A general drawback are correlated features: both features will score a lower importance than they deserve, even though they may be important. Please have a look at the scikit-learn example Permutation Importance with Multicollinear or Correlated Features.
The RandomForest implementation has a built-in feature importance method. The impurity importance of a variable is the sum of the impurity decrease over all trees whenever it is selected to split a node. Impurity is quantified by the splitting criterion of the decision trees (e.g. Gini, Log Loss, or Mean Squared Error). A weakness of this method shows up with overfitting: it can assign high importance to features that don't generalize to unseen data, which doesn't represent reality. Additionally, impurity-based feature importance is strongly biased towards high-cardinality features (numerical features) and assigns lower scores to low-cardinality features (binary features, categorical variables).
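As a rough, minimal sketch of how the built-in impurity importance can be read from a fitted RandomForest (using a synthetic dataset and made-up feature names here, not the notebook's data):

```python
# Minimal sketch: impurity-based importance from a fitted RandomForest
# (synthetic data here; the Kaggle notebook uses the real dataset).
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=6, random_state=0)
X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Mean decrease in impurity, averaged over all trees (the scores sum to 1.0)
impurity_importance = pd.Series(model.feature_importances_, index=X.columns)
print(impurity_importance.sort_values(ascending=False))
```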
Permutation-based feature importance does not exhibit such a bias (see the scikit-learn inspection module). The permutation importance of a variable is calculated by randomly permuting a single feature and measuring the effect on the model output. This random permutation breaks the link between feature and target. It also makes the interpretation of the results easy, since the feature importance shows the increase in model error when the information in this feature is destroyed.
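A similar sketch for permutation importance with scikit-learn's inspection module, again on synthetic data, might look like this:

```python
# Minimal sketch: permutation importance on a held-out set,
# which avoids the overfitting bias of impurity importance.
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=6, random_state=0)
X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

# Shuffle each feature n_repeats times and record the drop in the score (R^2 here)
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
perm_importance = pd.Series(result.importances_mean, index=X.columns)
print(perm_importance.sort_values(ascending=False))
```

Computing the importance on a held-out set is a deliberate choice here: it shows which features actually help on unseen data.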
You can find additional information here:
- Random Forest by Breiman 2001
- All Models Are Wrong, but Many are Useful: Learning a Variable’s Importance by Studying an Entire Class of Prediction Models Simultaneously by Fisher et al. 2018
- Scikit-learn implementation of permutation importance, but not easy to read/understand
Partial Dependence Plot
The partial dependence plot (short PDP or PD plot) shows how one or two features affect the model prediction. It shows the relationship between the target and a feature by plotting the average prediction for each value of that feature. For the impact of individual samples, you can use ICE plots.
The interpretation of partial dependence plots is very intuitive: they show how the average prediction changes when a feature value is changed. However, this only works well when the features are uncorrelated. With correlated features, the computation averages over data points that are highly unlikely in reality. Partial dependence is also limited to one or two features, since more dimensions are hard to visualize.
Like a PDP, an individual conditional expectation (ICE) plot shows the dependence between the target and a feature. However, unlike a PDP, which shows the average effect of the input feature, an ICE plot visualizes the dependence of the prediction on a feature for each sample separately, with one line per sample. The advantage is that the ICE plot provides more insight into heterogeneous effects and feature interactions that the PDP's average would hide. The calculation takes a while, so only one feature is displayed here.
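A minimal sketch of how PDP and ICE curves for a single feature could be drawn with scikit-learn (again synthetic data and an arbitrary feature name, not the notebook's setup):

```python
# Minimal sketch: PDP and ICE curves for one feature with scikit-learn.
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = make_regression(n_samples=500, n_features=6, random_state=0)
X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# kind="both" overlays the average curve (PDP) on the per-sample ICE lines
PartialDependenceDisplay.from_estimator(model, X, features=["feature_0"], kind="both")
plt.show()
```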
Shapley values
SHAP (SHapley Additive exPlanations) is a method to explain individual predictions. It is based on Shapley values, which originate from cooperative game theory: a method for assigning payouts to players depending on their contribution to the total payout. Players cooperate in a coalition and receive a profit from this cooperation. In machine learning, the "payout" is the difference between a single prediction and the average prediction, and the "players" are the feature values. The Shapley value of a feature is its average contribution across all possible coalitions. Keep in mind that this calculation can be computationally expensive. The difference between the prediction and the average prediction is fairly distributed among the feature values, which allows a contrastive explanation down to one single data point. The Shapley value returns a simple value per feature, but no prediction model, so SHAP cannot be used to make statements about how the prediction would change for changes in the input. Another disadvantage is that the calculation needs access to the training data. And like other permutation-based methods, Shapley values suffer from correlated features. A short usage sketch follows the links below.
You can find additional information here:
- A Unified Approach to Interpreting Model Predictions by Lundberg and Lee 2017
- SHAP implementation
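As a rough sketch, and not the exact code of the notebook, SHAP values for a RandomForest could be computed with the shap library like this (synthetic data, arbitrary model settings):

```python
# Minimal sketch: SHAP values for a RandomForest via the shap library's
# tree-specific explainer (synthetic data here).
import numpy as np
import pandas as pd
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=6, random_state=0)
X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # one value per sample and per feature

# Global view: distribution of each feature's contribution across all samples
shap.summary_plot(shap_values, X)

# Local view: how the feature values push one prediction away from the
# average prediction (the explainer's expected value)
base_value = float(np.ravel(explainer.expected_value)[0])
shap.force_plot(base_value, shap_values[0], X.iloc[0], matplotlib=True)
```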
LIME
Local interpretable model-agnostic explanations (LIME) is a concrete implementation of local surrogate models, proposed by Ribeiro et al., to explain individual predictions of machine learning models. LIME trains simple models to approximate the predictions of the underlying model. As a first step, a new dataset is generated, consisting of perturbed samples with the corresponding predictions of the model. On this new dataset, LIME trains an interpretable model (decision tree, linear regression), which is weighted by the proximity of the sampled instances to the instance of interest.
One problem is that the generation of the perturbed dataset has its weaknesses, and multiple settings need to be tested before a final LIME model can be used. Another problem is that the explanations of two close points can vary greatly. This instability means that any explanation should be evaluated critically.
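A minimal sketch of a LIME explanation for a single prediction with the lime package (again synthetic data and arbitrary settings, not the notebook's setup):

```python
# Minimal sketch: a LIME explanation for one prediction (regression mode).
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=6, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(X, feature_names=feature_names, mode="regression")

# Perturb samples around one instance and fit a weighted linear surrogate model
explanation = explainer.explain_instance(X[0], model.predict, num_features=6)
print(explanation.as_list())  # per-feature contributions of the local surrogate
```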
You can find additional information here:
- “Why Should I Trust You?”: Explaining the Predictions of Any Classifier by Ribeiro et al. 2016
- LIME implementation for tabular data
Thank you for your attention.