omnixai.explanations.tabular package
The package contains the following modules:
- omnixai.explanations.tabular.feature_importance: Feature importance explanations.
- omnixai.explanations.tabular.counterfactual: Counterfactual explanations.
- omnixai.explanations.tabular.pdp: Partial dependence plots.
- omnixai.explanations.tabular.sensitivity: Morris sensitivity analysis.
- omnixai.explanations.tabular.linear: Explanations for linear models.
- omnixai.explanations.tabular.tree: Explanations for tree-based models.
- omnixai.explanations.tabular.correlation: Feature correlation analysis.
- omnixai.explanations.tabular.imbalance: Feature imbalance plots.
- omnixai.explanations.tabular.validity: Validity ranking explanations.
omnixai.explanations.tabular.feature_importance module
Feature importance explanations.
- class omnixai.explanations.tabular.feature_importance.FeatureImportance(mode, explanations=None)
Bases:
ExplanationBase
The class for feature importance explanations. It uses a list to store the feature importance explanations of the input instances. Each item in the list is a dict with the following format {“instance”: the input instance, “features”: a list of feature names, “values”: a list of feature values, “scores”: a list of feature importance scores}. If the task is classification, the dict has an additional entry {“target_label”: the predicted label of the input instance}.
- Parameters
mode – The task type, e.g., classification or regression.
explanations – The explanation results for initializing FeatureImportance, which is optional.
- add(instance, target_label, feature_names, feature_values, importance_scores, sort=False, **kwargs)
Adds the generated explanation corresponding to one instance.
- Parameters
instance – The instance to explain.
target_label – The label to explain, which is ignored for regression.
feature_names – The list of the feature column names.
feature_values – The list of the feature values.
importance_scores – The list of the feature importance scores.
sort – Whether to sort the features by their importance scores.
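Example (a minimal usage sketch; the instance, feature names, values, and scores below are hypothetical and would normally be produced by one of the tabular explainers):
```python
import pandas as pd
from omnixai.explanations.tabular.feature_importance import FeatureImportance

# Hypothetical single-instance explanation for a classification task.
instance = pd.DataFrame([{"age": 35, "income": 52000}])

explanations = FeatureImportance(mode="classification")
explanations.add(
    instance=instance,
    target_label=1,                      # the predicted label; ignored for regression
    feature_names=["age", "income"],
    feature_values=[35, 52000],
    importance_scores=[0.7, 0.3],
    sort=True,                           # sort the features by importance score
)

# The stored dict for the first instance has the format:
# {"instance": ..., "features": [...], "values": [...], "scores": [...], "target_label": 1}
print(explanations.get_explanations(index=0))
```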
- get_explanations(index=None)
Gets the generated explanations.
- Parameters
index – The index of an explanation result stored in FeatureImportance. When index is None, the function returns a list of all the explanations.
- Returns
The explanation for one specific instance (a dict) or the explanations for all the instances (a list of dicts). Each dict has the following format: {“instance”: the input instance, “features”: a list of feature names, “values”: a list of feature values, “scores”: a list of feature importance scores}. If the task is classification, the dict has an additional entry {“target_label”: the predicted label of the input instance}.
- Return type
Union[Dict, List]
- plot(index=None, class_names=None, num_features=20, max_num_subplots=4, **kwargs)
Plots feature importance scores.
- Parameters
index – The index of an explanation result stored in FeatureImportance, e.g., it will plot the first explanation result when index = 0. When index is None, it shows a figure with max_num_subplots subplots, where each subplot plots the feature importance scores for one instance.
class_names – A list of the class names indexed by the labels, e.g., class_names = ['dog', 'cat'] means that label 0 corresponds to ‘dog’ and label 1 corresponds to ‘cat’.
num_features – The maximum number of features to plot.
max_num_subplots – The maximum number of subplots in the figure.
- Returns
A matplotlib figure plotting feature importance scores.
- plotly_plot(index=0, class_names=None, num_features=20, **kwargs)
Plots feature importance scores for one specific instance using Dash.
- Parameters
index – The index of an explanation result stored in FeatureImportance, which cannot be None, e.g., it will plot the first explanation result when index = 0.
class_names – A list of the class names indexed by the labels, e.g., class_names = ['dog', 'cat'] means that label 0 corresponds to ‘dog’ and label 1 corresponds to ‘cat’.
num_features – The maximum number of features to plot.
- Returns
A plotly dash figure plotting feature importance scores.
- ipython_plot(index=0, class_names=None, num_features=20, **kwargs)
Plots the feature importance scores in IPython.
- Parameters
index – The index of an explanation result stored in FeatureImportance, which cannot be None, e.g., it will plot the first explanation result when index = 0.
class_names – A list of the class names indexed by the labels, e.g., class_names = ['dog', 'cat'] means that label 0 corresponds to ‘dog’ and label 1 corresponds to ‘cat’.
num_features – The maximum number of features to plot.
- classmethod from_dict(d)
- class omnixai.explanations.tabular.feature_importance.GlobalFeatureImportance
Bases:
ExplanationBase
The class for global feature importance scores. It uses a dict to store the feature importance scores with the following format {“features”: a list of feature names, “scores”: a list of feature importance scores}.
- add(feature_names, importance_scores, sort=False, **kwargs)
Adds the generated feature importance scores.
- Parameters
feature_names – The list of the feature column names.
importance_scores – The list of the feature importance scores.
sort – Whether to sort the features by their importance scores.
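Example (a minimal sketch with hypothetical feature names and scores):
```python
from omnixai.explanations.tabular.feature_importance import GlobalFeatureImportance

explanations = GlobalFeatureImportance()
explanations.add(
    feature_names=["age", "income", "zipcode"],   # hypothetical features
    importance_scores=[0.5, 0.3, 0.2],            # hypothetical global scores
    sort=True,
)

# {"features": [...], "scores": [...]}
print(explanations.get_explanations())
fig = explanations.plot(num_features=10)          # matplotlib figure of the top scores
```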
- get_explanations()
Gets the generated explanations.
- Returns
The feature importance scores. The returned dict has the following format: {“features”: a list of feature names, “scores”: a list of feature importance scores}.
- Return type
Dict
- plot(num_features=20, truncate_long_features=True, **kwargs)
Plots feature importance scores.
- Parameters
num_features – The maximum number of features to plot.
truncate_long_features – Whether to truncate long feature names.
- Returns
A matplotlib figure plotting feature importance scores.
- plotly_plot(num_features=20, truncate_long_features=True, **kwargs)
Plots the global feature importance scores using Dash.
- Parameters
num_features – The maximum number of features to plot.
truncate_long_features – Whether to truncate long feature names.
- Returns
A plotly dash figure plotting feature importance scores.
- ipython_plot(num_features=20, truncate_long_features=True, **kwargs)
Plots the feature importance scores in IPython.
- Parameters
num_features – The maximum number of features to plot.
truncate_long_features – Whether to truncate long feature names.
- classmethod from_dict(d)
omnixai.explanations.tabular.counterfactual module
Counterfactual explanations.
- class omnixai.explanations.tabular.counterfactual.CFExplanation(explanations=None)
Bases:
ExplanationBase
The class for counterfactual explanation results. It uses a list to store the generated counterfactual examples. Each item in the list is a dict with the following format: {“query”: the original input instance, “counterfactual”: the generated counterfactual examples}. Both “query” and “counterfactual” are pandas dataframes with an additional column “label” which stores the predicted labels of these instances.
- add(query, cfs, **kwargs)
Adds the generated explanation corresponding to one instance.
- Parameters
query – The instance to explain.
cfs – The generated counterfactual examples.
kwargs – Additional information to store.
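Example (a minimal sketch with a hypothetical query instance and one generated counterfactual; both dataframes carry the “label” column described above):
```python
import pandas as pd
from omnixai.explanations.tabular.counterfactual import CFExplanation

# Hypothetical query instance and one counterfactual example.
query = pd.DataFrame([{"age": 35, "income": 52000, "label": 0}])
cfs = pd.DataFrame([{"age": 35, "income": 68000, "label": 1}])

explanations = CFExplanation()
explanations.add(query=query, cfs=cfs)

# {"query": <dataframe>, "counterfactual": <dataframe>}
print(explanations.get_explanations(index=0))
```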
- get_explanations(index=None)
Gets the generated counterfactual explanations.
- Parameters
index – The index of an explanation result stored in CFExplanation. When it is None, it returns a list of all the explanations.
- Returns
The explanation for one specific instance (a dict) or all the explanations for all the instances (a list). Each dict has the following format: {“query”: the original input instance, “counterfactual”: the generated counterfactual examples}. Both “query” and “counterfactual” are pandas dataframes with an additional column “label” which stores the predicted labels of these instances.
- Return type
Union[Dict, List]
- plot(index=None, class_names=None, font_size=10, **kwargs)
Returns a list of matplotlib figures showing the explanations of either one instance or the first 5 instances.
- Parameters
index – The index of an explanation result stored in CFExplanation. For example, it will plot the first explanation result when index = 0. When index is None, it plots the explanations of the first 5 instances.
class_names – A list of the class names indexed by the labels, e.g., class_names = ['dog', 'cat'] means that label 0 corresponds to ‘dog’ and label 1 corresponds to ‘cat’.
font_size – The font size of table entries.
- Returns
A list of matplotlib figures plotting counterfactual examples.
- plotly_plot(index=0, class_names=None, **kwargs)
Plots the generated counterfactual examples in Dash.
- Parameters
index – The index of an explanation result stored in CFExplanation, which cannot be None, e.g., it will plot the first explanation result when index = 0.
class_names – A list of the class names indexed by the labels, e.g., class_names = ['dog', 'cat'] means that label 0 corresponds to ‘dog’ and label 1 corresponds to ‘cat’.
- Returns
A plotly dash figure showing the counterfactual examples.
- ipython_plot(index=0, class_names=None, **kwargs)
Plots the generated counterfactual examples in IPython.
- Parameters
index – The index of an explanation result stored in CFExplanation, which cannot be None, e.g., it will plot the first explanation result when index = 0.
class_names – A list of the class names indexed by the labels, e.g., class_names = ['dog', 'cat'] means that label 0 corresponds to ‘dog’ and label 1 corresponds to ‘cat’.
- classmethod from_dict(d)
omnixai.explanations.tabular.pdp module
Partial dependence plots.
- class omnixai.explanations.tabular.pdp.PDPExplanation(mode)
Bases:
ExplanationBase
The class for PDP explanation results. The PDP explanation results are stored in a dict. The key in the dict is “global” indicating PDP is a global explanation method. The value in the dict is another dict with the following format: {feature_name: {“values”: the PDP grid values, “scores”: the average PDP scores, “sampled_scores”: the PDP scores computed with Monte-Carlo samples}}.
- Parameters
mode – The task type, e.g., classification or regression.
- add(feature_name, values, scores, sampled_scores=None)
Adds the raw values of the partial dependence function corresponding to one specific feature.
- Parameters
feature_name – The feature column name.
values – The feature values.
scores – The average PDP scores corresponding to the values.
sampled_scores – The PDP scores computed with Monte-Carlo samples.
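Example (a minimal sketch for a regression task; the grid values and average PDP scores are hypothetical and would normally come from the PDP explainer):
```python
from omnixai.explanations.tabular.pdp import PDPExplanation

explanations = PDPExplanation(mode="regression")
explanations.add(
    feature_name="age",
    values=[20, 30, 40, 50, 60],              # hypothetical PDP grid values
    scores=[0.10, 0.22, 0.41, 0.50, 0.55],    # average PDP score at each grid value
)

# {"age": {"values": [...], "scores": [...], ...}}
print(explanations.get_explanations())
```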
- get_explanations()
Gets the partial dependence scores.
- Returns
A dict containing the partial dependence scores of all the studied features with the following format: {feature_name: {“values”: the feature values, “scores”: the average PDP scores, “sampled_scores”: the PDP scores computed with Monte-Carlo samples}}.
- plot(class_names=None, **kwargs)
Returns a matplotlib figure showing the PDP explanations.
- Parameters
class_names – A list of the class names indexed by the labels, e.g., class_names = ['dog', 'cat'] means that label 0 corresponds to ‘dog’ and label 1 corresponds to ‘cat’.
- Returns
A matplotlib figure plotting PDP explanations.
- plotly_plot(class_names=None, **kwargs)
Returns a plotly dash figure showing the PDP explanations.
- Parameters
class_names – A list of the class names indexed by the labels, e.g., class_names = ['dog', 'cat'] means that label 0 corresponds to ‘dog’ and label 1 corresponds to ‘cat’.
- Returns
A plotly dash figure plotting PDP explanations.
- ipython_plot(class_names=None, **kwargs)
Shows the partial dependence plots in IPython.
- Parameters
class_names – A list of the class names indexed by the labels, e.g., class_names = ['dog', 'cat'] means that label 0 corresponds to ‘dog’ and label 1 corresponds to ‘cat’.
- classmethod from_dict(d)
omnixai.explanations.tabular.sensitivity module
Morris sensitivity analysis.
- class omnixai.explanations.tabular.sensitivity.SensitivityExplanation
Bases:
ExplanationBase
The class for sensitivity analysis results. The results are stored in a dict with the following format: {feature_name: {“mu”: Morris mu, “mu_star”: Morris mu_star, “sigma”: Morris sigma, “mu_star_conf”: Morris mu_star_conf}}.
- add(feature_name, mu, mu_star, sigma, mu_star_conf)
Adds the sensitivity analysis result of a specific feature.
- Parameters
feature_name – The feature column name.
mu – The Morris mu, i.e., the mean of the elementary effects.
mu_star – The Morris mu_star, i.e., the mean of the absolute elementary effects.
sigma – The Morris sigma, i.e., the standard deviation of the elementary effects.
mu_star_conf – The Morris mu_star_conf, i.e., the confidence interval of mu_star.
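Example (a minimal sketch storing hypothetical Morris indices for one feature):
```python
from omnixai.explanations.tabular.sensitivity import SensitivityExplanation

explanations = SensitivityExplanation()
explanations.add(
    feature_name="age",
    mu=0.12,            # hypothetical mean of the elementary effects
    mu_star=0.15,       # mean of the absolute elementary effects
    sigma=0.05,         # standard deviation of the elementary effects
    mu_star_conf=0.02,  # confidence interval of mu_star
)

# {"age": {"mu": 0.12, "mu_star": 0.15, "sigma": 0.05, "mu_star_conf": 0.02}}
print(explanations.get_explanations())
```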
- get_explanations()
Gets the Morris sensitivity analysis results.
- Returns
A dict containing the Morris sensitivity analysis results for all the features with the following format: {feature_name: {“mu”: Morris mu, “mu_star”: Morris mu_star, “sigma”: Morris sigma, “mu_star_conf”: Morris mu_star_conf}}.
- Return type
Dict
- plot(**kwargs)
Returns a matplotlib figure showing the sensitivity analysis results.
- Returns
A matplotlib figure.
- plotly_plot(**kwargs)
Returns a plotly dash figure showing sensitivity analysis results.
- ipython_plot(**kwargs)
Plots sensitivity analysis results in IPython.
- classmethod from_dict(d)
omnixai.explanations.tabular.linear module
Explanations for linear models.
- class omnixai.explanations.tabular.linear.LinearExplanation(mode)
Bases:
ExplanationBase
The class for explanation results for linear models. The results are stored in a dict with the following format: {“coefficients”: the linear coefficients, “scores”: the feature importance scores of a batch of instances, “outputs”: the predicted values of a batch of instances}. The value of “scores” is a dict whose keys are feature names and values are feature importance scores.
- Parameters
mode – The task type, e.g., classification or regression.
- add(coefficients, importance_scores, outputs)
Adds the generated explanation corresponding to one instance.
- Parameters
coefficients – Linear coefficients.
importance_scores – Feature importance scores, e.g., feature value * coefficient.
outputs – The predictions.
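Example (a minimal sketch; the exact shapes of coefficients, importance_scores, and outputs are determined by the linear-model explainer, so the values below are only assumptions about how the results are stored):
```python
from omnixai.explanations.tabular.linear import LinearExplanation

explanations = LinearExplanation(mode="regression")
# Hypothetical results for a batch with a single instance; in practice these are
# produced by the linear-model explainer rather than written by hand.
explanations.add(
    coefficients={"age": 0.8, "income": -0.2},          # linear coefficients
    importance_scores={"age": [0.4], "income": [-0.1]}, # e.g., feature value * coefficient
    outputs=[0.3],                                       # model predictions
)

# {"coefficients": ..., "scores": ..., "outputs": ...}
print(explanations.get_explanations())
```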
- get_explanations()
Gets the generated explanations.
- Returns
A dict containing the global explanation, i.e., the linear coefficients, and the local explanations for all the instances, i.e., feature importance scores, with the following format: {“coefficients”: the linear coefficients, “scores”: the feature importance scores of a batch of instances, “outputs”: the predicted values of a batch of instances}. The value of “scores” is a dict whose keys are feature names and values are feature importance scores.
- plot(plot_coefficients=False, class_names=None, max_num_subplots=9, font_size=None, **kwargs)
Returns a list of matplotlib figures showing the global and local explanations.
- Parameters
plot_coefficients – Whether to plot linear coefficients.
class_names – A list of the class names indexed by the labels, e.g., class_names = ['dog', 'cat'] means that label 0 corresponds to ‘dog’ and label 1 corresponds to ‘cat’.
max_num_subplots – The maximum number of subplots in the figure.
font_size – The font size of ticks.
- Returns
A list of matplotlib figures plotting linear coefficients and feature importance scores.
- plotly_plot(index=0, class_names=None, **kwargs)
Returns a plotly dash figure showing the linear coefficients and feature importance scores for one specific instance.
- Parameters
index – The index of the instance, which cannot be None, e.g., it will plot the first explanation result when index = 0.
class_names – A list of the class names indexed by the labels, e.g., class_names = ['dog', 'cat'] means that label 0 corresponds to ‘dog’ and label 1 corresponds to ‘cat’.
- Returns
A plotly dash figure plotting linear coefficients and feature importance scores.
- ipython_plot(index=0, class_names=None, **kwargs)
Plots the linear coefficients and feature importance scores in IPython.
- Parameters
index – The index of the instance, which cannot be None, e.g., it will plot the first explanation result when index = 0.
class_names – A list of the class names indexed by the labels, e.g., class_names = ['dog', 'cat'] means that label 0 corresponds to ‘dog’ and label 1 corresponds to ‘cat’.
- classmethod from_dict(d)
omnixai.explanations.tabular.tree module
Explanations for tree-based models.
- class omnixai.explanations.tabular.tree.TreeExplanation
Bases:
ExplanationBase
The class for explanation results for tree-based models. The results are stored in a dict with the following format: {“model”: the trained tree model, “tree”: the binary tree extracted from the model, “feature_names”: A list of feature names, “class_names”: A list of class names, “path”: The decision paths for a batch of instances}.
- add_global(model, feature_names, class_names)
Adds the global explanations, i.e., the whole tree structure.
- Parameters
model – The tree model.
feature_names – The feature column names.
class_names – The class names.
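Example (a minimal sketch of the global part, assuming a scikit-learn decision tree; the dataset and model are purely illustrative):
```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from omnixai.explanations.tabular.tree import TreeExplanation

# Train a small decision tree purely for illustration.
data = load_iris()
model = DecisionTreeClassifier(max_depth=3).fit(data.data, data.target)

explanations = TreeExplanation()
explanations.add_global(
    model=model,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
)
# plot() / plotly_plot() can then render the tree structure and, once local
# explanations have been added via add_local, the decision paths (see below).
```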
- add_local(model, decision_paths, node_indicator, feature_names, class_names)
Adds the local explanations, i.e., decision paths.
- Parameters
model – The tree model.
decision_paths – The decision paths for one instance.
node_indicator – The node indicator.
feature_names – The feature column names.
class_names – The class names.
- get_explanations(index=None)
Gets the generated explanations.
- Parameters
index – The index of the instance, e.g., it will return the first explanation result when index = 0. When it is None, this method returns all the explanations.
- Returns
The explanations for one specific instance or all the explanations for all the instances.
- plot(index=None, figsize=(15, 10), fontsize=10, **kwargs)
Returns a matplotlib figure showing the explanations.
- Parameters
index – The index of an explanation result stored in TreeExplanation, e.g., it will plot the first explanation result when index = 0. When index is None, it plots the explanations of the first 5 instances.
figsize – The figure size.
fontsize – The font size of texts.
- Returns
A list of matplotlib figures plotting the tree and the decision paths.
- plotly_plot(index=0, **kwargs)
Returns a plotly dash figure showing decision paths.
- Parameters
index – The index of the instance, which cannot be None, e.g., it will plot the first explanation result when index = 0.
- Returns
A plotly dash figure plotting decision paths.
- ipython_plot(index=0, figsize=(15, 10), fontsize=10, **kwargs)
Plots decision paths in IPython.
- Parameters
index – The index of the instance, which cannot be None, e.g., it will plot the first explanation result when index = 0.
figsize – The figure size.
fontsize – The font size of texts.
- to_json()
Converts the explanation result into JSON format.
- classmethod from_dict(d)
omnixai.explanations.tabular.correlation module
Feature correlation analysis.
- class omnixai.explanations.tabular.correlation.CorrelationExplanation
Bases:
ExplanationBase
The class for correlation analysis results. The results are stored in a Dict, i.e., {“features”: a list of feature names, “correlation”: the correlation matrix}.
- add(features, correlation)
Adds the feature names and the correlation matrix.
- Parameters
features – The feature names.
correlation – The correlation matrix.
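Example (a minimal sketch with a hypothetical 2x2 correlation matrix):
```python
import numpy as np
from omnixai.explanations.tabular.correlation import CorrelationExplanation

explanations = CorrelationExplanation()
explanations.add(
    features=["age", "income"],
    correlation=np.array([[1.0, 0.4], [0.4, 1.0]]),   # hypothetical correlations
)

# {"features": ["age", "income"], "correlation": <2x2 matrix>}
print(explanations.get_explanations())
```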
- get_explanations()
Gets the correlation matrix.
- Returns
A dict containing the feature names and the correlation matrix, i.e., {“features”: a list of feature names, “correlation”: the correlation matrix}.
- plot(**kwargs)
Plots the correlation matrix.
- Returns
A matplotlib figure plotting the correlation matrix.
- plotly_plot(**kwargs)
Plots the correlation matrix using Dash.
- Returns
A plotly dash figure plotting the correlation matrix.
- ipython_plot(**kwargs)
Plots the correlation matrix in IPython.
- classmethod from_dict(d)
omnixai.explanations.tabular.imbalance module
Feature imbalance plots.
- class omnixai.explanations.tabular.imbalance.ImbalanceExplanation
Bases:
ExplanationBase
The class for feature imbalance plots. It uses a list to store the feature values and their counts (numbers of appearances) in each class. Each item in the list is a dict, i.e., {“feature”: feature value, “count”: {class label 1: count 1, class label 2: count 2, …}}. If there are no class labels, the dict will be {“feature”: feature value, “count”: count}.
- add(feature, count)
Adds the count for a cross-feature.
- Parameters
feature – A cross-feature (a list of feature values).
count – The number of appearances.
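Example (a minimal sketch with hypothetical cross-features and per-class counts for labels 0 and 1):
```python
from omnixai.explanations.tabular.imbalance import ImbalanceExplanation

explanations = ImbalanceExplanation()
explanations.add(feature=["female", "30-40"], count={0: 110, 1: 40})
explanations.add(feature=["male", "30-40"], count={0: 120, 1: 35})

# The stored counts for all the cross-features, in the format described above.
print(explanations.get_explanations())
```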
- get_explanations()
Gets the imbalance analysis results.
- Returns
A list storing the counts for all the cross-features. Each item is a dict, i.e., {“feature”: feature value, “count”: {class label 1: count 1, class label 2: count 2, …}}. If there are no class labels, the dict will be {“feature”: feature value, “count”: count}.
- plot(**kwargs)
Shows the imbalance plot.
- Returns
A matplotlib figure plotting the feature counts.
- plotly_plot(**kwargs)
Shows the imbalance plot.
- Returns
A plotly dash figure plotting the feature counts.
- ipython_plot(**kwargs)
Shows the imbalance plot in IPython.
- classmethod from_dict(d)
omnixai.explanations.tabular.validity module
Validity ranking explanations.
- class omnixai.explanations.tabular.validity.ValidityRankingExplanation(explanations=None)
Bases:
ExplanationBase
The class for validity ranking explanation results.
- add(query, df, top_features, validity, **kwargs)
Adds the generated explanation corresponding to one instance.
- Parameters
query – The instance to explain.
df – The dataframe of input query item features.
top_features – The features that explain the ranking.
validity – The validity metrics for the top features.
kwargs – Additional information to store.
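Example (a minimal sketch; the query, dataframe, top features, and validity metrics are normally produced by the ranking explainer, so all values below are hypothetical):
```python
import pandas as pd
from omnixai.explanations.tabular.validity import ValidityRankingExplanation

# Hypothetical ranked items with two features each.
items = pd.DataFrame({"price": [9.9, 19.9, 4.5], "rating": [4.2, 4.8, 3.1]})

explanations = ValidityRankingExplanation()
explanations.add(
    query=items,                 # the query (the list of items) being ranked
    df=items,                    # the dataframe of item features
    top_features=["rating"],     # features that explain the ranking
    validity={"rating": 1.0},    # hypothetical validity metric per feature
)
print(explanations.get_explanations(index=0))
```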
- get_explanations(index=None)
Gets the generated explanations.
- Parameters
index – The index of an explanation result stored in ValidityRankingExplanation. When it is None, it returns a list of all the explanations.
- Returns
The explanation for one specific instance (a dict) or all the explanations for all the instances (a list). Each dict has the following format: {“query”: the original input instance, “item”: The dataframe of input query item features, “top_features”: The top features that explain the ranking, “validity”: The validity metrics for the top features.}.
- Return type
Union[Dict, List]
- plot(index=0, font_size=8, **kwargs)
Returns a matplotlib figure showing the explanations.
- Parameters
index – The index of an explanation result stored in ValidityRankingExplanation.
font_size – The font size of table entries.
- Returns
A matplotlib figure plotting the most important features followed by the remaining features.
- plotly_plot(index=0, **kwargs)
Plots the document features and explainable features in Dash.
- Parameters
index – The index of an explanation result stored in ValidityRankingExplanation.
- Returns
A plotly dash figure showing the important features followed by the remaining features.
- ipython_plot(index=0, **kwargs)
Plots a table for ipython showing the important features followed by the remaining features.
- static rearrange_columns(df, top_features)
- classmethod from_dict(d)