omnixai.explainers.tabular.agnostic package

`lime`	The LIME explainer for tabular data.
`shap`	The SHAP explainer for tabular data.
`pdp`	The partial dependence plots for tabular data.
`ale`	The accumulated local effects plots for tabular data.
`sensitivity`	Morris sensitivity analysis for tabular data.
`L2X.l2x`	The L2X explainer for tabular data.
`permutation`	The permutation feature importance explanation for tabular data.
`bias`	The model bias analyzer for tabular data.
`gpt`	The explainer based on ChatGPT.

omnixai.explainers.tabular.agnostic.lime module

The LIME explainer for tabular data.

class omnixai.explainers.tabular.agnostic.lime.LimeTabular(training_data, predict_function, mode='classification', ignored_features=None, **kwargs)

Bases: TabularExplainer

The LIME explainer for tabular data. If using this explainer, please cite the original work: https://github.com/marcotcr/lime.

Parameters

training_data (Tabular) – The data used to train local explainers in LIME. training_data can be the training dataset for training the machine learning model. If the training dataset is large, training_data can be its subset by applying omnixai.sampler.tabular.Sampler.subsample.
predict_function (Callable) – The prediction function corresponding to the model to explain. When the model is for classification, the outputs of the predict_function are the class probabilities. When the model is for regression, the outputs of the predict_function are the estimated values.
mode (str) – The task type, e.g., classification or regression.
ignored_features (Optional[List]) – The features ignored in computing feature importance scores.
kwargs – Additional parameters to initialize lime_tabular.LimeTabularExplainer, e.g., kernel_width and discretizer. Please refer to the doc of lime_tabular.LimeTabularExplainer.

explanation_type = 'local'

alias = ['lime']

explain(X, y=None, **kwargs)

Generates the feature-importance explanations for the input instances.

Parameters

X – A batch of input instances. When X is pd.DataFrame or np.ndarray, X will be converted into Tabular automatically.
y – A batch of labels to explain. For regression, y is ignored. For classification, the top predicted label of each instance will be explained when y = None.
kwargs – Additional parameters used in LimeTabularExplainer.explain_instance, e.g., num_features. Please refer to the doc of LimeTabularExplainer.explain_instance.

Return type

FeatureImportance

Returns

The feature-importance explanations for all the input instances.
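The handling of y described above can be sketched in a few lines. This is only an illustration of the stated behaviour (explain the given labels, or the top predicted class when y is None); resolve_labels is a hypothetical helper and not part of omnixai, and the probabilities are made up.

```python
from typing import List, Optional, Sequence


def resolve_labels(probs: Sequence[Sequence[float]],
                   y: Optional[Sequence[int]] = None) -> List[int]:
    """Return the class index to explain for each instance (hypothetical helper)."""
    if y is not None:
        if len(y) != len(probs):
            raise ValueError("y must contain one label per instance")
        return [int(label) for label in y]
    # y is None: fall back to the top predicted class of each instance.
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Class probabilities as a classification predict_function would return them.
probs = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
top_labels = resolve_labels(probs)             # top predicted class per row
fixed_labels = resolve_labels(probs, y=[2, 2])  # caller-chosen labels
```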

save(directory, filename=None, **kwargs)

Saves the initialized explainer.

Parameters

directory (str) – The folder for the dumped explainer.
filename (Optional[str]) – The filename (the explainer class name if it is None).

omnixai.explainers.tabular.agnostic.shap module

The SHAP explainer for tabular data.

class omnixai.explainers.tabular.agnostic.shap.ShapTabular(training_data, predict_function, mode='classification', ignored_features=None, **kwargs)

Bases: TabularExplainer

The SHAP explainer for tabular data. If using this explainer, please cite the original work: https://github.com/slundberg/shap.

Parameters

training_data (Tabular) – The data used to initialize a SHAP explainer. training_data can be the training dataset for training the machine learning model. If the training dataset is large, please set parameter nsamples, e.g., nsamples = 100.
predict_function (Callable) – The prediction function corresponding to the model to explain. When the model is for classification, the outputs of the predict_function are the class probabilities. When the model is for regression, the outputs of the predict_function are the estimated values.
mode (str) – The task type, e.g., classification or regression.
ignored_features (Optional[List]) – The features ignored in computing feature importance scores.
kwargs – Additional parameters to initialize shap.KernelExplainer, e.g., nsamples. Please refer to the doc of shap.KernelExplainer.

explanation_type = 'local'

alias = ['shap']

explain(X, y=None, **kwargs)

Generates the local SHAP explanations for the input instances.

Parameters

X – A batch of input instances. When X is pd.DataFrame or np.ndarray, X will be converted into Tabular automatically.
y – A batch of labels to explain. For regression, y is ignored. For classification, the top predicted label of each instance will be explained when y = None.
kwargs – Additional parameters for shap.KernelExplainer.shap_values, e.g., nsamples – the number of times to re-evaluate the model when explaining each prediction.

Return type

FeatureImportance

Returns

The feature importance explanations.
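For intuition about what the returned scores represent: shap.KernelExplainer estimates Shapley values by sampling, and for a handful of features the same quantity can be computed exactly by enumerating feature subsets. The sketch below does that in plain Python. It assumes a single baseline vector instead of a background dataset, and exact_shapley and the toy model are illustrative names, not part of omnixai or shap.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def exact_shapley(f: Callable[[Sequence[float]], float],
                  x: Sequence[float],
                  baseline: Sequence[float]) -> List[float]:
    """Exact Shapley values of f at x relative to a single baseline point."""
    n = len(x)

    def value(subset) -> float:
        # Features in `subset` come from x, all others from the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def model(z):
    # Toy regression model with one additive term and one interaction.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(model, x, baseline)
```

The values sum to model(x) - model(baseline), which is the additivity property that SHAP explanations rely on. Enumeration costs 2^n model calls per feature, which is why a sampling parameter such as nsamples matters in practice.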

save(directory, filename=None, **kwargs)

Saves the initialized explainer.

Parameters

directory (str) – The folder for the dumped explainer.
filename (Optional[str]) – The filename (the explainer class name if it is None).

omnixai.explainers.tabular.agnostic.pdp module

The partial dependence plots for tabular data.

class omnixai.explainers.tabular.agnostic.pdp.PartialDependenceTabular(training_data, predict_function, mode='classification', **kwargs)

Bases: TabularExplainer

The partial dependence plots for tabular data. For more information, please refer to https://scikit-learn.org/stable/modules/partial_dependence.html.

Parameters

training_data (Tabular) – The data used to initialize a PDP explainer. training_data can be the training dataset for training the machine learning model. If the training dataset is large, training_data can be its subset by applying omnixai.sampler.tabular.Sampler.subsample.
predict_function – The prediction function corresponding to the model to explain. When the model is for classification, the outputs of the predict_function are the class probabilities. When the model is for regression, the outputs of the predict_function are the estimated values.
mode – The task type, e.g., classification or regression.
kwargs – Additional parameters, e.g., grid_resolution – the number of candidate values for each feature when generating partial dependence plots.

explanation_type = 'global'

alias = ['pdp', 'partial_dependence']

explain(features=None, monte_carlo=False, monte_carlo_steps=10, monte_carlo_frac=0.1, **kwargs)

Generates global PDP explanations.

Parameters

features (Optional[List]) – The names of the features to be studied.
monte_carlo (bool) – Whether to compute PDPs for Monte Carlo samples.
monte_carlo_steps (int) – The number of Monte Carlo sampling steps.
monte_carlo_frac (float) – The fraction of samples randomly selected in each Monte Carlo step.

Return type

PDPExplanation

Returns

The generated PDP explanations.
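As a rough illustration of what a partial dependence curve is, the sketch below computes one for a single numeric feature: each grid value is written into every row and the predictions are averaged. It is a minimal stand-alone version of the general technique, not the omnixai implementation; partial_dependence, the toy predict function and the grid are assumptions made for the example.

```python
from typing import Callable, List, Sequence


def partial_dependence(predict: Callable[[List[List[float]]], List[float]],
                       data: Sequence[Sequence[float]],
                       feature: int,
                       grid: Sequence[float]) -> List[float]:
    """Average prediction when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        # Copy the data and overwrite the studied feature in every row.
        modified = [list(row) for row in data]
        for row in modified:
            row[feature] = value
        preds = predict(modified)
        curve.append(sum(preds) / len(preds))
    return curve


def predict(rows):
    # Toy regression model standing in for predict_function.
    return [2.0 * r[0] + r[1] for r in rows]


data = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
grid = [0.0, 1.0, 2.0]  # candidate values, cf. grid_resolution
pdp_curve = partial_dependence(predict, data, feature=0, grid=grid)
```

For this linear model the curve is 2 * value plus the mean of the second feature, so it rises by 2 per unit step along the grid.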

omnixai.explainers.tabular.agnostic.ale module

The accumulated local effects plots for tabular data.

class omnixai.explainers.tabular.agnostic.ale.ALE(training_data, predict_function, mode='classification', **kwargs)

Bases: TabularExplainer

The accumulated local effects (ALE) plots for tabular data. For more information, please refer to https://christophm.github.io/interpretable-ml-book/ale.html.

Parameters

training_data (Tabular) – The data used to initialize the explainer. training_data can be the training dataset for training the machine learning model. If the training dataset is large, training_data can be its subset by applying omnixai.sampler.tabular.Sampler.subsample.
predict_function – The prediction function corresponding to the model to explain. When the model is for classification, the outputs of the predict_function are the class probabilities. When the model is for regression, the outputs of the predict_function are the estimated values.
mode – The task type, e.g., classification or regression.
kwargs – Additional parameters, e.g., grid_resolution – the number of candidates for each feature.

explanation_type = 'global'

alias = ['ale', 'accumulated_local_effects']

static cmds(mat, k=1): Classical multidimensional scaling. Please refer to: https://en.wikipedia.org/wiki/Multidimensional_scaling#Classical_multidimensional_scaling

explain(features=None, monte_carlo=True, monte_carlo_steps=10, monte_carlo_frac=0.1, **kwargs)

Generates accumulated local effects (ALE) plots.

Parameters

features (Optional[List]) – The names of the features to be studied.
monte_carlo (bool) – Whether to compute ALE plots for Monte Carlo samples.
monte_carlo_steps (int) – The number of Monte Carlo sampling steps.
monte_carlo_frac (float) – The fraction of samples randomly selected in each Monte Carlo step.
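One plausible reading of the two Monte Carlo parameters is sketched below: in each of monte_carlo_steps steps, a random subset containing a monte_carlo_frac share of the rows is drawn, and a curve is computed on each subset. The float default of 0.1 suggests a fraction, but sampling without replacement and rounding down are assumptions of this sketch; monte_carlo_subsets is a hypothetical helper.

```python
import random
from typing import List


def monte_carlo_subsets(n_rows: int, steps: int = 10, frac: float = 0.1,
                        seed: int = 0) -> List[List[int]]:
    """Draw `steps` random row-index subsets, each of size about n_rows * frac."""
    if not 0.0 < frac <= 1.0:
        raise ValueError("frac must be in (0, 1]")
    rng = random.Random(seed)
    size = max(1, int(n_rows * frac))  # at least one row per step
    # Sample without replacement inside each step (assumption).
    return [sorted(rng.sample(range(n_rows), size)) for _ in range(steps)]


subsets = monte_carlo_subsets(n_rows=200, steps=10, frac=0.1)
```

The spread of the per-subset curves then gives a rough picture of how stable the global curve is.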

Return type

ALEExplanation

Returns

The generated ALE explanations.
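The idea behind ALE can be sketched for one numeric feature as follows: within each bin, the feature is moved from the lower to the upper bin edge for the rows that fall in that bin, the prediction differences are averaged, and the averages are accumulated. The sketch returns uncentered values (implementations usually subtract a mean afterwards) and is not the omnixai code; the function names, bin edges and toy model are assumptions.

```python
from typing import Callable, List, Sequence


def _with_value(row: Sequence[float], feature: int, value: float) -> List[float]:
    """Copy a row and overwrite one feature."""
    out = list(row)
    out[feature] = value
    return out


def accumulated_local_effects(predict: Callable[[List[List[float]]], List[float]],
                              data: Sequence[Sequence[float]],
                              feature: int,
                              edges: Sequence[float]) -> List[float]:
    """Uncentered first-order ALE values at each bin edge."""
    ale = [0.0]
    for lower, upper in zip(edges[:-1], edges[1:]):
        # Rows in (lower, upper]; the first bin also includes its lower edge.
        in_bin = [row for row in data
                  if lower < row[feature] <= upper
                  or (lower == edges[0] and row[feature] == lower)]
        if in_bin:
            hi = predict([_with_value(r, feature, upper) for r in in_bin])
            lo = predict([_with_value(r, feature, lower) for r in in_bin])
            effect = sum(h - l for h, l in zip(hi, lo)) / len(in_bin)
        else:
            effect = 0.0  # empty bin contributes nothing
        ale.append(ale[-1] + effect)
    return ale


def predict(rows):
    # Toy regression model standing in for predict_function.
    return [2.0 * r[0] + r[1] for r in rows]


data = [[0.5, 1.0], [1.5, -2.0], [2.5, 4.0], [0.0, 0.0]]
edges = [0.0, 1.0, 2.0, 3.0]
ale_values = accumulated_local_effects(predict, data, feature=0, edges=edges)
```

Because only rows that actually lie in a bin are perturbed, and only across that bin, ALE avoids evaluating the model on unrealistic feature combinations that a partial dependence plot may create when features are correlated.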

omnixai.explainers.tabular.agnostic.sensitivity module

Morris sensitivity analysis for tabular data.

class omnixai.explainers.tabular.agnostic.sensitivity.SensitivityAnalysisTabular(training_data, predict_function, **kwargs)

Bases: TabularExplainer

Morris sensitivity analysis for tabular data based on SALib. If using this explainer, please cite the package: https://github.com/SALib/SALib. This explainer only supports continuous-valued features.

Parameters

training_data (Tabular) – The data used to initialize the explainer. training_data can be the training dataset for training the machine learning model. If the training dataset is large, training_data can be its subset by applying omnixai.sampler.tabular.Sampler.subsample.
predict_function (Callable) – The prediction function corresponding to the model to explain. The outputs of the predict_function should be a batch of estimated values, i.e., class probabilities are not supported.

explanation_type = 'global'

alias = ['sa', 'sensitivity']

explain(**kwargs)

Generates sensitivity analysis explanations.

Parameters: kwargs – Additional parameters, e.g., nsamples – the number of samples in Morris sampling.
Return type: SensitivityExplanation
Returns: The generated global explanations.
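The quantity behind Morris screening is the elementary effect: the change in the model output when one input is moved by a step delta, divided by delta. The sketch below averages absolute elementary effects (often written mu*) over random base points. It uses a simple one-at-a-time design on the unit hypercube and is not SALib's trajectory sampler; morris_mu_star, the step size and the toy model are assumptions.

```python
import random
from typing import Callable, List, Sequence


def morris_mu_star(f: Callable[[Sequence[float]], float],
                   n_features: int,
                   n_samples: int = 20,
                   delta: float = 0.25,
                   seed: int = 0) -> List[float]:
    """Mean absolute elementary effect per feature on the unit hypercube."""
    rng = random.Random(seed)
    sums = [0.0] * n_features
    for _ in range(n_samples):
        # Base point chosen so that x_i + delta stays inside [0, 1].
        base = [rng.uniform(0.0, 1.0 - delta) for _ in range(n_features)]
        f_base = f(base)
        for i in range(n_features):
            moved = list(base)
            moved[i] += delta  # perturb one feature at a time
            sums[i] += abs((f(moved) - f_base) / delta)
    return [s / n_samples for s in sums]


def model(x):
    # Toy model returning an estimated value; the third input has no effect.
    return 3.0 * x[0] + 0.5 * x[1] + 0.0 * x[2]


mu_star = morris_mu_star(model, n_features=3)
```

For a linear model the elementary effects equal the coefficients, so the ranking here is feature 0, then feature 1, with feature 2 at zero.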

omnixai.explainers.tabular.agnostic.L2X.l2x module

The L2X explainer for tabular data.

class omnixai.explainers.tabular.agnostic.L2X.l2x.DefaultSelectionModel(explainer, **kwargs)

Bases: _DefaultModelBase

The default selection model in L2X for tabular data. It is a simple feedforward neural network with three linear layers. The categorical features are mapped to embeddings.

Parameters

explainer – An L2XTabular explainer.
kwargs – Additional parameters, e.g., hidden_size – the hidden layer size.

forward(inputs)

Parameters: inputs – The model inputs.

training: bool

class omnixai.explainers.tabular.agnostic.L2X.l2x.DefaultPredictionModel(explainer, **kwargs)

Bases: _DefaultModelBase

The default prediction model in L2X for tabular data. It is a simple feedforward neural network with three linear layers. The categorical features are mapped to embeddings.

Parameters

explainer – An L2XTabular explainer.
kwargs – Additional parameters, e.g., hidden_size – the hidden layer size.

forward(inputs, weights)

Parameters

inputs – The model inputs.
weights – The weights generated via Gumbel-Softmax sampling.

training: bool

class omnixai.explainers.tabular.agnostic.L2X.l2x.L2XTabular(training_data, predict_function, mode='classification', tau=0.5, k=8, selection_model=None, prediction_model=None, loss_function=None, optimizer=None, learning_rate=0.001, batch_size=None, num_epochs=10, **kwargs)

Bases: TabularExplainer

The L2X explainer for tabular data. If using this explainer, please cite the original work: Learning to Explain: An Information-Theoretic Perspective on Model Interpretation, Jianbo Chen, Le Song, Martin J. Wainwright, Michael I. Jordan, https://arxiv.org/abs/1802.07814.

Parameters

training_data (Tabular) – The data used to train the explainer. training_data should be the training dataset for training the machine learning model.
predict_function (Callable) – The prediction function corresponding to the model to explain. When the model is for classification, the outputs of the predict_function are the class probabilities. When the model is for regression, the outputs of the predict_function are the estimated values.
mode (str) – The task type, e.g., classification or regression.
tau (float) – Parameter tau in Gumbel-Softmax.
k (int) – The maximum number of the selected features in L2X.
selection_model – A pytorch model class for estimating P(S|X) in L2X. If selection_model = None, a default model DefaultSelectionModel will be used.
prediction_model – A pytorch model class for estimating Q(X_S) in L2X. If prediction_model = None, a default model DefaultPredictionModel will be used.
loss_function (Optional[Callable]) – The loss function for the task, e.g., nn.CrossEntropyLoss() for classification.
optimizer – The optimizer class for training the L2X explainer, e.g., torch.optim.Adam.
learning_rate (float) – The learning rate for training the L2X explainer.
batch_size (Optional[int]) – The batch size for training the L2X explainer. If batch_size is None, batch_size will be picked from [32, 64, 128, 256] based on the sample size.
num_epochs (int) – The number of epochs for training the L2X explainer.
kwargs – Additional parameters, e.g., parameters for selection_model and prediction_model.
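To illustrate how tau and k interact, the following sketch follows the relaxation described in the L2X paper: k Gumbel-Softmax samples are drawn from the selection logits with temperature tau, and their element-wise maximum is used as a soft feature mask. It is written in plain Python for readability rather than PyTorch, and whether omnixai computes the weights in exactly this way is not stated here; the function names and logits are assumptions.

```python
import math
import random
from typing import List, Sequence


def gumbel_softmax(logits: Sequence[float], tau: float,
                   rng: random.Random) -> List[float]:
    """One relaxed one-hot sample from the Gumbel-Softmax distribution."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)          # avoid log(0)
        gumbel = -math.log(-math.log(u))      # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)  # lower tau -> closer to one-hot
    peak = max(noisy)                         # stabilise the softmax
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def sample_feature_weights(logits: Sequence[float], tau: float = 0.5,
                           k: int = 8, seed: int = 0) -> List[float]:
    """Element-wise maximum over k Gumbel-Softmax samples (soft top-k mask)."""
    rng = random.Random(seed)
    samples = [gumbel_softmax(logits, tau, rng) for _ in range(k)]
    return [max(column) for column in zip(*samples)]


logits = [2.0, 0.0, -1.0, 0.5]  # hypothetical selection-model outputs
weights = sample_feature_weights(logits, tau=0.5, k=2)
```

Each weight lies in [0, 1] and the weights sum to at most k, which is why k acts as an upper bound on the number of selected features.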

explanation_type = 'local'

alias = ['l2x', 'L2X']

explain(X, **kwargs)

Generates the explanations corresponding to the input instances. For classification, it explains the top predicted label for each input instance.

Parameters

X – A batch of input instances. When X is pd.DataFrame or np.ndarray, X will be converted into Tabular automatically.
kwargs – Not used here.

Return type

FeatureImportance

Returns

The feature-importance explanations for all the input instances.

save(directory, filename=None, **kwargs)

Saves the initialized explainer.

Parameters

directory (str) – The folder for the dumped explainer.
filename (Optional[str]) – The filename (the explainer class name if it is None).

omnixai.explainers.tabular.agnostic.permutation module

The permutation feature importance explanation for tabular data.

class omnixai.explainers.tabular.agnostic.permutation.PermutationImportance(training_data, predict_function, mode='classification', **kwargs)

Bases: ExplainerBase, TabularExplainerMixin

The permutation feature importance explanations for tabular data. The permutation feature importance is defined as the decrease in a model score when the values of a single feature are randomly shuffled.

Parameters

training_data (Tabular) – The training dataset for training the machine learning model.
predict_function – The prediction function corresponding to the model to explain. When the model is for classification, the outputs of the predict_function are the class probabilities. When the model is for regression, the outputs of the predict_function are the estimated values.
mode – The task type, e.g., classification or regression.

explanation_type = 'global'

alias = ['permutation']

explain(X, y, n_repeats=30, score_func=None)

Generates permutation feature importance scores.

Parameters

X (Tabular) – Data on which permutation importance will be computed.
y (Union[ndarray, DataFrame]) – Targets or labels.
n_repeats (int) – The number of times a feature is randomly shuffled.
score_func (Optional[Callable]) – The score function measuring the difference between ground-truth targets and predictions, e.g., -sklearn.metrics.log_loss(y_true, y_pred).

Return type

GlobalFeatureImportance

Returns

The permutation feature importance explanations.
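The definition above (score decrease after shuffling one feature) can be written out directly. The sketch below is a minimal plain-Python version of the technique with a higher-is-better score function, in the spirit of the negated loss suggested for score_func. It is not the omnixai implementation; permutation_importance, neg_mse and the toy data are assumptions.

```python
import random
from typing import Callable, List, Sequence


def permutation_importance(predict: Callable[[List[List[float]]], List[float]],
                           X: List[List[float]],
                           y: Sequence[float],
                           score_func: Callable[[Sequence[float], Sequence[float]], float],
                           n_repeats: int = 30,
                           seed: int = 0) -> List[float]:
    """Mean drop in score when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = score_func(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - score_func(y, predict(shuffled)))
        importances.append(sum(drops) / n_repeats)
    return importances


def predict(rows):
    # Toy regression model that ignores the second feature.
    return [4.0 * r[0] for r in rows]


def neg_mse(y_true, y_pred):
    # Higher is better, so the mean squared error is negated.
    return -sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


X = [[float(i), float(i % 3)] for i in range(12)]
y = predict(X)
scores = permutation_importance(predict, X, y, neg_mse, n_repeats=10)
```

Shuffling the ignored feature leaves the score unchanged, so its importance is zero, while shuffling the first feature lowers the score.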

omnixai.explainers.tabular.agnostic.shap_global module

The SHAP explainer for global feature importance.

class omnixai.explainers.tabular.agnostic.shap_global.GlobalShapTabular(training_data, predict_function, mode='classification', ignored_features=None, **kwargs)

Bases: TabularExplainer

The SHAP explainer for global feature importance. If using this explainer, please cite the original work: https://github.com/slundberg/shap.

Parameters

training_data (Tabular) – The data used to initialize a SHAP explainer. training_data can be the training dataset for training the machine learning model. If the training dataset is large, please set parameter nsamples, e.g., nsamples = 100.
predict_function (Callable) – The prediction function corresponding to the model to explain. When the model is for classification, the outputs of the predict_function are the class probabilities. When the model is for regression, the outputs of the predict_function are the estimated values.
mode (str) – The task type, e.g., classification or regression.
ignored_features (Optional[List]) – The features ignored in computing feature importance scores.
kwargs – Additional parameters to initialize shap.KernelExplainer, e.g., nsamples. Please refer to the doc of shap.KernelExplainer.

explanation_type = 'global'

alias = ['shap_global']

explain(X=None, **kwargs)

Generates the global SHAP explanations.

Parameters

X (Optional[Tabular]) – The data used to compute global SHAP values, i.e., the mean of the absolute SHAP values for each feature. If X is None, a set of training samples will be used.
kwargs – Additional parameters for shap.KernelExplainer.shap_values, e.g., nsamples – the number of times to re-evaluate the model when explaining each prediction.

Returns

The global feature importance explanations.
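The aggregation described for X (the mean of the absolute SHAP values per feature) is simple enough to show directly. The matrix below is made-up illustrative data, one row of local SHAP values per instance, and global_importance is a hypothetical helper.

```python
from typing import List, Sequence


def global_importance(local_values: Sequence[Sequence[float]]) -> List[float]:
    """Mean absolute local attribution per feature (column)."""
    n = len(local_values)
    return [sum(abs(v) for v in column) / n for column in zip(*local_values)]


# Rows are instances, columns are features (illustrative numbers only).
local_shap = [
    [0.5, -0.2, 0.0],
    [-1.5, 0.2, 0.0],
    [1.0, -0.2, 0.0],
]
importance = global_importance(local_shap)
```

Taking absolute values first keeps positive and negative contributions from cancelling, so a feature that pushes predictions strongly in both directions still ranks as important.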

save(directory, filename=None, **kwargs)

Saves the initialized explainer.

Parameters

directory (str) – The folder for the dumped explainer.
filename (Optional[str]) – The filename (the explainer class name if it is None).

omnixai.explainers.tabular.agnostic.bias module

The model bias analyzer for tabular data.

class omnixai.explainers.tabular.agnostic.bias.BiasAnalyzer(training_data, predict_function, mode='classification', training_targets=None, **kwargs)

Bases: ExplainerBase

The bias analysis for a classification or regression model.

Parameters

training_data (Tabular) – The data used to initialize the explainer.
predict_function – The prediction function corresponding to the model to explain. When the model is for classification, the outputs of the predict_function are the class probabilities. When the model is for regression, the outputs of the predict_function are the estimated values.
mode – The task type, e.g., classification or regression.
training_targets (Optional[List]) – The training labels/targets. If it is None, the target column in training_data will be used. The values of training_targets can only be integers (e.g., classification labels) or floats (regression targets).

explanation_type = 'global'

alias = ['bias']

explain(feature_column, feature_value_or_threshold, label_value_or_threshold, **kwargs)

Runs bias analysis on the given model and dataset.

Parameters

feature_column – The feature column to analyze.
feature_value_or_threshold – The feature value for a categorical feature, or the feature-value threshold for a continuous-valued feature. It can be either a single value or a list/tuple. When it is a single value: (a) for categorical features, the advantaged group consists of the samples that contain this feature value and the disadvantaged group consists of the other samples; (b) for continuous-valued features, the advantaged group consists of the samples whose values of feature_column are <= feature_value_or_threshold and the disadvantaged group consists of the other samples. When it is a list/tuple: (a) for categorical features, the advantaged group consists of the samples that contain the feature values in the first element of the list and the disadvantaged group consists of the samples that contain the feature values in the second element; (b) for continuous-valued features, if feature_value_or_threshold is [a, b], the advantaged group consists of the samples whose values of feature_column are <= a and the disadvantaged group consists of the samples whose values of feature_column are > b. If feature_value_or_threshold is [a, [b, c]], the disadvantaged group consists of the samples whose values of feature_column are in (b, c].
label_value_or_threshold – The target label for classification or the target threshold for regression. For regression, the task is converted into a binary classification problem when computing bias metrics, i.e., label = 0 if target value <= label_value_or_threshold, and label = 1 if target value > label_value_or_threshold.

Return type

BiasExplanation

Returns

The bias analysis results stored in BiasExplanation.
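The grouping rules for a continuous-valued feature and the binarization of regression targets can be traced with a small sketch. It mirrors the three documented threshold forms (a single value, [a, b] and [a, [b, c]]) but is not the omnixai code; split_groups, binarize_targets and the sample values are assumptions.

```python
from typing import List, Sequence, Tuple


def split_groups(values: Sequence[float], threshold) -> Tuple[List[int], List[int]]:
    """Return row indices of the (advantaged, disadvantaged) groups."""
    if isinstance(threshold, (list, tuple)):
        a, b = threshold
        advantaged = [i for i, v in enumerate(values) if v <= a]
        if isinstance(b, (list, tuple)):
            low, high = b  # [a, [b, c]]: disadvantaged values lie in (b, c]
            disadvantaged = [i for i, v in enumerate(values) if low < v <= high]
        else:              # [a, b]: disadvantaged values are > b
            disadvantaged = [i for i, v in enumerate(values) if v > b]
    else:
        # Single threshold: <= threshold is advantaged, the rest disadvantaged.
        advantaged = [i for i, v in enumerate(values) if v <= threshold]
        disadvantaged = [i for i, v in enumerate(values) if v > threshold]
    return advantaged, disadvantaged


def binarize_targets(targets: Sequence[float], threshold: float) -> List[int]:
    """Regression targets become label 1 above the threshold, else 0."""
    return [1 if t > threshold else 0 for t in targets]


ages = [22, 35, 41, 58, 63]
single = split_groups(ages, 40)              # ([0, 1], [2, 3, 4])
pair = split_groups(ages, [30, 60])          # ([0], [4])
ranged = split_groups(ages, [30, [40, 60]])  # ([0], [2, 3])
labels = binarize_targets([10.0, 25.0, 30.0], 25.0)
```

With the list forms, rows that fall in neither range (for example age 35 with [30, 60]) belong to neither group.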

omnixai.explainers.tabular.agnostic.gpt module

The explainer based on ChatGPT.

class omnixai.explainers.tabular.agnostic.gpt.GPTExplainer(training_data, predict_function, apikey, mode='classification', ignored_features=None, include_counterfactual=True, openai_model='gpt-3.5-turbo', **kwargs)

Bases: ExplainerBase

The explainer based on ChatGPT. The input prompt consists of the feature importance scores and the counterfactual examples (if used). The explanations will be the text generated by ChatGPT.

Parameters

training_data (Tabular) – The data used to initialize a SHAP explainer. training_data can be the training dataset for training the machine learning model.
predict_function (Callable) – The prediction function corresponding to the model to explain. When the model is for classification, the outputs of the predict_function are the class probabilities. When the model is for regression, the outputs of the predict_function are the estimated values.
apikey (str) – The OpenAI API Key.
mode (str) – The task type, e.g., classification or regression.
ignored_features (Optional[List]) – The features ignored in computing feature importance scores.
include_counterfactual (bool) – Whether to include counterfactual explanations in the results.
openai_model (str) – The model type for chat completion.
kwargs – Additional parameters to initialize shap.KernelExplainer, e.g., nsamples. Please refer to the doc of shap.KernelExplainer.

explanation_type = 'local'

alias = ['gpt']

explain(X, **kwargs)

Return type: PlainText