Integrated Gradients on the IMDB dataset (TensorFlow)
This is an example of the Integrated Gradients method applied to text classification with a TensorFlow model. If you use this explainer, please cite the original work: https://github.com/ankurtaly/Integrated-Gradients.
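For context, Integrated Gradients attributes a model's prediction to each input dimension by averaging the gradients of the output along the straight-line path from a baseline to the input and scaling by the input-baseline difference. The snippet below is a minimal, generic sketch of the Riemann-sum estimate, not the OmniXAI implementation; grad_fn, x, and baseline are placeholder names introduced here for illustration.
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    # Riemann-sum estimate of Integrated Gradients.
    # grad_fn(z) should return dF/dz for an array z shaped like x;
    # baseline is the reference input (e.g., all-zero embeddings for text).
    alphas = np.linspace(0.0, 1.0, steps + 1)
    # Average the gradients along the straight path from the baseline to x.
    grads = np.stack([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    avg_grads = grads.mean(axis=0)
    # Scale by (x - baseline); the attributions approximately sum to F(x) - F(baseline).
    return (x - baseline) * avg_grads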
[1]:
import unittest
import numpy as np
import pandas as pd
import tensorflow as tf
import sklearn
from sklearn.datasets import fetch_20newsgroups
from omnixai.data.text import Text
from omnixai.preprocessing.text import Word2Id
from omnixai.explainers.nlp.specific.ig import IntegratedGradientText
We apply a simple CNN model for this text classification task. Note that the model's call method takes two inputs: inputs (the token ids) and masks (the sentence masks). For IntegratedGradientText, the first input of the model must be the token ids.
[2]:
class TextModel(tf.keras.Model):

    def __init__(self, num_embeddings, num_classes, **kwargs):
        super().__init__()
        self.num_embeddings = num_embeddings
        self.embedding_size = kwargs.get("embedding_size", 50)
        hidden_size = kwargs.get("hidden_size", 100)
        kernel_sizes = kwargs.get("kernel_sizes", [3, 4, 5])

        # Token embedding layer (passed to IntegratedGradientText later)
        self.embedding = tf.keras.layers.Embedding(
            num_embeddings,
            self.embedding_size,
            embeddings_initializer=tf.keras.initializers.RandomUniform(minval=-0.1, maxval=0.1),
            name='embedding'
        )
        # 1D convolutions with different kernel sizes, max-pooled over time in `call`
        self.conv_layers = [
            tf.keras.layers.Conv1D(hidden_size, k, activation='relu', padding='same')
            for k in kernel_sizes
        ]
        self.dropout = tf.keras.layers.Dropout(0.2)
        self.output_layer = tf.keras.layers.Dense(num_classes)

    def call(self, inputs, masks, training=False):
        # inputs: token ids; masks: 0/1 sentence masks
        embeddings = self.embedding(inputs)
        x = embeddings * tf.expand_dims(masks, axis=-1)
        x = [tf.reduce_max(layer(x), axis=1) for layer in self.conv_layers]
        x = self.dropout(tf.concat(x, axis=1)) if training \
            else tf.concat(x, axis=1)
        outputs = self.output_layer(x)
        return outputs
We use a Text object to represent a batch of texts/sentences. The package omnixai.preprocessing.text provides transforms related to text data, such as Tfidf and Word2Id.
[3]:
# Load the training and test datasets
train_data = pd.read_csv('/home/ywz/data/imdb/labeledTrainData.tsv', sep='\t')
n = int(0.8 * len(train_data))
x_train = Text(train_data["review"].values[:n])
y_train = train_data["sentiment"].values[:n].astype(int)
x_test = Text(train_data["review"].values[n:])
y_test = train_data["sentiment"].values[n:].astype(int)
class_names = ["negative", "positive"]
# The transform for converting words/tokens to IDs
transform = Word2Id().fit(x_train)
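As a quick, optional check (not shown in the original notebook), the fitted Word2Id transform can be applied to a single sentence and mapped back to tokens; only attributes already used elsewhere in this example (transform and id_to_word) are assumed here.
sample_ids = transform.transform(Text(["a great movie"]))[0]
print(sample_ids)                                      # token ids of the sentence
print([transform.id_to_word[i] for i in sample_ids])   # map the ids back to tokens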
The preprocessing function converts a batch of texts into token ids and sentence masks. The outputs of the preprocessing function must match the inputs of the model.
[4]:
max_length = 256

def preprocess(X: Text):
    samples = transform.transform(X)
    max_len = 0
    for i in range(len(samples)):
        max_len = max(max_len, len(samples[i]))
    max_len = min(max_len, max_length)
    inputs = np.zeros((len(samples), max_len), dtype=int)
    masks = np.zeros((len(samples), max_len), dtype=np.float32)
    for i in range(len(samples)):
        x = samples[i][:max_len]
        inputs[i, :len(x)] = x
        masks[i, :len(x)] = 1
    return inputs, masks
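As an illustrative sanity check (not part of the original notebook), preprocess returns the padded token ids and the corresponding 0/1 masks, both of shape (batch size, sequence length):
ids, msk = preprocess(Text(["such a great show"]))
print(ids.shape, msk.shape)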
We now train the CNN model and evaluate its performance.
[5]:
learning_rate = 1e-3
batch_size = 128
num_epochs = 10

model = TextModel(
    num_embeddings=transform.vocab_size,
    num_classes=len(class_names)
)
inputs, masks = preprocess(x_train)

optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
train_dataset = tf.data.Dataset.from_tensor_slices((inputs, masks, y_train))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(batch_size)

for epoch in range(num_epochs):
    for step, (ids, masks, labels) in enumerate(train_dataset):
        with tf.GradientTape() as tape:
            logits = model(ids, masks, training=True)
            loss = loss_fn(labels, logits)
        grads = tape.gradient(loss, model.trainable_weights)
        optimizer.apply_gradients(zip(grads, model.trainable_weights))
        if step % 200 == 0:
            print(f"Training loss at epoch {epoch}, step {step}: {float(loss)}")
Training loss at epoch 0, step 0: 0.6866752505302429
Training loss at epoch 1, step 0: 0.4109169542789459
Training loss at epoch 2, step 0: 0.21237820386886597
Training loss at epoch 3, step 0: 0.1540527492761612
Training loss at epoch 4, step 0: 0.08126655220985413
Training loss at epoch 5, step 0: 0.02999718114733696
Training loss at epoch 6, step 0: 0.008433952927589417
Training loss at epoch 7, step 0: 0.009998280555009842
Training loss at epoch 8, step 0: 0.0030068857595324516
Training loss at epoch 9, step 0: 0.001554026734083891
[6]:
inputs, masks = preprocess(x_test)
outputs = model(inputs, masks).numpy()
predictions = np.argmax(outputs, axis=1)
print('Test F1 score: {}'.format(
    sklearn.metrics.f1_score(y_test, predictions, average='binary')))
Test F1 score: 0.8560798903465829
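Since the metric printed above is the binary F1 score, plain accuracy can also be reported if desired; this optional line is not part of the original run, so no output is shown.
print('Test accuracy: {}'.format(sklearn.metrics.accuracy_score(y_test, predictions)))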
To initialize IntegratedGradientText, we need to set the following parameters:

- model: The model to explain, whose type is tf.keras.Model or torch.nn.Module.
- embedding_layer: The embedding layer in the model, which can be tf.keras.layers.Layer or torch.nn.Module.
- preprocess_function: The pre-processing function that converts the raw input data into the inputs of model. The first output of preprocess_function should be the token ids.
- mode: The task type, e.g., classification or regression.
- id2token: The mapping from token ids to tokens.
[7]:
explainer = IntegratedGradientText(
    model=model,
    embedding_layer=model.embedding,
    preprocess_function=preprocess,
    id2token=transform.id_to_word
)
x = Text([
    "What a great movie! if you have no taste.",
    "it was a fantastic performance!",
    "best film ever",
    "such a great show!",
    "it was a horrible movie",
    "i've never watched something as bad"
])
explanations = explainer.explain(x)
explanations.ipython_plot(class_names=class_names)
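The same explainer can also be applied to reviews from the test split; the lines below are an illustrative extra step (not part of the original run, so no figure is shown) and reuse only objects defined earlier in this notebook.
test_x = Text(train_data["review"].values[n:n + 2])
test_explanations = explainer.explain(test_x)
test_explanations.ipython_plot(class_names=class_names)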