L2X (learning to explain) for text classification

This example demonstrates the L2X explainer on text classification. Unlike gradient-based methods, L2X trains a separate explanation model. Its advantage is that explanations are fast to generate once the explanation model is trained. Its disadvantage is that the quality of the explanations depends heavily on the trained explanation model, which is affected by several factors, e.g., the network architecture of the explanation model and the training hyperparameters.

For text classification, we implement the default CNN-based explanation model in omnixai.explainers.nlp.agnostic.l2x. One may implement other models by following the same interface (please refer to the docs for more details). If using this explainer, please cite the original work: “Learning to Explain: An Information-Theoretic Perspective on Model Interpretation, Jianbo Chen, Le Song, Martin J. Wainwright, Michael I. Jordan, https://arxiv.org/abs/1802.07814”.
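For orientation, the training objective from the cited paper can be stated compactly. The block below is a sketch in the P(S|X) and Q notation used for the explainer parameters in this tutorial. The symbols P_m (the model being explained, i.e., what predict_function wraps), X_S (the input restricted to the selected words), and k (the number of selected words) are notation assumed here, not names from the OmniXAI code.

```latex
\max_{P(S \mid X),\; Q} \;
\mathbb{E}_{X,\; Y \sim P_m(\cdot \mid X),\; S \sim P(S \mid X)}
\big[ \log Q(Y \mid X_S) \big],
\qquad |S| = k
```

Maximizing this quantity maximizes a variational lower bound (up to a constant) on the mutual information between the selected words X_S and the model output Y, so the selection model learns to pick the words that are most informative about the model's prediction.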

[1]:
import numpy as np
import sklearn.ensemble
import sklearn.metrics
from sklearn.datasets import fetch_20newsgroups

from omnixai.data.text import Text
from omnixai.preprocessing.text import Tfidf
from omnixai.explainers.nlp.agnostic.l2x import L2XText

We use a Text object to represent a batch of texts/sentences. The package omnixai.preprocessing.text provides some transforms related to text data such as Tfidf.

[2]:
# Load the training and test datasets
categories = ['alt.atheism', 'soc.religion.christian']
newsgroups_train = fetch_20newsgroups(subset='train', categories=categories)
newsgroups_test = fetch_20newsgroups(subset='test', categories=categories)

x_train = Text(newsgroups_train.data)
y_train = newsgroups_train.target
x_test = Text(newsgroups_test.data)
y_test = newsgroups_test.target
class_names = ['atheism', 'christian']
# A TF-IDF transform
transform = Tfidf().fit(x_train)

For this classification task, we train a random forest classifier with TF-IDF feature vectors.

[3]:
train_vectors = transform.transform(x_train)
test_vectors = transform.transform(x_test)
model = sklearn.ensemble.RandomForestClassifier(n_estimators=500)
model.fit(train_vectors, y_train)
predict_function = lambda x: model.predict_proba(transform.transform(x))

predictions = model.predict(test_vectors)
# The reported metric is the binary F1 score, not accuracy
print('Test F1 score: {}'.format(
    sklearn.metrics.f1_score(y_test, predictions, average='binary')))
Test F1 score: 0.925233644859813
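Note that the number printed above is computed with sklearn.metrics.f1_score, so it is a binary F1 score rather than an accuracy. The two can differ noticeably when the classes are imbalanced. The following is a minimal pure-Python sketch on made-up labels (the helper binary_accuracy_and_f1 and the toy data are hypothetical, for illustration only):

```python
def binary_accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 for the positive class) for two label lists."""
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts for the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy labels: three positives, five negatives (not the newsgroups data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
accuracy, f1 = binary_accuracy_and_f1(y_true, y_pred)
print('accuracy:', accuracy, 'f1:', f1)
```

On these toy labels the accuracy is 0.625 while the F1 score is 0.4, because F1 ignores the correctly predicted negatives.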

To initialize L2XText, we need to set the following parameters:

  • training_data: The data used to train the explainer. It should be the same dataset that was used to train the machine learning model.

  • predict_function: The prediction function corresponding to the model to explain. When the model is for classification, the outputs of the predict_function are the class probabilities. When the model is for regression, the outputs of the predict_function are the estimated values.

  • mode: The task type, e.g., "classification" or "regression".

  • selection_model: A PyTorch model class for estimating P(S|X) in L2X. If selection_model = None, the default model DefaultSelectionModel is used.

  • prediction_model: A PyTorch model class for estimating Q(X_S) in L2X. If prediction_model = None, the default model DefaultPredictionModel is used.
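To make the role of the selection model P(S|X) more concrete, here is a minimal NumPy sketch of the relaxed subset sampling described in the L2X paper: k Gumbel-softmax draws over the tokens, combined with an elementwise maximum. This is an illustration under stated assumptions, not the OmniXAI implementation. The function name relaxed_k_hot, the toy logits, and the values of k and tau are all hypothetical.

```python
import numpy as np


def relaxed_k_hot(logits, k, tau, rng):
    """Draw one relaxed k-hot selection mask from per-token logits.

    logits: 1-D array of unnormalized selection scores, one per token
            (in L2X these would come from the selection model).
    k:      number of Gumbel-softmax draws (size of the selected subset).
    tau:    temperature; smaller values give masks closer to 0/1.
    """
    logits = np.asarray(logits, dtype=float)
    # Gumbel(0, 1) noise, one row per draw: shape (k, num_tokens).
    uniform = np.clip(rng.random((k, logits.shape[0])), 1e-10, 1.0 - 1e-10)
    gumbel = -np.log(-np.log(uniform))
    # Tempered, numerically stable softmax over tokens for each draw.
    scores = (logits + gumbel) / tau
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    # Elementwise maximum over the k draws yields the relaxed mask.
    return weights.max(axis=0)


rng = np.random.default_rng(0)
token_logits = np.array([2.0, 0.1, -1.0, 1.5, 0.0, -0.5])  # toy scores
mask = relaxed_k_hot(token_logits, k=2, tau=0.5, rng=rng)
top_tokens = np.argsort(mask)[::-1][:2]  # indices of the two largest weights
print(mask, top_tokens)
```

During training, a mask like this multiplies the token embeddings before they are passed to the prediction model Q, which keeps the selection step differentiable. At explanation time, the tokens with the highest selection scores are reported as the important words.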

[4]:
idx = 83
explainer = L2XText(
    training_data=x_train,
    predict_function=predict_function
)
explanations = explainer.explain(x_test[idx:idx+9])
explanations.ipython_plot(class_names=class_names)
 |████████████████████████████████████████| 100.0% Complete, Loss 0.0039
L2X prediction model accuracy: 0.8674698795180723
Instance 0: Class atheism
from subject another request for darwin fish organization university of new mexico albuquerque lines 11 hello there have been some notes recently asking where to obtain the darwin fish this is the same question i have and i have not seen an answer on the net if anyone has a contact please post on the net or email me thanks john or

Instance 1: Class christian
from dan a rose arose subject re national repentance organization my own views here lines 63 mcovingt michael covington writes i heard on the radio today about a christian student conference where christians were called to repent of america national sins such as sexual promiscuity to which i reply there how can i repent of sin i ca and when i claim to repent of someone else sin am i not in fact him jesus equipped us to judge activities but warned us not to judge people judge not that ye be not judged lewis made the same point in an essay after world war ii when some christian leaders in britain were urging national repentance for the horrors sins of world war ii i see your point but i can not more strongly disagree to repent means to turn around we as a nation have behaved incredibly toward god encouraging and even forcing folks to participate in activity directly opposed to the written word of god we have set our nation far above the god who created it and allowed us the luxury of living in this land we have set a bad example for other nations we slaughtered children by the millions we have stricken the name of god from the we god out of the honor due him at every turn and we owe god an apology every bit as public as our sins have been when jesus said judge not that ye be not

Instance 2: Class atheism
from decay subject re about the bible quiz answers organization at t distribution na lines 18 in article healta tammy r healy writes 12 the 2 are on the of the covenant when god said make no image he was refering to idols which were created to be worshipped the of the covenant was and only the high priest could enter the holy of where it was kept once a year on the day of atonement i am not familiar with or knowledgeable about the original language but i believe there is a word for and that the translator would have used the word instead of image had the original said so i think you wrong here but then again i could be too i just suggesting a way to determine whether the interpretation you offer is correct dean

Instance 3: Class atheism
from danb dan e babcock subject re some thoughts organization communications company voice data lines 19 in article bil bill conner writes james felder spbach wrote logic alert argument from incredulity just because it is hard for you to believe this does mean that it is true liars can be very pursuasive just look at koresh that you yourself cite this is whole basis of a great many here rejecting the christian account of things in the words of madalyn face it folks it just silly why is it okay to disbelieve because of your incredulity if you admit that it a fallacy it is and i was aware that this was a reader of so that does support your assertion that the argument is the whole basis of a great many here rejecting dan

Instance 4: Class atheism
from mam mike mcangus subject re americans and evolution organization the cat is on the mat tin version pl9 lines 53 on tue 20 apr 1993 gmt bil bill conner wrote robert singleton bobs wrote sure it is mutually exclusive but it lends weight to increases notional running estimates of the posterior probability of the atheist pitch in the partition and thus necessarily reduces the same quantity in the theist pitch this is because the divine component falls prey to ockham razor the phenomenon being satisfactorily explained without it and there being no independent evidence of any such component more detail in the next post occam razor is not a law of nature it is way of analyzing an argument even so it interesting how often it cited here and to what end it seems odd that religion is simultaneously condemned as being primitive and unscientific and childish and yet again condemned as being too complex razor the scientific explanation of things being much more straightforeward and apparently simpler cute characterization bill however there is no inconsistency between the two statements even if one believes that religion is primitive and unscientific and childish one can still hold the view that religion also adds an unnecessary level of complexity to the explanation the ideas themselves do have to be complex before being by occam razor they only have to add unnecessarily to the overall complexity of the description which is it to be which is the and how do you know i think the part of

Instance 5: Class christian
from sfp sheila patterson subject re mary assumption organization cornell university cit lines 22 in article mpaul marxhausen paul writes feeling that the assumption of mary would be better phrased our text deleted i also do see the of saying the holy parents were how sanctified beyond normal humanity it sounds like our own inability to grasp the of god grace in being incarnated through an human being text deleted paul marxhausen thank you very much paul i have always been impressed by the very of mary that god chose a woman like me to bring into this world the incarnation of himself proves to me that this god is my god he down from his perfection to touch me ah the wonder of it all sheila patterson cit support group cornell university ithaca ny

Instance 6: Class christian
from mark subject re boston c of c organization at t lines 27 aside to the moderator in article rick granberry writes see below i wo quote any of it but there are several errors in the article not things that are just differences of opinion but the writer just plain has his facts confused for example was to come to the lexington church by the leaders there he brought no team he actually had been in il up to that point he had many friends even leaders in telling him not to go because people in the northeast were open and he be wasting his time and talents really this fact was a kind of inside joke at one point after the church in boston took off so well not open indeed i could take it on point by point but i am not in a position to know one way or the other about some things in the article i just wanted to point out that it contains misinformation mark mark opinions not at t sun ok next mail

Instance 7: Class atheism
from marshall kevin marshall subject re death penalty was re political atheists organization virginia tech computer science dept blacksburg va lines 46 bil bill conner writes this is fascinating atheists argue for abortion defend homosexuality as a means of population control insist that the only values are biological and condemn war and capital punishment according to benedikt if something is contardictory it can not exist which in this case means atheists i suppose i would like to understand how an atheist can object to war an excellent means of controlling population growth or to capital punishment i sorry but the logic escapes me first you seem to assume all atheists think alike an atheist does not believe in the existence of a god our opinions on issues such as capital punishment and abortion however vary greatly if you were attacking the views of a particular atheist benedikt i presume then please present your argument as such and do not us all together as for the issues let start with abortion personally i do not support abortion as a means of population control or however i support the right of any woman to have an abortion regardless of what my personal views may be because it would be arrogant of me to tell any individual what may or may not do to body and the domain of should not extend into the that my opinion and i am sure many atheists and theists would disagree with me i do not defend homosexuality as a

Instance 8: Class christian
from jcj jcj subject re why do people become atheists organization huh whuzzat lines 15 in article muirm maxwell c muir writes i think you should give up the in all candor i would be happy to be proven wrong problem is i will have to be wrong do i sound broken to you absolutely not i went through a journey of lukewarm christianity agnosticism atheism agnosticism and now although i know my faith is less than what it should be christianity again i think it a path many of us take jeff johnson jcj