
Direct Reason

Let $T$ be a decision tree and $x$ be an instance. The direct reason for $x$ is the term of the binary representation of the instance that corresponds to the unique root-to-leaf path of $T$ compatible with $x$. Because it is read directly off the tree, it is one of the easiest reasons to compute, but in general it is redundant: some of its literals can be removed without changing the prediction it justifies. More information about the direct reason can be found in the paper On the Explanatory Power of Decision Trees.
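
To make the definition concrete, here is a minimal pure-Python sketch that walks the root-to-leaf path of a small tree and collects one signed literal per visited node. It is not PyXAI's implementation: the `Node` class, the `direct_reason` helper, and the convention that literal `i` means feature $x_i$ is 1 while `-i` means it is 0 are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass
class Node:
    """Hypothetical internal node testing binary feature `feature` (1-indexed)."""
    feature: int
    left: Union["Node", int]   # followed when the feature is 0
    right: Union["Node", int]  # followed when the feature is 1


def direct_reason(tree: Union[Node, int], instance: Tuple[int, ...]):
    """Return (literals on the path compatible with `instance`, leaf label)."""
    literals = []
    node = tree
    while isinstance(node, Node):
        value = instance[node.feature - 1]
        # One signed literal per node on the path: +i if x_i = 1, -i if x_i = 0.
        literals.append(node.feature if value else -node.feature)
        node = node.right if value else node.left
    return tuple(literals), node


# A toy tree over two binary features: it predicts 1 only when x1 = 1 and x2 = 1.
toy_tree = Node(1, left=0, right=Node(2, left=0, right=1))

reason_a, label_a = direct_reason(toy_tree, (1, 1))
reason_b, label_b = direct_reason(toy_tree, (0, 1))
print(reason_a, label_a)
print(reason_b, label_b)
```

In this toy tree the path for $(0,1)$ stops after testing $x_1$, so its direct reason contains a single literal even though the instance has two features.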

The basic methods (initialize, set_instance, to_features, is_reason, …) of the Explainer module used in the next examples are described in the Explainer Principles page.

Example from a Hand-Crafted Tree

For this example, we take the Decision Tree of the Building Models page.

[Figure: DTbuilder]

This figure represents a Decision Tree using $4$ binary features ($x_1$, $x_2$, $x_3$ and $x_4$). The direct reason for the instance $(1,1,1,1)$ is in red and the one for $(0,0,0,0)$ is in blue. Now, we show how to get them with PyXAI. We start by building the decision tree:

from pyxai import Builder, Explaining

node_x4_1 = Builder.DecisionNode(4, left=0, right=1)
node_x4_2 = Builder.DecisionNode(4, left=0, right=1)
node_x4_3 = Builder.DecisionNode(4, left=0, right=1)
node_x4_4 = Builder.DecisionNode(4, left=0, right=1)
node_x4_5 = Builder.DecisionNode(4, left=0, right=1)

node_x3_1 = Builder.DecisionNode(3, left=0, right=node_x4_1)
node_x3_2 = Builder.DecisionNode(3, left=node_x4_2, right=node_x4_3)
node_x3_3 = Builder.DecisionNode(3, left=node_x4_4, right=node_x4_5)

node_x2_1 = Builder.DecisionNode(2, left=0, right=node_x3_1)
node_x2_2 = Builder.DecisionNode(2, left=node_x3_2, right=node_x3_3)

node_x1_1 = Builder.DecisionNode(1, left=node_x2_1, right=node_x2_2)

tree = Builder.DecisionTree(4, node_x1_1, force_features_equal_to_binaries=True)

And we compute the direct reasons for these two instances:

explainer = Explaining.initialize(tree)
explainer.set_instance((1,1,1,1))
direct = explainer.direct_reason()
print("instance: (1,1,1,1)")
print("binary representation:", explainer.binary_representation)
print("target_prediction:", explainer.target_prediction)
print("direct:", direct)
print("to_features:", explainer.to_features(direct))
print("------------------------------------------------")
explainer.set_instance((0,0,0,0))
direct = explainer.direct_reason()
print("instance: (0,0,0,0)")
print("binary representation:", explainer.binary_representation)
print("target_prediction:", explainer.target_prediction)
print("direct:", direct)
print("to_features:", explainer.to_features(direct))
instance: (1,1,1,1)
binary representation: (1, 2, 3, 4)
target_prediction: 1
direct: (1, 2, 3, 4)
to_features: ['f1 >= 0.5', 'f2 >= 0.5', 'f3 >= 0.5', 'f4 >= 0.5']
------------------------------------------------
instance: (0,0,0,0)
binary representation: (-1, -2, -3, -4)
target_prediction: 0
direct: (-1, -2)
to_features: ['f1 < 0.5', 'f2 < 0.5']
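
The result for $(1,1,1,1)$ illustrates why a direct reason can be redundant. The sketch below, which does not use PyXAI, re-encodes the hand-crafted tree as nested tuples and checks by brute force (all 16 instances) whether a term still forces the prediction once some literals are dropped. The names `TREE`, `predict`, `is_sufficient` and `shrink` are hypothetical, and the reading "left branch means the feature is 0" is an assumption inferred from the direct reasons printed above.

```python
from itertools import product

# Nested-tuple encoding (feature, left, right) of the hand-crafted tree above.
# Leaves are class labels. Assumption: left = feature is 0, right = feature is 1.
X4 = (4, 0, 1)
TREE = (1,
        (2, 0, (3, 0, X4)),
        (2, (3, X4, X4), (3, X4, X4)))


def predict(node, instance):
    """Follow the path compatible with `instance` and return the leaf label."""
    while isinstance(node, tuple):
        feature, left, right = node
        node = right if instance[feature - 1] else left
    return node


def is_sufficient(literals, target, n_features=4):
    """True if every instance that agrees with `literals` is classified `target`."""
    for candidate in product((0, 1), repeat=n_features):
        agrees = all(candidate[abs(l) - 1] == (1 if l > 0 else 0) for l in literals)
        if agrees and predict(TREE, candidate) != target:
            return False
    return True


def shrink(literals, target):
    """Greedily drop literals while the remaining term stays sufficient."""
    kept = list(literals)
    for literal in list(literals):
        trial = [l for l in kept if l != literal]
        if is_sufficient(trial, target):
            kept = trial
    return tuple(kept)


direct = (1, 2, 3, 4)  # direct reason printed above for (1,1,1,1)
print(is_sufficient(direct, 1))
print(shrink(direct, 1))
print(shrink((-1, -2), 0))
```

Under these assumptions, the literal for $x_1$ can be dropped from the direct reason for $(1,1,1,1)$, whereas no literal can be dropped from the direct reason for $(0,0,0,0)$. The greedy pass returns a subset-minimal term that depends on the order in which literals are tried; it is not guaranteed to be the shortest one.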

Example from a Real Dataset

For this example, we take the compas dataset. We create a model using the hold-out approach (by default, the test size is set to 30%) and select a well-classified instance.

from pyxai import Learning, Explaining

learner = Learning.Scikitlearn("../../../dataset/compas.csv", problem_type='classification', instances_type='tabular', labels_type='classes')
model = learner.evaluate(splitting_method=Learning.HOLD_OUT, model_type=Learning.DT)
instance, prediction = learner.get_instances(model, n=1, is_correct=True)
--------------   Information   ---------------
Problem type: classification
Instances type: tabular
Labels type: classes

Dataset path: ../../../dataset/compas.csv
nFeatures (nAttributes, with the labels): 11
nInstances (nObservations): 6172
nLabels: 2
---------------   Model creation, fitting and evaluation  ---------------
Splitting method: hold-out
Problem type: classification
Models type: decision-tree
model_parameters: {}
---------   Evaluation Information   ---------
For the evaluation number 0:
Metrics:
   sklearn_confusion_matrix: [[637, 200], [327, 379]]
   precision: 65.45768566493955
   recall: 53.682719546742206
   f1_score: 58.988326848249024
   specificity: 76.10513739545998
   true_positive: 379
   true_negative: 637
   false_positive: 200
   false_negative: 327
   accuracy: 65.84575502268308
Number of Training instances: 4629
Number of Testing instances: 1543

---------------   Explainer   ----------------
For the split number 0:
**Decision Tree Model**
nFeatures: 11
nNodes: 564
nVariables: 45

---------------   Instances   ----------------
Number of instances selected: 1
----------------------------------------------
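
As a sanity check on the evaluation log above, the reported metrics can be recomputed from the confusion matrix alone. The sketch below assumes scikit-learn's layout (rows are true classes, columns are predicted classes, so the matrix reads `[[TN, FP], [FN, TP]]`) and that the log expresses every metric as a percentage; the variable names are chosen for illustration.

```python
# Confusion matrix copied from the log above, assumed to follow scikit-learn's
# layout: rows are true classes, columns are predicted classes.
confusion = [[637, 200], [327, 379]]
(tn, fp), (fn, tp) = confusion

precision = 100 * tp / (tp + fp)            # share of predicted positives that are correct
recall = 100 * tp / (tp + fn)               # share of actual positives that are found
specificity = 100 * tn / (tn + fp)          # share of actual negatives that are found
accuracy = 100 * (tp + tn) / (tp + tn + fp + fn)
f1_score = 2 * precision * recall / (precision + recall)

for name, value in [
    ("precision", precision),
    ("recall", recall),
    ("f1_score", f1_score),
    ("specificity", specificity),
    ("accuracy", accuracy),
]:
    print(f"{name}: {value:.4f}")
```

The four cells sum to 1543, the number of testing instances reported in the log.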

Finally, we compute the direct reason for this instance:

explainer = Explaining.initialize(model, instance)
print("instance:", instance)
print("prediction:", prediction)
print()
direct_reason = explainer.direct_reason()
print("len binary representation:", len(explainer.binary_representation))
print("len direct:", len(direct_reason))
print("is_reason:", explainer.is_reason(direct_reason))
print("to_features:", explainer.to_features(direct_reason))
instance: Number_of_Priors           0
score_factor               0
Age_Above_FourtyFive       1
Age_Below_TwentyFive       0
Origin_African_American    0
Origin_Asian               0
Origin_Hispanic            0
Origin_Native_American     0
Origin_Other               1
Female                     0
Misdemeanor                0
Name: 0, dtype: int64
prediction: 0

len binary representation: 45
len direct: 10
is_reason: True
to_features: ['Number_of_Priors <= 0.5', 'score_factor <= 0.5', 'Age_Above_FourtyFive > 0.5', 'Age_Below_TwentyFive <= 0.5', 'Origin_African_American <= 0.5', 'Origin_Hispanic <= 0.5', 'Origin_Other > 0.5', 'Female <= 0.5', 'Misdemeanor <= 0.5']

Note that this direct reason contains 10 binary variables out of the 45 variables of the binary representation (to_features reports them as 9 feature conditions). This reason explains why the model predicts $0$ for this instance. However, it is probably not the most compact abductive explanation for the instance; we invite you to take a look at the other types of reasons presented on the Decision Tree Explanations page.

See Also