
Tree-Specific Explanations

Let $BT$ be a Boosted Tree composed of regression trees $\{T_1, \ldots, T_n\}$ and let $x$ be an instance.

  • A worst instance extending $t$ given $F$ is an instance $x'$ such that $t \subseteq t_{x'}$ and: \(x' = \mathit{argmin}_{x'': t \subseteq t_{x''}}(\{w(F, x'')\})\)
  • A best instance extending $t$ given $F$ is an instance $x'$ such that $t \subseteq t_{x'}$ and: \(x' = \mathit{argmax}_{x'': t \subseteq t_{x''}}(\{w(F, x'')\})\)

$W(t, F)$ (resp. $B(t, F)$) denotes the set of worst (resp. best) instances covered by $t$ given $F$, and $w_\downarrow(t, F)$ (resp. $w_\uparrow(t, F)$) denotes the weight of the worst (resp. best) instance covered by $t$ given $F$. For example, for the sufficient reason $t$ = ($A_1 > 2$, $A_4 = 1$) for the instance $x$ = ($A_1=4$, $A_2 = 3$, $A_3 = 1$, $A_4 = 1$) of the sufficient reason page, a worst instance can be $x'$ = ($A_1=4$, $A_2 = 1$, $A_3 = 1$, $A_4 = 1$) with $w(F, x') = 0.3$. Identifying a worst (resp. best) instance for a given boosted tree is intractable. Interestingly, when the classifier consists of a single regression tree $T$, identifying an element of $W(t, T)$ (resp. $B(t, T)$) is easy.
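The last observation can be made concrete: for a single regression tree, the worst (or best) weight over the instances covered by $t$ is obtained in one traversal, following the forced branch when $t$ fixes the tested condition and taking the minimum (or maximum) over both children otherwise. Here is a minimal sketch in plain Python; the tree encoding and function names are ours, not PyXAI's API:

```python
# Conceptual sketch (not the PyXAI implementation) of computing the weight of
# a worst/best instance covered by a term t on a single regression tree.
# A tree is a nested dict; a term is a set of conditions that fix some branches.

def extreme_weight(node, term, pick):
    """Traverse the tree; follow the branch forced by `term` when the tested
    condition is fixed, otherwise take the pick (min or max) of both sides."""
    if not isinstance(node, dict):          # leaf: its regression weight
        return node
    cond = node["condition"]                # e.g. ("A4", 1) meaning A4 == 1
    if cond in term:                        # term forces the 'then' branch
        return extreme_weight(node["then"], term, pick)
    if ("not", cond) in term:               # term forces the 'else' branch
        return extreme_weight(node["else"], term, pick)
    # free condition: the extending instance may take either side
    return pick(extreme_weight(node["then"], term, pick),
                extreme_weight(node["else"], term, pick))

def w_down(tree, term):   # weight of a worst instance covered by term
    return extreme_weight(tree, term, min)

def w_up(tree, term):     # weight of a best instance covered by term
    return extreme_weight(tree, term, max)

# Toy tree: if A4 == 1 then (if A1 > 2 then 0.3 else -0.2) else -0.5
tree = {"condition": ("A4", 1),
        "then": {"condition": ("A1>2", 1), "then": 0.3, "else": -0.2},
        "else": -0.5}
print(w_down(tree, {("A4", 1)}))   # A1 left free: worst weight is -0.2
print(w_up(tree, {("A4", 1)}))     # best weight is 0.3
```

On the toy tree, fixing only $A_4 = 1$ leaves $A_1$ free, so a worst extending instance pushes the weight down to $-0.2$ while a best one reaches $0.3$; each call visits every node at most once, hence the linear running time.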

We are now ready to define the notion of tree-specific explanation $t$ for an instance $x$ given a boosted tree $BT$. We start with the binary case, i.e., when $BT$ consists of a single forest $F$:

  • If $BT(x) = 1$, then $t$ is a tree-specific explanation for $x$ given $F$ if and only if $t$ is a subset of $t_{x}$ such that $\sum_{k=1}^p w_\downarrow(t, T_k) > 0$ and no proper subset of $t$ satisfies the latter condition.
  • If $BT(x) = 0$, then $t$ is a tree-specific explanation for $x$ given $F$ if and only if $t$ is a subset of $t_{x}$ such that $\sum_{k=1}^p w_\uparrow(t, T_k) \leq 0$ and no proper subset of $t$ satisfies the latter condition.
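Numerically, the first condition above reduces to a sign test on the sum of the per-tree worst-case weights. A minimal illustration (the function name is ours, not part of PyXAI):

```python
# Hedged sketch of the binary test above: when BT(x) = 1, a candidate term t
# must keep the summed worst-case weights of the regression trees positive.
def implies_class_1(w_down_per_tree):
    return sum(w_down_per_tree) > 0

# Per-tree worst-case weights taken from the hand-crafted example further down:
print(implies_class_1([-0.3, 0.3, -0.4]))  # sum is -0.4: t does not qualify
print(implies_class_1([-0.3, 0.5, 0.1]))   # sum is 0.3: t' qualifies
```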

More generally, in the multi-class setting, TS-explanations can be defined as follows:

Let $BT = \{F^1, \ldots, F^m\}$ be a boosted tree over $\{A_1, \ldots, A_n\}$ where each $F^j$ ($j \in [m]$) contains $p_j$ trees, and let $x$ be such that $BT(x) = i$. Conceptually, $t$ is a tree-specific explanation for $x$ given $BT$ if and only if $t$ is a subset of $t_{x}$ such that for every $j \in [m] \setminus \{i\}$, we have $\sum_{k=1}^{p_i} w_\downarrow(t, T_k^i) > \sum_{k=1}^{p_j} w_\uparrow(t, T_k^j)$, and no proper subset of $t$ satisfies the latter condition.
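As a sanity check of the inequality above, here is a tiny sketch (an illustrative function of ours, not part of PyXAI) that tests whether class $i$ is guaranteed for a term $t$, given per-tree worst-case and best-case weights for each class:

```python
# Hedged sketch of the multi-class condition: class i is guaranteed for term t
# when its summed worst-case weight exceeds the summed best-case weight of
# every other class.
def ts_condition(w_down_by_class, w_up_by_class, i):
    lower_i = sum(w_down_by_class[i])              # guaranteed score of class i
    return all(lower_i > sum(w_up_by_class[j])     # beats every rival's best case
               for j in range(len(w_up_by_class)) if j != i)

# Toy numbers: class 0 is guaranteed at least 0.4 while class 1 reaches at most 0.1
print(ts_condition([[0.3, 0.1], [-0.2, 0.0]], [[0.5, 0.2], [0.1, 0.0]], 0))
```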

In general, the notions of tree-specific explanation and of sufficient reason do not coincide. Indeed, a sufficient reason is a prime implicant (covering $x$) of the Boosted Tree $BT$, while a tree-specific explanation is an implicant $t$ (covering $x$). Since there is a simple, linear-time algorithm for computing $w_\downarrow(t, F)$ and $w_\uparrow(t, F)$ and for deriving a worst-case and a best-case instance, the tree-specific explanations for $x$ are much easier to compute than the sufficient reasons and remain abductive.

More information about tree-specific explanations can be found in the paper Computing Abductive Explanations for Boosted Trees.

<ExplainerBT Object>.tree_specific_reason(*, n_iterations=50, time_limit=None, seed=0):
This method calls a greedy algorithm to compute a tree-specific explanation. The algorithm is run n_iterations times, and the smallest tree-specific reason found is returned.

Excluded features are supported. The reasons are expressed in terms of binary variables; use the to_features method to obtain a representation based on the features considered at start.
time_limit Integer None: The time limit of the method in seconds. Set this to None to give this process an unlimited amount of time. Default value is None.
n_iterations Integer: The number of iterations performed by the greedy algorithm. Default value is 50.
seed Integer: The seed used by the greedy algorithm. Set this parameter to 0 in order to use a random seed. Default value is 0.
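The greedy scheme can be sketched as follows. This is an illustrative reconstruction under our own assumptions, not PyXAI's actual implementation: starting from the full term $t_x$, literals are dropped in a random order whenever the prediction is preserved, and the shortest result over several orders is kept. Here `keeps_prediction` stands in for the worst-case weight test:

```python
# Illustrative greedy sketch (assumed behaviour, not PyXAI's code): try to drop
# each literal in a random order; a drop is kept when `keeps_prediction` still
# holds on the remaining term. Repeating with different orders and keeping the
# shortest result mimics the n_iterations parameter.
import random

def greedy_reason(literals, keeps_prediction, n_iterations=50, seed=0):
    rng = random.Random(seed)
    best = list(literals)
    for _ in range(n_iterations):
        order = list(literals)
        rng.shuffle(order)                    # a fresh elimination order
        current = set(literals)
        for lit in order:
            candidate = current - {lit}
            if keeps_prediction(candidate):   # drop is safe: prediction kept
                current = candidate
        if len(current) < len(best):
            best = sorted(current)
    return tuple(best)

# Toy check: the prediction holds as long as literal 1 or literal 2 is present,
# so any run ends with a single literal, (1,) or (2,).
print(greedy_reason([1, 2, 3], lambda t: 1 in t or 2 in t))
```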

A natural way of improving the quality of tree-specific explanations is to seek the most parsimonious ones. A minimal tree-specific explanation for $x$ is a tree-specific explanation for $x$ that contains a minimal number of literals; in other words, it is of minimal size.
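For intuition only, minimality can be checked by brute force on small examples, scanning subsets by increasing size. This is exponential in general, which is why PyXAI relies on a constraint encoding instead; `keeps_prediction` is again a stand-in of ours for the tree-specific condition:

```python
# Brute-force sketch of minimality (exponential; for intuition only): return
# the first subset, by increasing size, that still entails the prediction.
from itertools import combinations

def minimal_reason(literals, keeps_prediction):
    for size in range(len(literals) + 1):
        for subset in combinations(literals, size):
            if keeps_prediction(set(subset)):
                return subset
    return tuple(literals)

# Toy entailment test: the prediction holds when {1, 2} is present or 4 is.
check = lambda t: {1, 2} <= t or 4 in t
print(minimal_reason([1, 2, 3, 4], check))  # (4,): the smallest such subset
```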

<ExplainerBT Object>.minimal_tree_specific_reason(*, n=1, time_limit=None):
This method considers a constraint optimisation problem (COP) representing the boosted tree, encoded using PyCSP. To solve it, several calls to a CP solver (ACE) are performed, and the result of each call is a minimal tree-specific explanation. The method avoids finding the same explanation twice or more by adding constraints between invocations.

Returns n minimal tree-specific explanations of the current instance in a Tuple (when n is set to 1, the reason is returned directly rather than in a Tuple). Excluded features are supported. The reasons are expressed in terms of binary variables; use the to_features method to obtain a representation based on the features considered at start.
n Integer: The number of minimal tree-specific reasons computed. Default value is 1.
time_limit Integer None: The time limit of the method in seconds. Set this to None to give this process an unlimited amount of time. Default value is None.

One can also compute preferred tree-specific explanations. Indeed, the user may prefer explanations containing some features and can provide weights in order to specify preferences. Please take a look at the Preferences page for more information on preference handling.

<ExplainerBT Object>.prefered_tree_specific_reason(*, method, n=1, time_limit=None, weights=None):
This method considers a constraint optimisation problem (COP) representing the boosted tree, encoded using PyCSP. To solve it, several calls to a CP solver (ACE) are performed, and the result of each call depends on the method used. If the method is PreferredReasonMethod.WEIGHTS, then the weights are given by the parameter weights; otherwise this parameter is useless. If the method is PreferredReasonMethod.INCLUSION_PREFERRED, then the partition of features is given by the parameter features_partition; otherwise this parameter is useless.

Returns n preferred tree-specific explanations for the current instance in a Tuple (when n is set to 1, the reason is returned directly rather than in a Tuple). Excluded features are supported. The reasons are expressed in terms of binary variables; use the to_features method to obtain a representation based on the features considered at start.
method PreferredReasonMethod.WEIGHTS PreferredReasonMethod.SHAPLEY PreferredReasonMethod.FEATURE_IMPORTANCE PreferredReasonMethod.WORD_FREQUENCY: The method used to discriminate features.
time_limit Integer None: The time limit of the method in seconds. Set this to None to give this process an unlimited amount of time. Default value is None.
n Integer: The number of tree-specific reasons computed. Currently, only n=1 or n=Explainer.ALL is supported. Default value is 1.
weights List: The weights (a list of floats, one per feature) used to discriminate features. Only useful when method is PreferredReasonMethod.WEIGHTS. Default value is None.

The PyXAI library also provides a way to check that an explanation is actually a tree-specific explanation:

<ExplainerBT Object>.is_tree_specific_reason(reason):
This method checks whether an explanation is a tree-specific one. To do so, it first calls the method is_implicant to check whether this explanation leads to the correct prediction.
reason List of Integer: The explanation to be checked.

Example from Hand-Crafted Trees

For this example, we take the Boosted Tree of the Building Models page.

The next figure represents a Boosted Tree using $4$ features ($A_1$, $A_2$, $A_3$ and $A_4$), where $A_1$ and $A_2$ are numerical, $A_3$ is categorical and $A_4$ is Boolean. The figure shows the worst instances and the corresponding weights (in dark red) for the sufficient reason $t$ = ($A_1 > 2$, $A_4 = 1$) for the instance $x$ = ($A_1 = 4$, $A_2 = 3$, $A_3 = 1$, $A_4 = 1$).

[Figure BTts1: worst instances and their weights for $t$ in each tree of $BT$]

This sufficient reason $t$ = ($A_1 > 2$, $A_4 = 1$) is not a tree-specific explanation for $x$ given $BT$. Indeed, we have $w_\downarrow(t, T_1) = -0.3$, $w_\downarrow(t, T_2) = 0.3$, and $w_\downarrow(t, T_3) = -0.4$, hence $w_\downarrow(t, T_1) + w_\downarrow(t, T_2) + w_\downarrow(t, T_3) = -0.4 < 0$, while $w(F, x) = 0.9 > 0$.

The next figure shows that $t' = (A_2 > 1, A_4 = 1)$ is a tree-specific explanation for $x$ given $BT$.

[Figure BTts2: worst instances and their weights for $t'$ in each tree of $BT$]

We have $w_\downarrow(t', T_1) = -0.3$, $w_\downarrow(t', T_2) = 0.5$, and $w_\downarrow(t', T_3) = 0.1$, hence $w_\downarrow(t', T_1) + w_\downarrow(t', T_2) + w_\downarrow(t', T_3) = 0.3 > 0$. It can be verified that $t'$ is also a sufficient reason for $x$ given $BT$.

We now show how to compute them with PyXAI. We start by building the Boosted Tree:

from pyxai import Builder, Explainer

node1_1 = Builder.DecisionNode(1, operator=Builder.GT, threshold=2, left=-0.2, right=0.3)
node1_2 = Builder.DecisionNode(3, operator=Builder.EQ, threshold=1, left=-0.3, right=node1_1)
node1_3 = Builder.DecisionNode(2, operator=Builder.GT, threshold=1, left=0.4, right=node1_2)
node1_4 = Builder.DecisionNode(4, operator=Builder.EQ, threshold=1, left=-0.5, right=node1_3)
tree1 = Builder.DecisionTree(4, node1_4)

node2_1 = Builder.DecisionNode(4, operator=Builder.EQ, threshold=1, left=-0.4, right=0.3)
node2_2 = Builder.DecisionNode(1, operator=Builder.GT, threshold=2, left=-0.2, right=node2_1)
node2_3 = Builder.DecisionNode(2, operator=Builder.GT, threshold=1, left=node2_2, right=0.5)
tree2 = Builder.DecisionTree(4, node2_3)

node3_1 = Builder.DecisionNode(1, operator=Builder.GT, threshold=2, left=0.2, right=0.3)
node3_2_1 = Builder.DecisionNode(1, operator=Builder.GT, threshold=2, left=-0.2, right=0.2)
node3_2_2 = Builder.DecisionNode(4, operator=Builder.EQ, threshold=1, left=-0.1, right=node3_1)
node3_2_3 = Builder.DecisionNode(4, operator=Builder.EQ, threshold=1, left=-0.5, right=0.1)
node3_3_1 = Builder.DecisionNode(2, operator=Builder.GT, threshold=1, left=node3_2_1, right=node3_2_2)
node3_3_2 = Builder.DecisionNode(2, operator=Builder.GT, threshold=1, left=-0.4, right=node3_2_3)
node3_4 = Builder.DecisionNode(3, operator=Builder.EQ, threshold=1, left=node3_3_1, right=node3_3_2)
tree3 = Builder.DecisionTree(4, node3_4)

BT = Builder.BoostedTrees([tree1, tree2, tree3], n_classes=2)

We compute a tree-specific explanation for this instance:

explainer = Explainer.initialize(BT)
explainer.set_instance((4,3,2,1))

tree_specific = explainer.tree_specific_reason()
print("target_prediction:", explainer.target_prediction)
print("tree specific:", tree_specific)
assert explainer.is_tree_specific_reason(tree_specific)

minimal = explainer.minimal_tree_specific_reason()
print("minimal:", minimal)
assert minimal == (1, 2), "The minimal reason is incorrect!"

target_prediction: 1
tree specific: (1, 2)
minimal: (1, 2)

Example from a Real Dataset

For this example, we take the compas.csv dataset. We create a model using the hold-out approach (by default, the test size is set to 30%) and select a correctly classified instance.

from pyxai import Learning, Explainer

learner = Learning.Xgboost("../../../dataset/compas.csv", learner_type=Learning.CLASSIFICATION)
model = learner.evaluate(method=Learning.HOLD_OUT, output=Learning.BT)
instance, prediction = learner.get_instances(model, n=1, correct=True)
data:
      Number_of_Priors  score_factor  Age_Above_FourtyFive   
0                    0             0                     1  \
1                    0             0                     0   
2                    4             0                     0   
3                    0             0                     0   
4                   14             1                     0   
...                ...           ...                   ...   
6167                 0             1                     0   
6168                 0             0                     0   
6169                 0             0                     1   
6170                 3             0                     0   
6171                 2             0                     0   

      Age_Below_TwentyFive  African_American  Asian  Hispanic   
0                        0                 0      0         0  \
1                        0                 1      0         0   
2                        1                 1      0         0   
3                        0                 0      0         0   
4                        0                 0      0         0   
...                    ...               ...    ...       ...   
6167                     1                 1      0         0   
6168                     1                 1      0         0   
6169                     0                 0      0         0   
6170                     0                 1      0         0   
6171                     1                 0      0         1   

      Native_American  Other  Female  Misdemeanor  Two_yr_Recidivism  
0                   0      1       0            0                  0  
1                   0      0       0            0                  1  
2                   0      0       0            0                  1  
3                   0      1       0            1                  0  
4                   0      0       0            0                  1  
...               ...    ...     ...          ...                ...  
6167                0      0       0            0                  0  
6168                0      0       0            0                  0  
6169                0      1       0            0                  0  
6170                0      0       1            1                  0  
6171                0      0       1            0                  1  

[6172 rows x 12 columns]
--------------   Information   ---------------
Dataset name: ../../../dataset/compas.csv
nFeatures (nAttributes, with the labels): 12
nInstances (nObservations): 6172
nLabels: 2
---------------   Evaluation   ---------------
method: HoldOut
output: BT
learner_type: Classification
learner_options: {'seed': 0, 'max_depth': None, 'eval_metric': 'mlogloss'}
---------   Evaluation Information   ---------
For the evaluation number 0:
metrics:
   accuracy: 66.73866090712744
nTraining instances: 4320
nTest instances: 1852

---------------   Explainer   ----------------
For the evaluation number 0:
**Boosted Tree model**
NClasses: 2
nTrees: 100
nVariables: 42

---------------   Instances   ----------------
number of instances selected: 1
----------------------------------------------

We compute a tree-specific explanation and a minimal one. Since finding a minimal explanation is time-consuming, we set a time limit. If the time limit is reached, we may obtain only an approximation of a minimal explanation.

explainer = Explainer.initialize(model, instance)
print("instance:", instance)
print("prediction:", prediction)
print()
tree_specific_reason = explainer.tree_specific_reason()
print("\ntree specific reason:", tree_specific_reason)
print("len tree specific reason:", len(tree_specific_reason))
print("to features", explainer.to_features(tree_specific_reason))
print("is tree specific reason: ", explainer.is_tree_specific_reason(tree_specific_reason))
print()
minimal = explainer.minimal_tree_specific_reason(time_limit=5)
print("\nminimal:", minimal)
print("len minimal:", len(minimal))
print("to features", explainer.to_features(tree_specific_reason))
print("is tree specific reason: ", explainer.is_tree_specific_reason(tree_specific_reason))
if explainer.elapsed_time == Explainer.TIMEOUT:
    print("time out... It is an approximation")

instance: [0 0 1 0 0 0 0 0 1 0 0]
prediction: 0


tree specific reason: (-1, -2, -3, -4, 5, -6, -7, -8, -9, -10, -11, -14, 15, -19, -20, -21, -25, -26, -29, -32, -34, -37, -38, -40)
len tree specific reason: 24
to features ('Number_of_Priors < 0.5', 'score_factor < 0.5', 'Age_Above_FourtyFive >= 0.5', 'Age_Below_TwentyFive < 0.5', 'African_American < 0.5', 'Asian < 0.5', 'Hispanic < 0.5', 'Native_American < 0.5', 'Other >= 0.5', 'Female < 0.5', 'Misdemeanor < 0.5')
is tree specific reason:  True


minimal: (-1, -2, -3, -4, 5, -6, -7, -8, -9, -10, -11, 15, -16, -18, -19, -21, -28, -29, -32, -33, -34, -37)
len minimal: 22
to features ('Number_of_Priors < 0.5', 'score_factor < 0.5', 'Age_Above_FourtyFive >= 0.5', 'Age_Below_TwentyFive < 0.5', 'African_American < 0.5', 'Asian < 0.5', 'Hispanic < 0.5', 'Native_American < 0.5', 'Other >= 0.5', 'Female < 0.5', 'Misdemeanor < 0.5')
is tree specific reason:  True
time out... It is an approximation

Other types of explanations are presented in the Explanations Computation page.