Class ExplainerBT (extends Explainer)
The Boosted Trees explainer.
This class adapts the generic Explainer class to a BoostedTrees model. It provides methods to:
- set an instance and derive its Boolean (literal-based) representation,
- compute different families of explanations (direct, contrastive, sufficient, tree_specific),
- optionally rectify (modify) the boosted trees model to enforce a desired label under given conditions.
Some methods are inherited from the parent class Explainer; their documentation is available on the dedicated page:
- Main Methods: set_instance, set_features_type, get_model, predict, activate_theory, deactivate_theory, set_excluded_features, unset_excluded_features
- Human-Readable Explanation Methods: to_features, get_feature_names, get_feature_names_from_literal, reason_contains_features
- Explanation Verification Methods: is_implicant, is_reason, is_sufficient_reason, is_contrastive_reason
Some methods are limited to binary classification (2 classes). When multi-class support
is not available, a NotImplementedError exception is raised.
See also
- The documentation pages
- The paper Computing Abductive Explanations for Boosted Trees
def __init__(self, boosted_trees, instance=None):
Initialize an explainer for a Boosted Tree model.
Parameters
boosted_trees : BoostedTrees
The Boosted Trees model to explain.
instance : list[int] (optional, default: None)
The instance (observation) for which explanations will be computed.
If None, you must call set_instance before requesting explanations.
Methods for Calculating Explanations
def coverage_reason(self, *, ordre_features=None):
Compute a coverage-based prime implicant explanation (CPI-Xp) for Boosted Trees.
A coverage reason is an abductive explanation that is maximally general with respect to the domain theory Σ^f. It is a subset t of the Boolean representation of the current instance such that t implies the target prediction modulo Σ^f, and no strictly more general explanation exists under Σ^f. The domain theory encodes logical constraints between Boolean conditions (e.g. threshold chains for numerical features, mutual exclusion for categorical features).
Unlike tree-specific reasons, a coverage reason may contain more literals because it favours generality over minimality: it always selects the widest applicable boundary under Σ^f.
The feature types must be declared at initialisation via features_type so that the domain theory Σ^f can be built; a ValueError is raised otherwise.
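To make the notion of a domain theory more concrete, here is a minimal sketch of how threshold chains and mutual exclusions could be written as clauses over Boolean variables. The helper names and the integer variable ids are hypothetical and chosen for illustration; this is not PyXAI's internal encoding.

```python
from itertools import combinations


def threshold_chain_clauses(literals):
    """Clauses for one numerical feature (hypothetical helper).

    `literals` holds the Boolean variable ids of the conditions
    x > t_1, ..., x > t_k, sorted by increasing threshold.
    Each clause [-hi, lo] encodes (x > t_{i+1}) implies (x > t_i).
    """
    return [[-hi, lo] for lo, hi in zip(literals, literals[1:])]


def mutual_exclusion_clauses(literals):
    """Clauses for one categorical feature (hypothetical helper).

    `literals` holds the ids of the conditions "feature == category".
    Each clause [-a, -b] forbids two categories from holding together.
    """
    return [[-a, -b] for a, b in combinations(literals, 2)]


# Variables 1..3: x > 1, x > 3, x > 5. Variables 4..6: colour is red, green, blue.
theory = threshold_chain_clauses([1, 2, 3]) + mutual_exclusion_clauses([4, 5, 6])
print(theory)
```

An explanation computed "modulo" such a theory only has to hold on the assignments that satisfy every clause, which is why declaring the feature types is required before the theory can be built.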
Parameters
ordre_features : list | None (optional, default=None)
Explicit priority order over features for the greedy search. When None, the
default order derived from the domain theory is used.
Examples
from pyxai import Learning, Explaining
learner = Learning.Xgboost("tests/datasets/12_fault.csv", problem_type="classification")
model = learner.evaluate(splitting_method=Learning.HOLD_OUT, model_type=Learning.BT)
instance, prediction = learner.get_instances(model, n=1)
explainer = Explaining.initialize(model, instance=instance,
                                  features_type={"numerical": Learning.DEFAULT})
coverage = explainer.coverage_reason()
print("coverage:", explainer.to_features(coverage))
def direct_reason(self):
Compute the direct reason for the current instance.
The direct reason is obtained by collecting, for every tree of the model, the conditions on the root-to-leaf path that covers the instance. If the result contains excluded features, None is returned.
Returns
list[int] :
A list of literals representing the direct reason.
None :
If excluded features appear in the direct reason.
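The path-collection idea can be sketched on a toy model. The tree layout (nested tuples), the `literal_ids` mapping and the function below are assumptions made for illustration; they do not reflect PyXAI's data structures, and excluded features are not handled.

```python
def direct_reason(trees, instance, literal_ids):
    """Collect the literals met on the path followed by `instance` in each tree.

    A node is a tuple (feature, threshold, if_false, if_true) testing
    instance[feature] > threshold; a leaf is a float weight.
    """
    reason = set()
    for node in trees:
        while isinstance(node, tuple):
            feature, threshold, if_false, if_true = node
            holds = instance[feature] > threshold
            lit = literal_ids[(feature, threshold)]
            # A positive literal means the condition holds, a negative one that it fails.
            reason.add(lit if holds else -lit)
            node = if_true if holds else if_false
    return sorted(reason, key=abs)


# Two toy trees over two numerical features.
trees = [
    (0, 2.5, -0.4, (1, 1.0, 0.1, 0.7)),
    (1, 1.0, (0, 4.0, -0.2, 0.3), 0.5),
]
literal_ids = {(0, 2.5): 1, (1, 1.0): 2, (0, 4.0): 3}

print(direct_reason(trees, [3.0, 0.5], literal_ids))  # [1, -2, -3]
```

Because the same condition can appear in several trees, the literals are gathered in a set so that each one is reported once.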
def minimal_contrastive_reason(self, *, n=1, time_limit=None):
Compute n minimal contrastive reasons for the current instance.
Contrastive explanations explain why the instance has not been classified by the ML model as expected.
Parameters
n : int | Explaining.ALL (optional, default=1)
Maximum number of contrastive reasons to return. If Explaining.ALL is provided, all computed candidates are formatted and returned.
time_limit : float (optional, default=None)
Time limit (seconds) for the whole enumeration. If reached, elapsed_time is set to Explaining.TIMEOUT.
Returns
tuple[int] :
If n == 1, returns a single reason as a tuple of literals.
tuple[tuple[int]] :
If n > 1, returns a tuple of reasons (each one is a tuple of literals).
None :
If all reasons contain excluded features.
def minimal_coverage_reason(self):
Compute a minimal coverage-based prime implicant explanation (mCPI-Xp) for Boosted Trees.
A minimal coverage reason is a coverage reason (CPI-Xp) that is additionally subset-minimal: no literal can be removed while it remains a valid implicant modulo the domain theory Σ^f. Starting from a coverage reason returned by coverage_reason, this method greedily removes each literal and checks whether the remaining set is still a valid implicant modulo Σ^f; if so, the literal is discarded. The process repeats until no literal can be removed.
Note that a minimal coverage reason is not a classical sufficient reason: a literal that appears redundant without the theory may be necessary to define the correct coverage boundary under Σ^f.
Examples
from pyxai import Learning, Explaining
learner = Learning.Xgboost("tests/datasets/12_fault.csv", problem_type="classification")
model = learner.evaluate(splitting_method=Learning.HOLD_OUT, model_type=Learning.BT)
instance, prediction = learner.get_instances(model, n=1)
explainer = Explaining.initialize(model, instance=instance,
                                  features_type={"numerical": Learning.DEFAULT})
minimal = explainer.minimal_coverage_reason()
print("minimal coverage:", explainer.to_features(minimal))
Each implicant check for Boosted Trees uses a MIP solver (OR-Tools), which is
computationally expensive. This method may be slow on large models or instances
with many literals.
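The greedy deletion loop described for this method can be sketched as follows, with a brute-force implicant check standing in for the MIP solver. Everything here is a toy under stated assumptions: `is_implicant`, `greedy_minimize`, the three-variable classifier and the one-clause theory are hypothetical, and the sketch does not reproduce how the starting coverage reason is selected.

```python
from itertools import product


def is_implicant(term, n_vars, predict, target, theory=()):
    """Brute-force check: every assignment that extends `term` and satisfies
    `theory` must be classified as `target`. Exponential, toy sizes only."""
    fixed = {abs(l): l > 0 for l in term}
    free = [v for v in range(1, n_vars + 1) if v not in fixed]
    for values in product([False, True], repeat=len(free)):
        assignment = {**fixed, **dict(zip(free, values))}
        satisfies_theory = all(
            any(assignment[abs(l)] == (l > 0) for l in clause) for clause in theory
        )
        if satisfies_theory and predict(assignment) != target:
            return False
    return True


def greedy_minimize(term, n_vars, predict, target, theory=()):
    """Try to drop each literal once; keep the drop when the rest still implies `target`."""
    reason = list(term)
    for lit in list(term):
        candidate = [l for l in reason if l != lit]
        if is_implicant(candidate, n_vars, predict, target, theory):
            reason = candidate
    return reason


# Toy classifier: positive iff variables 1, 2 and 3 are all true.
predict = lambda a: a[1] and a[2] and a[3]
# Toy theory: variable 2 (x > 3) implies variable 1 (x > 1).
theory = [[-2, 1]]

print(greedy_minimize([1, 2, 3], 3, predict, True, theory))  # [2, 3]
```

In this toy setting a single pass is enough: once a literal cannot be dropped from the current set, it cannot be dropped from any smaller set either. Each call to the check is the expensive step that the note above refers to.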
def minimal_tree_specific_reasons(self, *, n=10, time_limit=None):
Compute n minimal tree specific reasons for the current instance.
This method builds a Mixed Integer Programming (MIP) problem that encodes the boosted trees. The problem is solved through several calls to a solver, each of which returns one minimal tree-specific explanation. Constraints added between calls prevent the same explanation from being found twice.
Excluded features are not supported.
The reasons are expressed over binary variables; use the to_features method to obtain a representation based on the original features.
Parameters
n : int (default: 10)
Number of minimal tree-specific explanations to compute.
time_limit : float (optional, default=None)
Time limit (seconds) for the whole enumeration. If reached, elapsed_time is set to Explaining.TIMEOUT.
Returns
tuple[int] | None :
A tuple containing n minimal tree-specific explanations of the current instance. When n is set to 1, the reason itself is returned instead of a tuple of reasons.
This method is complete and therefore can be time-consuming.
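The "solve, block, solve again" pattern can be illustrated without a MIP solver. In the sketch below, a brute-force search by increasing size plays the role of the solver, and a found reason blocks itself and all of its supersets. Both choices are assumptions for illustration: the constraints that PyXAI actually adds are not detailed here, and `enumerate_minimal_reasons` and `toy_is_reason` are hypothetical names.

```python
from itertools import combinations


def enumerate_minimal_reasons(literals, is_reason, n=10):
    """Return up to `n` reasons, smallest first, never repeating one.

    `is_reason` is any oracle telling whether a tuple of literals is a
    valid reason. Candidates containing an already found reason are
    skipped, which mimics adding a blocking constraint after each call.
    """
    found = []
    for size in range(len(literals) + 1):
        for candidate in combinations(literals, size):
            if len(found) == n:
                return found
            blocked = any(set(reason) <= set(candidate) for reason in found)
            if not blocked and is_reason(candidate):
                found.append(candidate)
    return found


def toy_is_reason(term):
    """Toy oracle: literal 3 alone suffices, or literals 1 and -2 together."""
    chosen = set(term)
    return 3 in chosen or {1, -2} <= chosen


print(enumerate_minimal_reasons([1, -2, 3, 4], toy_is_reason))  # [(3,), (1, -2)]
```

The exhaustive search is what makes this approach complete, and also what makes it slow as the number of literals grows.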
def tree_specific_reason(self, *, n_iterations=50, time_limit=None, seed=0, history=True, theta=0):
Compute a tree specific reason for the current instance.
This method runs a greedy algorithm to compute a tree-specific explanation (which is, by definition, minimal with respect to set inclusion). The algorithm is run n_iterations times, and the smallest tree-specific reason computed (in number of literals) is returned.
Excluded features are supported.
Parameters
n_iterations : int (optional, default: 50)
Number of randomized iterations; the smallest reason found across them is kept.
time_limit : float (optional, default=None)
Time limit (seconds) for the whole enumeration. If reached, elapsed_time is set to Explaining.TIMEOUT.
seed : int (optional, default: 0)
Random seed for the randomized iterations. Set to 0 to have a random seed.
Returns
tuple[int] :
The tree-specific reason.
None :
If the computed reason contains excluded features.
The method is heuristic and does not guarantee a tree-specific reason of minimal size.
It is much faster than the exact method, but it returns only one tree-specific reason.
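A minimal sketch of the randomized greedy strategy, on a toy binary model whose prediction is positive when the sum of leaf weights is above 0. The check used here (summing, tree by tree, the worst leaf still reachable under the kept literals) follows the usual definition of tree-specific explanations, but the tree layout, the helper names and the handling of `seed` are assumptions for illustration, not PyXAI's implementation. The parameters `history` and `theta`, excluded features and the domain theory are ignored.

```python
import random


def worst_case(node, term, literal_ids):
    """Smallest leaf weight reachable in one tree when only `term` is fixed."""
    if not isinstance(node, tuple):
        return node
    feature, threshold, if_false, if_true = node
    lit = literal_ids[(feature, threshold)]
    if lit in term:
        return worst_case(if_true, term, literal_ids)
    if -lit in term:
        return worst_case(if_false, term, literal_ids)
    return min(worst_case(if_false, term, literal_ids),
               worst_case(if_true, term, literal_ids))


def keeps_positive(trees, term, literal_ids):
    """True when the positive prediction is guaranteed tree by tree."""
    return sum(worst_case(tree, term, literal_ids) for tree in trees) > 0


def tree_specific_reason(trees, full_term, literal_ids, n_iterations=50, seed=0):
    """Run the greedy deletion in random orders and keep the smallest result."""
    rng = random.Random(seed if seed != 0 else None)  # 0 means "no fixed seed"
    best = sorted(full_term, key=abs)
    for _ in range(n_iterations):
        order = list(full_term)
        rng.shuffle(order)
        reason = set(full_term)
        for lit in order:
            if keeps_positive(trees, reason - {lit}, literal_ids):
                reason.discard(lit)
        if len(reason) < len(best):
            best = sorted(reason, key=abs)
    return best


trees = [
    (0, 2.5, -0.4, (1, 1.0, 0.1, 0.7)),
    (1, 1.0, (0, 4.0, -0.2, 0.3), 0.5),
]
literal_ids = {(0, 2.5): 1, (1, 1.0): 2, (0, 4.0): 3}

# Boolean representation of a toy instance classified as positive.
print(tree_specific_reason(trees, [1, 2, -3], literal_ids))  # [2]
```

Each iteration yields a reason from which no literal can be removed, but different deletion orders may end on reasons of different sizes, which is why several iterations are run and only the smallest result is kept.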
Explanation Verification Methods
def is_tree_specific_reason(self, reason, check_minimal_inclusion=False):
Heuristically check whether a given reason (subset of the binary representation) behaves like a tree specific reason.