About the paper “Uncovering Bugs in Formal Explainers: A Case Study with PyXAI”

The arXiv paper [3], “Uncovering Bugs in Formal Explainers: A Case Study with PyXAI”, reports a number of implementation issues in PyXAI. We have analyzed the reported issues with the aim of repairing any actual bugs in PyXAI.

Analysis

Our analysis revealed the following.

On the one hand, for random forests used for binary classification, the tie-break rule that PyXAI applies when as many trees predict the instance as positive as predict it as negative was not properly implemented, which led to incorrect explanations in some cases (one of them is presented in Section 4.2 of [3]). Such cases do not appear to be frequent in practice (see column %[¬WAXp] in Table 4 of [3]) and would probably be rarer still if more trees were considered. In any case, this was clearly a bug, and it has been fixed in the latest release of PyXAI (see commit e02012d).
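
The role of the tie-break rule can be illustrated with a minimal sketch. This is not PyXAI's implementation: the function `majority_vote` and its `tie_goes_to` parameter are hypothetical, and the sketch makes no claim about which class PyXAI assigns on a tie. It only shows that, with an even number of trees, the predicted class depends on the convention, so an explainer must reason with the same convention as the predictor.

```python
def majority_vote(votes, tie_goes_to=1):
    """Aggregate binary tree votes (1 = positive, 0 = negative).

    `tie_goes_to` is a hypothetical parameter naming the class returned
    when positive and negative votes are equally numerous.
    """
    positives = sum(votes)
    negatives = len(votes) - positives
    if positives == negatives:
        return tie_goes_to
    return 1 if positives > negatives else 0


# Four trees, two votes each way: the outcome depends only on the convention.
tied_votes = [1, 0, 1, 0]
prediction_if_tie_positive = majority_vote(tied_votes, tie_goes_to=1)
prediction_if_tie_negative = majority_vote(tied_votes, tie_goes_to=0)

# With an odd number of trees no tie can occur, so the convention is irrelevant.
odd_votes = [1, 1, 0]
prediction_odd = majority_vote(odd_votes, tie_goes_to=0)
```

If the explainer encodes one convention while the forest applies the other, the two disagree on the class of every tied instance, and an explanation computed for such an instance can be incorrect.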

On the other hand, it turns out that the definitions of abductive / contrastive explanations considered in [3] (see Equations (1) to (4)) are not the ones we used, and this explains the (numerous) discrepancies put forward in [3], especially those concerning subset-minimal explanations (column %[WAXp ∧¬AXp] in Table 4 and column %[WCXp ∧¬CXp] in Table 5). In our work (see [1]), we have typically considered alternative definitions: we look for explanations based on the Boolean conditions found in the tree ensemble under consideration, not on the characteristics of the instance to be explained; furthermore, we exploit a domain theory indicating how these conditions are logically connected.

Obviously enough, the explanations computed by PyXAI may be correct given the definitions we used while being incorrect under the definitions considered in [3]. Using the right definitions substantially changes the picture as to the ‘issues’ found in [3].

To clarify it with a simple example, consider instances described using two attributes (A (numerical) and B (Boolean)) and a random forest such that the class of positive instances is characterized by the Boolean function

$((A > 5) \wedge \neg(A > 30)) \vee ((A > 5) \wedge \neg B)$.

Using our approach, the instance $(25, 0)$ (classified as positive) has two subset-minimal abductive explanations:

$((A > 5) \wedge \neg(A > 30))$
$((A > 5) \wedge \neg B)$.
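
These two explanations can be recovered by brute force. The sketch below is not PyXAI code: it assumes, for illustration only, that A ranges over the integers 0 to 60, and the names `predict`, `LITERALS` and `minimal_sufficient_subsets` are hypothetical. It enumerates the subsets of the Boolean conditions satisfied by the instance $(25, 0)$ and keeps the subset-minimal ones that force a positive prediction.

```python
from itertools import combinations, product


def predict(a, b):
    """The Boolean function characterizing the positive class in the example."""
    return (a > 5 and not a > 30) or (a > 5 and not b)


# Assumption made for this sketch: A is an integer in [0, 60], B is 0 or 1.
DOMAIN = list(product(range(0, 61), (0, 1)))

# Conditions of the ensemble, with the polarity they take on the instance (25, 0).
LITERALS = {
    "A > 5": lambda a, b: a > 5,
    "not (A > 30)": lambda a, b: not a > 30,
    "not B": lambda a, b: not b,
}


def is_sufficient(names):
    """True if every point satisfying all the named literals is positive."""
    return all(
        predict(a, b)
        for a, b in DOMAIN
        if all(LITERALS[name](a, b) for name in names)
    )


def minimal_sufficient_subsets():
    """Enumerate subsets by increasing size, keeping the subset-minimal ones."""
    found = []
    for size in range(len(LITERALS) + 1):
        for subset in combinations(LITERALS, size):
            already_covered = any(set(prev) <= set(subset) for prev in found)
            if not already_covered and is_sufficient(subset):
                found.append(subset)
    return found


explanations = minimal_sufficient_subsets()
```

Under these assumptions, `explanations` contains exactly the two sets of conditions listed above. Enumerating over concrete values of A means that logically impossible combinations of conditions (such as A > 30 together with not A > 5) are never considered.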

Using the definitions considered in [3], the instance $(25, 0)$ has a unique subset-minimal abductive explanation: $(A = 25)$. In particular, $((A > 5) \wedge \neg B)$ is viewed as erroneous by the authors of [3] because of the subset-minimality requirement (attribute B is involved in the explanation, while an explanation based only on attribute A exists).

It is true that an abductive explanation based only on attribute A exists. However, the two explanations $((A > 5) \wedge \neg(A > 30))$ and $((A > 5) \wedge \neg B)$ are incomparable in terms of generality, and neither is less general than $(A = 25)$: the first one, $((A > 5) \wedge \neg(A > 30))$, is more general than $(A = 25)$, and the second one, $((A > 5) \wedge \neg B)$, is incomparable with $(A = 25)$. Notably, this extra generality can be useful for debugging purposes. Suppose that the predictor is used to determine whether a given medication can be given to a patient, where A is the age of the patient and B (when equal to 1) indicates that the patient has received a specific vaccine. A doctor could be fine with giving the medication to a patient knowing only that they are 25 years old. But if the doctor also knows that the medication should not be given to children, not giving the medication is probably a better decision, since the decision rules associated with the explanations $((A > 5) \wedge \neg(A > 30))$ and $((A > 5) \wedge \neg B)$ conflict with a medical rule known to the doctor (both rules can apply to a 6-year-old child, for instance).
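
The generality claims can be checked by comparing the sets of instances that each explanation covers. The sketch below is illustrative only: the bounded integer range for A and the helper names `coverage` and `compare` are assumptions of ours, not part of PyXAI or of [3].

```python
from itertools import product


def predict(a, b):
    """The Boolean function characterizing the positive class in the example."""
    return (a > 5 and not a > 30) or (a > 5 and not b)


# Assumption made for this sketch: A is an integer in [0, 60], B is 0 or 1.
DOMAIN = set(product(range(0, 61), (0, 1)))


def coverage(rule):
    """Set of instances of the domain satisfying the rule."""
    return {point for point in DOMAIN if rule(*point)}


def compare(left, right):
    """Compare two coverage sets in terms of generality."""
    if left == right:
        return "equal"
    if left > right:
        return "more general"
    if left < right:
        return "less general"
    return "incomparable"


age_band = coverage(lambda a, b: a > 5 and not a > 30)
age_and_not_b = coverage(lambda a, b: a > 5 and not b)
exact_age = coverage(lambda a, b: a == 25)

# Every rule is a valid abductive explanation: it only covers positive instances.
all_sufficient = all(
    predict(*point) for rule in (age_band, age_and_not_b, exact_age) for point in rule
)

band_vs_exact = compare(age_band, exact_age)
not_b_vs_exact = compare(age_and_not_b, exact_age)
band_vs_not_b = compare(age_band, age_and_not_b)
```

The rule covering a 6-year-old patient is precisely what the more general explanations expose and what $(A = 25)$ hides.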

Considering the characteristics of the input instance instead of the Boolean conditions in the tree ensemble is even worse when the goal is to derive contrastive explanations. Looking at the simple example above, A would be considered a contrastive explanation using the definitions of [3]. It is true that changing the value of A may be enough to change the prediction. However, this explanation does not indicate how to change the value to get a different prediction. Giving A an arbitrary value distinct from 25 is not enough to achieve this goal (for instance, A = 26 would not work). Over the Boolean conditions used, there are two contrastive explanations that are incomparable in terms of generality: change A so that $A \leq 5$, or change both A and B so that $A > 30$ and B becomes true. Even if it involves the two attributes A and B, the second contrastive explanation, stating that the patient should wait until they are over 30 and should get vaccinated, is probably better, since the other contrastive explanation is not actionable when A represents the age of a 25-year-old patient.
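
The effect of each change on the prediction can be checked directly on the example function. This is a small illustrative sketch; the concrete values chosen for A (26, 5 and 31) are ours.

```python
def predict(age, vaccinated):
    """The Boolean function characterizing the positive class in the example."""
    return (age > 5 and not age > 30) or (age > 5 and not vaccinated)


instance = (25, 0)

# Candidate modifications of the instance, as (A, B) pairs.
changes = {
    "A = 26": (26, 0),
    "A <= 5": (5, 0),
    "A > 30 only": (31, 0),
    "B true only": (25, 1),
    "A > 30 and B true": (31, 1),
}

# For each change, record whether the prediction differs from the original one.
flips = {
    label: predict(*point) != predict(*instance)
    for label, point in changes.items()
}
```

Only "A <= 5" and "A > 30 and B true" flip the prediction. Changing A to 26, raising A above 30 alone, or setting B alone leaves the instance classified as positive.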

In addition, the use of the domain theory is crucial in our approach to ensure that the explanations that are generated are correct. Thus, in the case of random forests, the erroneous contrastive explanation {1} (actually $\neg (x_1 \leq 6.5)$), supposedly returned by PyXAI (see Example 2, Section 4.2 of [3]), is not generated by PyXAI once the domain theory is taken into account, as it should be. In this case, PyXAI returns $\neg (x_7 \leq 0.411)$, which is correct. Similarly, in the case of boosted trees, the erroneous contrastive explanation {1}, supposedly returned by PyXAI (see Example 3, Appendix A of [3]), is not generated by PyXAI once the domain theory is taken into account, as it should be.
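
A toy illustration of why the domain theory matters, reusing the simple example function rather than the examples of [3]. The sketch is not PyXAI's algorithm: the instance $(40, 0)$, the function names and the brute-force enumeration are assumptions of ours. The Boolean conditions are $c_1 = (A > 5)$, $c_2 = (A > 30)$ and $c_3 = B$, and the domain theory states that $c_2$ implies $c_1$.

```python
from itertools import combinations

CONDITIONS = ("A > 5", "A > 30", "B")


def predict(c1, c2, c3):
    """Example function expressed over the Boolean conditions."""
    return (c1 and not c2) or (c1 and not c3)


def respects_theory(c1, c2, c3):
    """Domain theory of the example: A > 30 implies A > 5."""
    return c1 or not c2


def minimal_contrastive(instance, use_theory):
    """Subset-minimal sets of conditions whose flipping changes the prediction."""
    found = []
    for size in range(1, len(instance) + 1):
        for flipped in combinations(range(len(instance)), size):
            candidate = tuple(
                (not value) if index in flipped else value
                for index, value in enumerate(instance)
            )
            if use_theory and not respects_theory(*candidate):
                continue  # logically impossible combination of conditions
            if predict(*candidate) == predict(*instance):
                continue  # the prediction does not change
            if any(set(prev) <= set(flipped) for prev in found):
                continue  # not subset-minimal
            found.append(flipped)
    return [tuple(CONDITIONS[i] for i in flipped) for flipped in found]


# Instance (A = 40, B = 0): A > 5 holds, A > 30 holds, B is false.
instance = (True, True, False)
without_theory = minimal_contrastive(instance, use_theory=False)
with_theory = minimal_contrastive(instance, use_theory=True)
```

Without the theory, flipping "A > 5" alone is reported as a contrastive explanation, although no value of A can satisfy A > 30 while violating A > 5. With the theory, this candidate is discarded and the realizable alternative, flipping both "A > 5" and "A > 30", is found instead.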

It is well known that many notions of explanation exist. The benefits of considering alternative definitions to those given in [3] for the notions of abductive / contrastive explanations, and of leveraging a domain theory when available, have been discussed previously by several authors (see for instance [1, 2]), including some of the authors of [3] (see [6]). Moreover, the importance of getting explanations that are as general as possible has also been acknowledged by others, including again some of the authors of [3] (see in particular [4, 5]). Whatever the definitions used, the correctness of an algorithm cannot be evaluated in a sound way if the specification considered in the evaluation is not the right one.

That said, we thank the authors of [3] for their feedback.

To all users of PyXAI: please feel free to use the issue tracker on GitHub to report any problems encountered with PyXAI and/or to send an email to pyxai@cril.fr. The sooner we know that a problem has been identified, the sooner we can address it.

GitHub commit

The commit e02012d in the GitHub repository of PyXAI provides:

A bug fix for the tie-breaking rule in the case of random forests.
The file builder-arxiv-2511.03169-1.py with the example in Figure 1 of [3] (related to the tie-breaking rule bug).
The file builder-arxiv-2511.03169-2.py with the example in Figure 2 of [3] (related to Random Forest contrastive explanations).
The file builder-arxiv-2511.03169-3.py with the example in Figure 3 of [3] (related to Boosted Trees contrastive explanations).

References

[1] Gilles Audemard, Jean-Marie Lagniez, Pierre Marquis, Nicolas Szczepanski. On Contrastive Explanations for Tree-Based Classifiers. ECAI 2023: 117-124

[2] Niku Gorji, Sasha Rubin. Sufficient Reasons for Classifier Decisions in the Presence of Domain Constraints. AAAI 2022: 5660-5667

[3] Xuanxiang Huang, Yacine Izza, Alexey Ignatiev, João Marques-Silva. Uncovering Bugs in Formal Explainers: A Case Study with PyXAI. CoRR abs/2511.03169 (2025)

[4] Yacine Izza, Alexey Ignatiev, Peter J. Stuckey, João Marques-Silva. Delivering Inflated Explanations. AAAI 2024: 12744-12753

[5] Yacine Izza, Alexey Ignatiev, Sasha Rubin, João Marques-Silva, Peter J. Stuckey. Most General Explanations of Tree Ensembles. IJCAI 2025: 5463-5471

[6] Jinqiang Yu, Alexey Ignatiev, Peter J. Stuckey, Nina Narodytska, João Marques-Silva. Eliminating the Impossible, Whatever Remains Must Be True: On Extracting and Applying Background Knowledge in the Context of Formal Explanations. AAAI 2023: 4123-4131