{ "cells": [ { "cell_type": "markdown", "id": "b1db8c9a", "metadata": {}, "source": [ "# Contrastive Reasons" ] }, { "cell_type": "markdown", "id": "514a5144", "metadata": {}, "source": [ "Unlike abductive explanations, which explain why an instance $x$ is classified as belonging to a given class, **contrastive explanations** explain why $x$ has not been classified by the ML model as expected.\n", "\n", "Let $f$ be a Boolean function represented by a decision tree $T$, $x$ be an instance and $p$ be the prediction of $T$ on $x$ ($f(x) = p$). A **contrastive reason** for $x$ is a term $t$ such that:\n", "* $t \\subseteq t_{x}$ and $t_{x} \\setminus t$ is not an implicant of $f$;\n", "* for every $\\ell \\in t$, $t \\setminus \\{\\ell\\}$ does not satisfy the previous condition (that is, $t$ is minimal w.r.t. set inclusion).\n", "\n", "In other words, a **contrastive reason** for $x$ is a subset $t$ of the characteristics of $x$ that is minimal w.r.t. set inclusion among those such that at least one instance $x'$ that coincides with $x$ except on the characteristics from $t$ is not classified by the decision tree as $x$ is. Put simply, a **contrastive reason** indicates which features must be changed in order to change the prediction for an instance.\n", "\n", "Since a contrastive reason is minimal w.r.t. set inclusion, no proper subset of it is also a contrastive reason. A **minimal contrastive reason** for $x$ is a contrastive reason for $x$ that contains a minimal number of literals, that is, one of minimal size.\n", "\n", "PyXAI provides two methods for contrastive reasons for Decision Trees: \n", "\n", " - ```contrastive_reason```.\n", " - ```is_contrastive_reason```.\n", "\n", "More information about contrastive reasons can be found in the paper [On the Explanatory Power of Decision Trees](https://arxiv.org/abs/2108.05266)." 
] }, { "cell_type": "markdown", "id": "508e75e2", "metadata": {}, "source": [ "{: .attention }\n", "> As the ```contrastive_reason``` method returns the contrastive reasons in ascending order of size, the minimal contrastive reasons are the first ones in the returned tuple. " ] }, { "cell_type": "markdown", "id": "2b240b8f", "metadata": {}, "source": [ "The basic methods ([``initialize``](/documentation/api/modules/explaining/), ```set_instance```, ```to_features```, ```is_reason```, ...) of the ```Explainer``` module used in the next examples are described in the [Explainer Principles](/documentation/explainer/) page." ] }, { "cell_type": "markdown", "id": "c8f0eead", "metadata": {}, "source": [ "## Example from a Hand-Crafted Tree" ] }, { "cell_type": "markdown", "id": "ad910b80", "metadata": {}, "source": [ "For this example, we take the Decision Tree of the [Building Models](/documentation/learning/builder/DTbuilder/) page consisting of $4$ binary features ($x_1$, $x_2$, $x_3$ and $x_4$). \n", "\n", "The following figure shows the new instances $(1,1,1,0)$, $(0,0,1,1)$ and $(0,1,0,1)$ obtained from the instance $x = (1,1,1,1)$ by flipping the features of its contrastive reasons $(x_4)$ (in red), $(x_1, x_2)$ (in blue) and $(x_1, x_3)$ (in green), respectively. Each of these instances differs from $x$ only on the features of the corresponding contrastive reason and is not classified by $T$ as $x$ is: $(1,1,1,0)$, $(0,0,1,1)$ and $(0,1,0,1)$ are classified as negative instances, while $(1,1,1,1)$ is classified as a positive instance.\n", "\n", "\"DTcontrastive\"\n", "\n", " Now, we show how to get them with PyXAI. 
We start by building the decision tree: " ] }, { "cell_type": "code", "execution_count": 1, "id": "a7a88ebf", "metadata": {}, "outputs": [], "source": [ "from pyxai import Builder, Explaining\n", "\n", "# Leaves are the predictions 0 and 1. The left child is followed when the\n", "# tested feature is 0, the right child when it is 1.\n", "# Nodes testing x4\n", "node_x4_1 = Builder.DecisionNode(4, left=0, right=1)\n", "node_x4_2 = Builder.DecisionNode(4, left=0, right=1)\n", "node_x4_3 = Builder.DecisionNode(4, left=0, right=1)\n", "node_x4_4 = Builder.DecisionNode(4, left=0, right=1)\n", "node_x4_5 = Builder.DecisionNode(4, left=0, right=1)\n", "\n", "# Nodes testing x3\n", "node_x3_1 = Builder.DecisionNode(3, left=0, right=node_x4_1)\n", "node_x3_2 = Builder.DecisionNode(3, left=node_x4_2, right=node_x4_3)\n", "node_x3_3 = Builder.DecisionNode(3, left=node_x4_4, right=node_x4_5)\n", "\n", "# Nodes testing x2\n", "node_x2_1 = Builder.DecisionNode(2, left=0, right=node_x3_1)\n", "node_x2_2 = Builder.DecisionNode(2, left=node_x3_2, right=node_x3_3)\n", "\n", "# Root node testing x1\n", "node_x1_1 = Builder.DecisionNode(1, left=node_x2_1, right=node_x2_2)\n", "\n", "tree = Builder.DecisionTree(4, node_x1_1, force_features_equal_to_binaries=True)" ] }, { "cell_type": "markdown", "id": "c177fc1f", "metadata": {}, "source": [ "We compute the contrastive reasons for two instances, $(1,1,1,1)$ and $(0,0,0,0)$: " ] }, { "cell_type": "code", "execution_count": 2, "id": "f0733b41", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Contrastives: ((4,), (1, 2), (1, 3))\n", "-------------------------------\n", "Contrastives: ((-1, -4), (-2, -3, -4))\n" ] } ], "source": [ "explainer = Explaining.initialize(tree)\n", "explainer.set_instance((1,1,1,1))\n", "\n", "# Enumerate all the contrastive reasons of the instance and check each of them.\n", "contrastives = explainer.contrastive_reason(n=Explaining.ALL)\n", "print(\"Contrastives:\", contrastives)\n", "for contrastive in contrastives:\n", "    assert explainer.is_contrastive_reason(contrastive), \"This has to be a contrastive reason!\"\n", "\n", "print(\"-------------------------------\")\n", "\n", "explainer.set_instance((0,0,0,0))\n", "\n", "contrastives = explainer.contrastive_reason(n=Explaining.ALL)\n", 
"print(\"Contrastives:\", contrastives)\n", "for contrastive in contrastives:\n", " assert explainer.is_contrastive_reason(contrastive), \"This is have to be a contrastive reason !\"" ] }, { "cell_type": "markdown", "id": "c75f8563", "metadata": {}, "source": [ "## Example from a Real Dataset" ] }, { "cell_type": "markdown", "id": "ed0ed888", "metadata": {}, "source": [ "For this example, we take the [compas](/assets/notebooks/dataset/compas.csv) dataset. We create a model using the hold-out approach (by default, the test size is set to 30%) and select a well-classified instance. " ] }, { "cell_type": "code", "execution_count": 3, "id": "bbeb5462", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------- Information ---------------\n", "Problem type: classification\n", "Instances type: tabular\n", "Labels type: classes\n", "\n", "Dataset path: ../../../dataset/compas.csv\n", "nFeatures (nAttributes, with the labels): 11\n", "nInstances (nObservations): 6172\n", "nLabels: 2\n", "--------------- Model creation, fitting and evaluation ---------------\n", "Splitting method: hold-out\n", "Problem type: classification\n", "Models type: decision-tree\n", "model_parameters: {}\n", "--------- Evaluation Information ---------\n", "For the evaluation number 0:\n", "Metrics:\n", " sklearn_confusion_matrix: [[649, 202], [304, 388]]\n", " precision: 65.76271186440678\n", " recall: 56.06936416184971\n", " f1_score: 60.53042121684868\n", " specificity: 76.26321974148061\n", " true_positive: 388\n", " true_negative: 649\n", " false_positive: 202\n", " false_negative: 304\n", " accuracy: 67.20674011665587\n", "Number of Training instances: 4629\n", "Number of Testing instances: 1543\n", "\n", "--------------- Explainer ----------------\n", "For the split number 0:\n", "**Decision Tree Model**\n", "nFeatures: 11\n", "nNodes: 584\n", "nVariables: 53\n", "\n", "--------------- Instances ----------------\n", "Number of instances selected: 1\n", 
"----------------------------------------------\n" ] } ], "source": [ "from pyxai import Learning, Explaining\n", "\n", "learner = Learning.Scikitlearn(\"../../../dataset/compas.csv\", problem_type='classification')\n", "model = learner.evaluate(splitting_method=Learning.HOLD_OUT, model_type=Learning.DT)\n", "instance, prediction = learner.get_instances(model, n=1, is_correct=True)" ] }, { "cell_type": "markdown", "id": "0bc4b271", "metadata": {}, "source": [ "We compute all the contrastives reasons for this instance: " ] }, { "cell_type": "code", "execution_count": 4, "id": "c2a2d7f1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "instance: Misdemeanor 0\n", "Number_of_Priors 0\n", "score_factor 0\n", "Age_Above_FourtyFive 1\n", "Age_Below_TwentyFive 0\n", "African_American 0\n", "Asian 0\n", "Hispanic 0\n", "Native_American 0\n", "Other 1\n", "Female 0\n", "Name: 0, dtype: int64\n", "prediction: 0\n", "\n", "number of constractive reasons: 14\n", "all reasons are indeed contrastives.\n" ] } ], "source": [ "explainer = Explaining.initialize(model, instance)\n", "print(\"instance:\", instance)\n", "print(\"prediction:\", prediction)\n", "print()\n", "\n", "constractive_reasons = explainer.contrastive_reason(n=Explaining.ALL)\n", "print(\"number of constractive reasons:\", len(constractive_reasons))\n", "\n", "all_are_contrastive = True\n", "for contrastive in constractive_reasons:\n", " if not explainer.is_contrastive_reason(contrastive):\n", " print(f\"{contrastive} is not a contrastive reason.\")\n", " all_are_contrastive = False\n", "\n", "if all_are_contrastive: print(\"all reasons are indeed contrastives.\")" ] }, { "cell_type": "markdown", "id": "318185c1", "metadata": {}, "source": [ "Other types of explanations are presented in the [Explanations Computation](/documentation/explanations/DTexplanations/) page." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.7" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 5 }