{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "b1db8c9a",
   "metadata": {},
   "source": [
    "# Sufficient Reasons"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "514a5144",
   "metadata": {},
   "source": [
    "Let $f$ be a Boolean function represented by a decision tree $T$, let $x$ be an instance and let $p$ be the prediction of $T$ on $x$ ($f(x) = p$). A **sufficient reason** for $x$ is a term of the binary representation of the instance that covers $x$ and is a prime implicant of $f$ (when $p = 1$) or of $\\neg f$ (when $p = 0$).\n",
    "\n",
    "In other words, a **sufficient reason** for an instance $x$ given a class described by a Boolean function $f$ is a subset $t$ of the characteristics of $x$ that is minimal w.r.t. set inclusion and such that any instance $x'$ sharing this set $t$ of characteristics is classified by $f$ as $x$ is.\n",
    "\n",
    "The function ```ExplainerDT.sufficient_reason``` allows computing this kind of explanation.\n",
    "\n",
    "The library provides a way to check that a reason is sufficient using the function ```is_sufficient_reason```."
   ]
  },
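  {
   "cell_type": "markdown",
   "id": "sr-formal-def",
   "metadata": {},
   "source": [
    "The definition above can be written compactly as follows. The notation $t_x$ (the set of literals of the binary representation satisfied by $x$) is introduced here only for illustration and is not PyXAI notation. A term $t \\subseteq t_x$ is a sufficient reason for $x$ when\n",
    "\n",
    "$$\\forall x' : \\; t \\subseteq t_{x'} \\Rightarrow f(x') = f(x) \\qquad \\text{and} \\qquad \\forall t' \\subsetneq t, \\; \\exists x' : \\; t' \\subseteq t_{x'} \\text{ and } f(x') \\neq f(x).$$\n",
    "\n",
    "The first condition states that $t$ alone forces the prediction; the second states that dropping any literal from $t$ breaks this guarantee."
   ]
  },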
  {
   "cell_type": "markdown",
   "id": "6119a2c9-7ff8-4d6f-816e-cf85251fb753",
   "metadata": {},
   "source": [
    "### Minimal Sufficient Reason"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b326e678",
   "metadata": {},
   "source": [
    "A sufficient reason is minimal w.r.t. set inclusion, i.e. no proper subset of this reason is also a sufficient reason. A **minimal sufficient reason** for $x$ is a sufficient reason for $x$ that\n",
    "contains a minimal number of literals. In other words, a **minimal sufficient reason** has minimal size among the sufficient reasons for $x$.\n",
    "\n",
    "The function ```ExplainerDT.minimal_sufficient_reason``` allows computing this kind of explanation."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0b280433-c872-4012-9d3b-56b311f025ca",
   "metadata": {},
   "source": [
    "### Preferences over Sufficient Reasons"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f38f0ac8",
   "metadata": {},
   "source": [
    "One can also compute preferred sufficient reasons. Indeed, the user may prefer reasons containing some features and can provide weights to discriminate between features. Please take a look at the [Preferences](/documentation/explainer/preferences/) page for more information.\n",
    "\n",
    "The function ```preferred_sufficient_reason``` allows computing this kind of explanation."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8d9c8f14-fdaa-435b-8c0d-a3b44fed5247",
   "metadata": {},
   "source": [
    "### Other methods"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c708202d",
   "metadata": {},
   "source": [
    "Recall that the literals of a binary representation represent the conditions \"\\<id_feature\\> \\<operator\\> \\<threshold\\> ?\" (such as \"$x_4 \\ge 0.5$ ?\") implied by an instance. A literal $l$ of a binary representation is a **necessary feature** for $x$ if and only if $l$ belongs to every sufficient reason $t$ for $x$. In contrast, a literal $l$ of a binary representation is a **relevant feature** for $x$ if and only if $l$ belongs to at least one sufficient reason $t$ for $x$.\n",
    "\n",
    "PyXAI provides methods to compute them:\n",
    "\n",
    " - ```necessary_literals```.\n",
    " - ```relevant_literals```."
   ]
  },
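  {
   "cell_type": "markdown",
   "id": "sr-nec-rel-sets",
   "metadata": {},
   "source": [
    "Equivalently, writing $SR(x)$ for the set of all sufficient reasons for $x$ (a notation assumed here for illustration, together with $\\mathrm{Nec}$ and $\\mathrm{Rel}$), the necessary and relevant literals are the intersection and the union of the sufficient reasons:\n",
    "\n",
    "$$\\mathrm{Nec}(x) = \\bigcap_{t \\in SR(x)} t \\qquad \\mathrm{Rel}(x) = \\bigcup_{t \\in SR(x)} t$$\n",
    "\n",
    "so that $\\mathrm{Nec}(x) \\subseteq t \\subseteq \\mathrm{Rel}(x)$ for every $t \\in SR(x)$."
   ]
  },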
  {
   "cell_type": "markdown",
   "id": "6d4deafe",
   "metadata": {},
   "source": [
    "For a given instance, it can be interesting to compute the number of sufficient reasons or the number of sufficient reasons per literal of the binary representation. PyXAI allows this: \n",
    "\n",
    " - ```n_sufficient_reasons```.\n",
    " - ```n_sufficient_reasons_per_attribute```."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "273580b5",
   "metadata": {},
   "source": [
    "More information about sufficient reasons and minimal sufficient reasons can be found in the paper [On the Explanatory Power of Decision Trees](https://arxiv.org/abs/2108.05266).\n",
    "The basic methods ([``initialize``](/documentation/api/modules/explaining/), ```set_instance```, ```to_features```, ```is_reason```, ...) of the ```Explainer``` module used in the next examples are described in the [Explainer Principles](/documentation/explainer/) page."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "869cba3c",
   "metadata": {},
   "source": [
    "## Example from a Hand-Crafted Tree"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6a557012",
   "metadata": {},
   "source": [
    "For this example, we take the Decision Tree of the [Building Models](/documentation/learning/builder/DTbuilder/) page consisting of $4$ binary features ($x_1$, $x_2$, $x_3$ and $x_4$). \n",
    "\n",
    "The following figure shows in red and bold a minimal sufficient reason $(x_1, x_4)$ for the instance $(1,1,1,1)$. \n",
    "<img src=\"attachment:DTsufficient1.png\" alt=\"DTbuilder\" width=\"600\" />\n",
    "\n",
    "The next figure gives in blue and bold a minimal sufficient reason $(-x_4)$ for the instance $(0,0,0,0)$. \n",
    "<img src=\"attachment:DTsufficient2.png\" alt=\"DTbuilder\" width=\"600\" />\n",
    "\n",
    " We now show how to get those reasons with PyXAI. We start by building the decision tree: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "745fbf2c",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyxai import Builder, Explaining\n",
    "\n",
    "node_x4_1 = Builder.DecisionNode(4, left=0, right=1)\n",
    "node_x4_2 = Builder.DecisionNode(4, left=0, right=1)\n",
    "node_x4_3 = Builder.DecisionNode(4, left=0, right=1)\n",
    "node_x4_4 = Builder.DecisionNode(4, left=0, right=1)\n",
    "node_x4_5 = Builder.DecisionNode(4, left=0, right=1)\n",
    "\n",
    "node_x3_1 = Builder.DecisionNode(3, left=0, right=node_x4_1)\n",
    "node_x3_2 = Builder.DecisionNode(3, left=node_x4_2, right=node_x4_3)\n",
    "node_x3_3 = Builder.DecisionNode(3, left=node_x4_4, right=node_x4_5)\n",
    "\n",
    "node_x2_1 = Builder.DecisionNode(2, left=0, right=node_x3_1)\n",
    "node_x2_2 = Builder.DecisionNode(2, left=node_x3_2, right=node_x3_3)\n",
    "\n",
    "node_x1_1 = Builder.DecisionNode(1, left=node_x2_1, right=node_x2_2)\n",
    "\n",
    "tree = Builder.DecisionTree(4, node_x1_1, force_features_equal_to_binaries=True)"
   ]
  },
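  {
   "cell_type": "markdown",
   "id": "sr-tree-dnf",
   "metadata": {},
   "source": [
    "As a hand check, the tree built above can be read branch by branch. Assuming that `left` is the child followed when the feature equals $0$ and `right` the child followed when it equals $1$ (an assumption that is consistent with the reasons shown in the figures), the paths leading to the leaf $1$ give, after simplification,\n",
    "\n",
    "$$f = (x_1 \\wedge x_4) \\vee (\\neg x_1 \\wedge x_2 \\wedge x_3 \\wedge x_4) \\equiv (x_1 \\wedge x_4) \\vee (x_2 \\wedge x_3 \\wedge x_4)$$\n",
    "\n",
    "$$\\neg f \\equiv \\neg x_4 \\vee (\\neg x_1 \\wedge \\neg x_2) \\vee (\\neg x_1 \\wedge \\neg x_3)$$\n",
    "\n",
    "The instance $(1,1,1,1)$ is classified $1$ and is covered by both prime implicants of $f$, namely $x_1 \\wedge x_4$ and $x_2 \\wedge x_3 \\wedge x_4$. The instance $(0,0,0,0)$ is classified $0$ and is covered by the three prime implicants of $\\neg f$."
   ]
  },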
  {
   "cell_type": "markdown",
   "id": "bad9b535",
   "metadata": {},
   "source": [
    "And we compute the sufficient reasons for each of these two instances: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "0f5c98bf",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "sufficient_reasons: ((1, 4), (2, 3, 4))\n",
      "to_features: ['f1 >= 0.5', 'f4 >= 0.5']\n",
      "to_features: ['f2 >= 0.5', 'f3 >= 0.5', 'f4 >= 0.5']\n",
      "[['v', '1', '-2', '-3', '']]\n",
      "minimal_sufficient_reason: (1, 4)\n",
      "-------------------------------\n",
      "sufficient_reasons: ((-4,), (-1, -2), (-1, -3))\n",
      "to_features: ['f4 < 0.5']\n",
      "to_features: ['f1 < 0.5', 'f2 < 0.5']\n",
      "to_features: ['f1 < 0.5', 'f3 < 0.5']\n",
      "[['v', '-1', '2', '-3', '-4', '5', '-6', '-7', '']]\n",
      "minimal_sufficient_reasons: (-4,)\n"
     ]
    }
   ],
   "source": [
    "explainer = Explaining.initialize(tree)\n",
    "explainer.set_instance((1,1,1,1))\n",
    "\n",
    "# Enumerate all the sufficient reasons for the instance (1,1,1,1)\n",
    "sufficient_reasons = explainer.sufficient_reason(n=Explaining.ALL)\n",
    "print(\"sufficient_reasons:\", sufficient_reasons)\n",
    "assert sufficient_reasons == ((1, 4), (2, 3, 4)), \"Unexpected sufficient reasons!\"\n",
    "\n",
    "# Display each reason as conditions on features and check that it is sufficient\n",
    "for sufficient in sufficient_reasons:\n",
    "    print(\"to_features:\", explainer.to_features(sufficient))\n",
    "    assert explainer.is_sufficient_reason(sufficient), \"This has to be a sufficient reason!\"\n",
    "\n",
    "minimals = explainer.minimal_sufficient_reason()\n",
    "print(\"minimal_sufficient_reason:\", minimals)\n",
    "assert minimals == (1, 4), \"Unexpected minimal sufficient reason!\"\n",
    "\n",
    "print(\"-------------------------------\")\n",
    "\n",
    "# Same computations for the instance (0,0,0,0)\n",
    "explainer.set_instance((0,0,0,0))\n",
    "\n",
    "sufficient_reasons = explainer.sufficient_reason(n=Explaining.ALL)\n",
    "print(\"sufficient_reasons:\", sufficient_reasons)\n",
    "assert sufficient_reasons == ((-4,), (-1, -2), (-1, -3)), \"Unexpected sufficient reasons!\"\n",
    "\n",
    "for sufficient in sufficient_reasons:\n",
    "    print(\"to_features:\", explainer.to_features(sufficient))\n",
    "    assert explainer.is_sufficient_reason(sufficient), \"This has to be a sufficient reason!\"\n",
    "\n",
    "minimals = explainer.minimal_sufficient_reason(n=1)\n",
    "print(\"minimal_sufficient_reasons:\", minimals)\n",
    "assert minimals == (-4,), \"Unexpected minimal sufficient reason!\""
   ]
  },
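  {
   "cell_type": "markdown",
   "id": "sr-worked-check",
   "metadata": {},
   "source": [
    "The notions of necessary and relevant literals can be checked by hand on the sufficient reasons printed above (a worked illustration derived from these outputs, not an additional PyXAI call). For $(1,1,1,1)$, the sufficient reasons are $(1, 4)$ and $(2, 3, 4)$:\n",
    "\n",
    "$$\\{1,4\\} \\cap \\{2,3,4\\} = \\{4\\} \\qquad \\{1,4\\} \\cup \\{2,3,4\\} = \\{1,2,3,4\\}$$\n",
    "\n",
    "so literal $4$ is the only necessary literal, all four literals are relevant, and literal $4$ occurs in $2$ sufficient reasons while literals $1$, $2$ and $3$ occur in $1$ each. For $(0,0,0,0)$, the sufficient reasons are $(-4)$, $(-1, -2)$ and $(-1, -3)$:\n",
    "\n",
    "$$\\{-4\\} \\cap \\{-1,-2\\} \\cap \\{-1,-3\\} = \\emptyset \\qquad \\{-4\\} \\cup \\{-1,-2\\} \\cup \\{-1,-3\\} = \\{-1,-2,-3,-4\\}$$\n",
    "\n",
    "so no literal is necessary, every literal is relevant, and literal $-1$ occurs in $2$ sufficient reasons while the others occur in $1$ each."
   ]
  },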
  {
   "cell_type": "markdown",
   "id": "e0420183",
   "metadata": {},
   "source": [
    "## Example from a Real Dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "03c8f44e",
   "metadata": {},
   "source": [
    "For this example, we take the ```compas.csv``` dataset. We create a model using the hold-out approach (by default, the test size is set to 30%) and select a well-classified instance. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "5a1c9c9b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--------------   Information   ---------------\n",
      "Problem type: classification\n",
      "Instances type: tabular\n",
      "Labels type: classes\n",
      "\n",
      "Dataset path: ../../../dataset/compas.csv\n",
      "nFeatures (nAttributes, with the labels): 11\n",
      "nInstances (nObservations): 6172\n",
      "nLabels: 2\n",
      "---------------   Model creation, fitting and evaluation  ---------------\n",
      "Splitting method: hold-out\n",
      "Problem type: classification\n",
      "Models type: decision-tree\n",
      "model_parameters: {}\n",
      "---------   Evaluation Information   ---------\n",
      "For the evaluation number 0:\n",
      "Metrics:\n",
      "   sklearn_confusion_matrix: [[631, 212], [326, 374]]\n",
      "   precision: 63.82252559726962\n",
      "   recall: 53.42857142857142\n",
      "   f1_score: 58.16485225505443\n",
      "   specificity: 74.85172004744959\n",
      "   true_positive: 374\n",
      "   true_negative: 631\n",
      "   false_positive: 212\n",
      "   false_negative: 326\n",
      "   accuracy: 65.13285806869735\n",
      "Number of Training instances: 4629\n",
      "Number of Testing instances: 1543\n",
      "\n",
      "---------------   Explainer   ----------------\n",
      "For the split number 0:\n",
      "**Decision Tree Model**\n",
      "nFeatures: 11\n",
      "nNodes: 574\n",
      "nVariables: 48\n",
      "\n",
      "---------------   Instances   ----------------\n",
      "Number of instances selected: 1\n",
      "----------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "from pyxai import Learning, Explaining\n",
    "\n",
    "learner = Learning.Scikitlearn(\"../../../dataset/compas.csv\", problem_type='classification')\n",
    "model = learner.evaluate(splitting_method=Learning.HOLD_OUT, model_type=Learning.DT)\n",
    "instance, prediction = learner.get_instances(model, n=1, is_correct=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4cacbab0",
   "metadata": {},
   "source": [
    "And we compute a sufficient reason for this instance: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "b7691f19",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "instance: Misdemeanor             0\n",
      "Number_of_Priors        0\n",
      "score_factor            0\n",
      "Age_Above_FourtyFive    1\n",
      "Age_Below_TwentyFive    0\n",
      "African_American        0\n",
      "Asian                   0\n",
      "Hispanic                0\n",
      "Native_American         0\n",
      "Other                   1\n",
      "Female                  0\n",
      "Name: 0, dtype: int64\n",
      "prediction: 0\n",
      "\n",
      "\n",
      "sufficient reason: 4\n",
      "to features ['Misdemeanor <= 0.5', 'Number_of_Priors <= 0.5', 'score_factor <= 0.5']\n",
      "is sufficient_reason (for max 50 checks):  True\n",
      "\n",
      "[['v', '1', '-2', '-3', '4', '5', '-6', '-7', '-8', '-9', '-10', '-11', '-12', '-13', '-14', '15', '-16', '-17', '-18', '-19', '-20', '21', '22', '23', '24', '25', '-26', '-27', '-28', '-29', '-30', '-31', '-32', '-33', '-34', '-35', '-36', '-37', '-38', '-39', '']]\n",
      "\n",
      "minimal: 4\n",
      "is sufficient_reason (for max 50 checks):  True\n",
      "\n",
      "\n",
      "necessary literals:  [-1]\n",
      "\n",
      "necessary literals features:  ['score_factor <= 0.5']\n",
      "\n",
      "relevant literals:  [-5, -6, -3, -11, -2, 4, -18, -13, 7, -8, -9, -12, -15, -31]\n",
      "\n",
      "n sufficient reasons: 15\n",
      "\n",
      "sufficient_reasons_per_attribute: {-1: 15, -5: 8, -6: 7, -3: 12, -11: 5, -2: 5, 4: 10, -18: 10, -13: 10, 7: 5, -8: 7, -9: 5, -12: 4, -15: 1, -31: 1}\n",
      "\n",
      "sufficient_reasons_per_attribute features: OrderedDict({'Misdemeanor': [{'id': 1, 'name': 'Misdemeanor', 'operator_sign_considered': <OperatorCondition.LE: 'LE'>, 'threshold': np.float64(0.5), 'weight': 5, 'string': 'Misdemeanor <= 0.5'}], 'Number_of_Priors': [{'id': 2, 'name': 'Number_of_Priors', 'operator_sign_considered': <OperatorCondition.LE: 'LE'>, 'threshold': np.float64(0.5), 'weight': 8, 'string': 'Number_of_Priors <= 0.5'}], 'score_factor': [{'id': 3, 'name': 'score_factor', 'operator_sign_considered': <OperatorCondition.LE: 'LE'>, 'threshold': np.float64(0.5), 'weight': 15, 'string': 'score_factor <= 0.5'}], 'Age_Above_FourtyFive': [{'id': 4, 'name': 'Age_Above_FourtyFive', 'operator_sign_considered': <OperatorCondition.GT: 'GT'>, 'threshold': np.float64(0.5), 'weight': 10, 'string': 'Age_Above_FourtyFive > 0.5'}], 'Age_Below_TwentyFive': [{'id': 5, 'name': 'Age_Below_TwentyFive', 'operator_sign_considered': <OperatorCondition.LE: 'LE'>, 'threshold': np.float64(0.5), 'weight': 12, 'string': 'Age_Below_TwentyFive <= 0.5'}], 'African_American': [{'id': 6, 'name': 'African_American', 'operator_sign_considered': <OperatorCondition.LE: 'LE'>, 'threshold': np.float64(0.5), 'weight': 4, 'string': 'African_American <= 0.5'}], 'Asian': [{'id': 7, 'name': 'Asian', 'operator_sign_considered': <OperatorCondition.LE: 'LE'>, 'threshold': np.float64(0.5), 'weight': 7, 'string': 'Asian <= 0.5'}], 'Hispanic': [{'id': 8, 'name': 'Hispanic', 'operator_sign_considered': <OperatorCondition.LE: 'LE'>, 'threshold': np.float64(0.5), 'weight': 7, 'string': 'Hispanic <= 0.5'}], 'Other': [{'id': 10, 'name': 'Other', 'operator_sign_considered': <OperatorCondition.GT: 'GT'>, 'threshold': np.float64(0.5), 'weight': 5, 'string': 'Other > 0.5'}], 'Female': [{'id': 11, 'name': 'Female', 'operator_sign_considered': <OperatorCondition.LE: 'LE'>, 'threshold': np.float64(0.5), 'weight': 5, 'string': 'Female <= 0.5'}]})\n"
     ]
    }
   ],
   "source": [
    "explainer = Explaining.initialize(model, instance)\n",
    "print(\"instance:\", instance)\n",
    "print(\"prediction:\", prediction)\n",
    "print()\n",
    "sufficient_reason = explainer.sufficient_reason(n=1)\n",
    "print(\"\\nsufficient reason:\", len(sufficient_reason))\n",
    "print(\"to features\", explainer.to_features(sufficient_reason))\n",
    "print(\"is sufficient_reason (for max 50 checks): \", explainer.is_sufficient_reason(sufficient_reason, n_samples=50))\n",
    "print()\n",
    "minimal = explainer.minimal_sufficient_reason()\n",
    "print(\"\\nminimal:\", len(minimal))\n",
    "print(\"is sufficient_reason (for max 50 checks): \", explainer.is_sufficient_reason(minimal, n_samples=50))\n",
    "print()\n",
    "print(\"\\nnecessary literals: \", explainer.necessary_literals())\n",
    "print(\"\\nnecessary literals features: \", explainer.to_features(explainer.necessary_literals()))\n",
    "print(\"\\nrelevant literals: \", explainer.relevant_literals())\n",
    "print()\n",
    "print(\"n sufficient reasons:\", explainer.n_sufficient_reasons())\n",
    "sufficient_reasons_per_attribute = explainer.n_sufficient_reasons_per_attribute()\n",
    "print(\"\\nsufficient_reasons_per_attribute:\", sufficient_reasons_per_attribute)\n",
    "print(\"\\nsufficient_reasons_per_attribute features:\", explainer.to_features(sufficient_reasons_per_attribute, details=True))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14fa4d1d",
   "metadata": {},
   "source": [
    "Other types of explanations are presented in the [Explanations Computation](/documentation/explanations/DTexplanations/) page."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "552c7f56-8c63-4f41-8b3a-dd59034cbc51",
   "metadata": {},
   "source": [
    "## See Also\n",
    " - API: ```Builder```, ```ExplainerDT```, ```Learner```."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.7"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}