{ "cells": [ { "cell_type": "markdown", "id": "5d829e16", "metadata": {}, "source": [ "## Tree-Specific Explanations" ] }, { "cell_type": "markdown", "id": "2aa7c930", "metadata": {}, "source": [ "Let $BT$ be a Boosted Tree composed of {$T_1,\\ldots, T_n$} regression trees and $x$ be an instance.\n", "* A **worst instance** extending $t$ given $BT$ is an instance $x'$ such that $t \\subseteq t_{x'}$ and: $$x' = \\mathit{argmin}_{x'': t \\subseteq t_{x''}}(\\{w(BT, x'')\\})$$\n", "* A **best instance** extending $t$ given $BT$ is an instance $x'$ such that $t \\subseteq t_{x'}$ and: $$x' = \\mathit{argmax}_{x'': t \\subseteq t_{x''}}(\\{w(BT, x'')\\})$$\n", "\n", "$W(t, BT)$ (resp. $B(t, BT)$) denotes the set of worst (resp. best) instances covered by $t$ given $BT$, and $w_\\downarrow(t, BT)$ (resp. $w_\\uparrow(t, BT)$) denotes the weight of the worst (resp. best) instance covered by $t$ given $BT$." ] }, { "cell_type": "markdown", "id": "f48c34f8", "metadata": {}, "source": [ "In the multi-class setting, TS-explanations are defined on the page dedicated to [Tree-Specific Explanations for Classification](/documentation/classification/BTexplanations/treespecific/).\n", "\n", "For the regression case, TS-explanations can be defined as follows:\n", "\n", "Let $BT$ be a Boosted Tree composed of {$T_1,\\ldots, T_n$} regression trees and $x$ be an instance such that $BT(x) = r$. Let $I=[a,b]$ be an interval containing $r$. \n", "\n", "{: .note}\n", "> To obtain explanations that better match practical needs, we do not explain why $BT(x) = r$ but why $BT(x) \\in [a,b]$. Note that explaining why $BT(x) = r$ corresponds to computing the [direct reason](/documentation/regression/BTregression/direct/) for $x$. 
\n", "\n", "\n", "Conceptually, $t$ is a **tree-specific** explanation for $x$ given $BT$ and an interval $I=[a,b]$ if and only if $t$ is a subset of $t_{x}$ such that:\n", "- $w_\\downarrow(t, BT) \\geq a$ \n", "- $w_\\uparrow(t, BT) \\leq b$ \n", "- no proper subset of $t$ satisfies the two previous conditions.\n" ] }, { "cell_type": "markdown", "id": "424b7b69", "metadata": {}, "source": [ "In general, the notions of tree-specific explanation and of sufficient reason do not coincide. Indeed, a sufficient reason is a prime implicant (covering $x$) of the Boosted Tree $BT$, while a tree-specific explanation is an implicant $t$ (covering $x$). Since there is a simple, linear-time algorithm for computing $w_\\downarrow(t, BT)$ and $w_\\uparrow(t, BT)$ and for deriving a worst-case and a best-case instance, the **tree-specific** explanations for $x$ are much easier to compute than the sufficient reasons, and they remain abductive.\n", "\n", "More information about tree-specific explanations for the classification task can be found in the paper [Computing Abductive Explanations for Boosted Trees](https://arxiv.org/abs/2209.07740). For the regression task, more information can be found in the paper [Computing Abductive Explanations for Boosted Regression Trees](https://www.ijcai.org/proceedings/2023/382)." ] }, { "cell_type": "markdown", "id": "a6a70125", "metadata": {}, "source": [ "| <ExplainerRegressionBT Object>.tree_specific_reason(*, n_iterations=50, time_limit=None, seed=0): | \n", "| :----------- | \n", "|This method calls a greedy algorithm to compute a tree-specific explanation. The algorithm runs n_iterations times and returns the smallest tree-specific explanation that has been computed.

Excluded features are supported. The reasons are expressed as binary variables; use the ```to_features``` method to obtain a representation based on the original features. |\n", "| time_limit ```Integer``` ```None```: The time limit of the method in seconds. Set this to ```None``` to run the method without a time limit. Default value is ```None```.|\n", "| n_iterations ```Integer```: The number of iterations done by the greedy algorithm. Default value is 50.|\n", "| seed ```Integer```: The seed used by the greedy algorithm. Set this parameter to 0 to use a random seed. Default value is 0.|" ] }, { "cell_type": "markdown", "id": "acf73429", "metadata": {}, "source": [ "To use the ```tree_specific_reason``` method, you must first call the method ```set_interval``` to define an interval." ] }, { "cell_type": "markdown", "id": "d0db7896", "metadata": {}, "source": [ "| <ExplainerRegressionBT Object>.set_interval(lower_bound, upper_bound): |\n", "|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| \n", "| Set the interval used to compute the explanation. |\n", "| lower_bound ```Float```: the lower bound of the interval. |\n", "| upper_bound ```Float```: the upper bound of the interval. |\n" ] }, { "cell_type": "markdown", "id": "d2abdffd", "metadata": {}, "source": [ "We can check a tree-specific explanation using: \n", "\n", "| <ExplainerRegressionBT Object>.is_tree_specific_reason(reason): | \n", "| :----------- | \n", "| This method checks whether an explanation is a tree-specific one. To do so, it first calls the method ```is_implicant``` to check whether this explanation leads to the correct prediction. 
|\n", "| reason ```List``` of ```Integer```: The explanation to be checked.|" ] }, { "cell_type": "markdown", "id": "dc483010", "metadata": {}, "source": [ "## Example from Hand-Crafted Trees" ] }, { "cell_type": "markdown", "id": "6f86e05b", "metadata": {}, "source": [ "To illustrate the generation of a tree-specific explanation, we take an example from the [Computing Abductive Explanations for Boosted Regression Trees](https://www.ijcai.org/proceedings/2023/382) paper.\n", "\n", "Let us consider a loan application scenario. The goal is to predict\n", "the amount of money that can be granted to an applicant described using three attributes ($A = \\{A_1, A_2, A_3\\}$). \n", "- $A_1$ is a numerical attribute giving the income per month of the applicant\n", "- $A_2$ is a categorical feature giving its employment status as ”employed”, ”unemployed” or ”self-employed”\n", "- $A_3$ is a Boolean feature set to true if the customer is married, false otherwise. \n", "\n", "\"BTbase\"\n", "\n", "In this example:\n", "\n", "- $A_1$ is represented by the feature identifier $F_1$\n", "- $A_2$ has been one-hot encoded and is represented by feature identifiers $F_2$, $F_3$ and $F_4$, each of these features represents respectively the conditions $A_2 = employed$, $A_2 = unemployed$ and $A_2 = self-employed$\n", "- $A_3$ is represented by the feature identifier $F_5$ and the condition $(A_3 = 1)$ (”the applicant is married”)\n", "\n", "We consider the instance $x=(2200, 0, 0, 1, 1)$, corresponding to a person with a salary equal to 2200 per month, self-employed (one-hot encoded) and married. Then, $BT(x) = 1500 + 250 + 250 = 2000\\$.\n", "\n", "A tree-specific explanation for the instance $x = (2200, 0, 0, 1, 1)$ and $I=[1500, 2500]$ can be represented by $t = \\{A_1{>}2000\\}$. Indeed, we have $w_\\downarrow(t, BT) \\geq 1500$ and $w_\\uparrow(t, BT) \\leq 2500$. 
\n", "\n", "The next figure represents the tree-specific reason $t = \\{A_1{>}2000\\}$ in red and the dark red leaves give the weights of $w_\\downarrow(t, BT)$ for each regression tree of $BT$. We can see that $w_\\downarrow(t, BT) = 1500 - 100 + 250 = 1650 \\geq 1500$. \n", "\n", "\"BTTS1\"\n", "\n", "The next figure represents the tree-specific explanation $t = \\{A_1{>}2000\\}$ in blue and the dark blue leaves give the weights of $w_\\uparrow(t, BT)$ for each regression tree of $BT$. We can observe that $w_\\uparrow(t, BT) = 1750 + 250 + 100 = 2100 \\leq 2500$. \n", "\n", "\"BTTS2\"\n", "\n", "\n", "We now show how to get those tree-specific explanations using PyXAI: " ] }, { "cell_type": "code", "execution_count": 1, "id": "3667447b", "metadata": {}, "outputs": [], "source": [ "from pyxai import Builder, Explainer, Learning\n", "\n", "node1_1 = Builder.DecisionNode(1, operator=Builder.GT, threshold=3000, left=1500, right=1750)\n", "node1_2 = Builder.DecisionNode(1, operator=Builder.GT, threshold=2000, left=1000, right=node1_1)\n", "node1_3 = Builder.DecisionNode(1, operator=Builder.GT, threshold=1000, left=0, right=node1_2)\n", "tree1 = Builder.DecisionTree(5, node1_3)\n", "\n", "\n", "node2_1 = Builder.DecisionNode(5, operator=Builder.EQ, threshold=1, left=100, right=250)\n", "node2_2 = Builder.DecisionNode(4, operator=Builder.EQ, threshold=1, left=-100, right=node2_1)\n", "node2_3 = Builder.DecisionNode(2, operator=Builder.EQ, threshold=1, left=node2_2, right=250)\n", "tree2 = Builder.DecisionTree(5, node2_3)\n", "\n", "node3_1 = Builder.DecisionNode(3, operator=Builder.EQ, threshold=1, left=500, right=250)\n", "node3_2 = Builder.DecisionNode(3, operator=Builder.EQ, threshold=1, left=250, right=100)\n", "node3_3 = Builder.DecisionNode(1, operator=Builder.GT, threshold=2000, left=0, right=node3_1)\n", "node3_4 = Builder.DecisionNode(4, operator=Builder.EQ, threshold=1, left=node3_3, right=node3_2)\n", "tree3 = Builder.DecisionTree(5, node3_4)\n", "\n", "\n", 
"BTs = Builder.BoostedTreesRegression([tree1, tree2, tree3])" ] }, { "cell_type": "markdown", "id": "9e07d5a6", "metadata": {}, "source": [ "This example can be found in the second part of the [Building Boosted Trees](/documentation/learning/builder/BTbuilder/) page. " ] }, { "cell_type": "code", "execution_count": 2, "id": "84b411c1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "instance: (2200, 0, 0, 1, 1)\n", "--------- Theory Feature Types -----------\n", "Before the encoding (without one hot encoded features), we have:\n", "Numerical features: 1\n", "Categorical features: 1\n", "Binary features: 1\n", "Number of features: 3\n", "Values of categorical features: {'f2': ['f{2,3,4}', 1, (1, 2, 3)], 'f3': ['f{2,3,4}', 2, (1, 2, 3)], 'f4': ['f{2,3,4}', 3, (1, 2, 3)]}\n", "\n", "Number of used features in the model (before the encoding): 3\n", "Number of used features in the model (after the encoding): 5\n", "----------------------------------------------\n", "prediction: 2000\n", "tree specific: ('f1 > 2000',)\n", "is tree : True\n" ] } ], "source": [ "instance = (2200, 0, 0, 1, 1) # 2200$, self employed (one hot encoded), married\n", "print(\"instance:\", instance)\n", "\n", "loan_types = {\n", " \"numerical\": Learning.DEFAULT,\n", " \"categorical\": {\"f{2,3,4}\": (1, 2, 3)},\n", " \"binary\": [\"f5\"],\n", "}\n", "\n", "explainer = Explainer.initialize(BTs, instance, features_type=loan_types)\n", "\n", "print(\"prediction:\", explainer.predict(instance))\n", "explainer.set_interval(1500, 2500)\n", "\n", "tree_specific = explainer.tree_specific_reason()\n", "print(\"tree specific:\", explainer.to_features(tree_specific))\n", "print(\"is tree : \", explainer.is_tree_specific_reason(tree_specific))" ] }, { "cell_type": "markdown", "id": "ac340495", "metadata": {}, "source": [ "## Example from a Real Dataset" ] }, { "cell_type": "markdown", "id": "73754a55", "metadata": {}, "source": [ "For this example, we take the 
[Houses-prices](https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques) dataset (available [here](/assets/notebooks/dataset/houses-prices.csv)). We create a model using the hold-out approach (by default, the test size is set to 30%) and select an instance. As this dataset contains strings, we encode the data using PyXAI's [Preprocessor](/documentation/preprocessor/): " ] }, { "cell_type": "code", "execution_count": 3, "id": "a0cf26af", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Index(['Id', 'MSSubClass', 'LotArea', 'Street', 'LotShape', 'LandContour',\n", " 'LotConfig', 'LandSlope', 'Neighborhood', 'Condition1', 'Condition2',\n", " 'BldgType', 'HouseStyle', 'OverallQual', 'OverallCond', 'YearBuilt',\n", " 'YearRemodAdd', 'RoofStyle', 'RoofMatl', 'ExterQual', 'ExterCond',\n", " 'Foundation', 'Heating', 'HeatingQC', 'CentralAir', '1stFlrSF',\n", " '2ndFlrSF', 'LowQualFinSF', 'GrLivArea', 'FullBath', 'HalfBath',\n", " 'BedroomAbvGr', 'KitchenAbvGr', 'TotRmsAbvGrd', 'Fireplaces',\n", " 'PavedDrive', 'WoodDeckSF', 'OpenPorchSF', 'EnclosedPorch', '3SsnPorch',\n", " 'ScreenPorch', 'PoolArea', 'MiscVal', 'MoSold', 'YrSold',\n", " 'SaleCondition', 'SalePrice'],\n", " dtype='object')\n", "--------------- Converter ---------------\n", "Feature deleted: Id\n", "One hot encoding new features for MSSubClass: 16\n", "-> The feature Street is boolean! No One Hot Encoding for this features.\n", "-> However, the boolean feature Street contains strings. 
A ordinal encoding must be performed.\n", "One hot encoding new features for LotShape: 4\n", "One hot encoding new features for LandContour: 4\n", "One hot encoding new features for LotConfig: 5\n", "One hot encoding new features for LandSlope: 3\n", "One hot encoding new features for Neighborhood: 25\n", "One hot encoding new features for Condition1: 9\n", "One hot encoding new features for Condition2: 8\n", "One hot encoding new features for BldgType: 5\n", "One hot encoding new features for HouseStyle: 8\n", "One hot encoding new features for OverallQual: 10\n", "One hot encoding new features for OverallCond: 9\n", "One hot encoding new features for RoofStyle: 6\n", "One hot encoding new features for RoofMatl: 8\n", "One hot encoding new features for ExterQual: 4\n", "One hot encoding new features for ExterCond: 5\n", "One hot encoding new features for Foundation: 6\n", "One hot encoding new features for Heating: 6\n", "One hot encoding new features for HeatingQC: 5\n", "-> The feature CentralAir is boolean! No One Hot Encoding for this features.\n", "-> However, the boolean feature CentralAir contains strings. 
A ordinal encoding must be performed.\n", "One hot encoding new features for PavedDrive: 3\n", "One hot encoding new features for SaleCondition: 6\n", "Dataset saved: ../../dataset/houses-prices-converted.csv\n", "Types saved: ../../dataset/houses-prices-converted.types\n", "-----------------------------------------------\n" ] } ], "source": [ "from pyxai import Learning\n", "\n", "preprocessor = Learning.Preprocessor(\"../../dataset/houses-prices.csv\", target_feature=\"SalePrice\", learner_type=Learning.REGRESSION)\n", "\n", "preprocessor.unset_features([\"Id\"])\n", "\n", "preprocessor.set_categorical_features(columns=[\n", " \"MSSubClass\",\n", " \"Street\",\n", " \"LotShape\", \n", " \"LandContour\", \n", " \"LotConfig\", \n", " \"LandSlope\", \n", " \"Neighborhood\", \n", " \"Condition1\", \n", " \"Condition2\", \n", " \"BldgType\", \n", " \"HouseStyle\", \n", " \"OverallQual\", \n", " \"OverallCond\", \n", " \"RoofStyle\", \n", " \"RoofMatl\", \n", " \"ExterQual\", \n", " \"ExterCond\", \n", " \"Foundation\", \n", " \"Heating\", \n", " \"HeatingQC\", \n", " \"CentralAir\", \n", " \"PavedDrive\", \n", " \"SaleCondition\"])\n", "\n", "preprocessor.set_numerical_features({\n", " \"LotArea\": None,\n", " \"YearBuilt\": None, \n", " \"YearRemodAdd\": None, \n", " \"1stFlrSF\": None,\n", " \"2ndFlrSF\": None,\n", " \"LowQualFinSF\": None,\n", " \"GrLivArea\": None,\n", " \"FullBath\": None,\n", " \"HalfBath\": None,\n", " \"BedroomAbvGr\": None,\n", " \"KitchenAbvGr\": None,\n", " \"TotRmsAbvGrd\": None,\n", " \"Fireplaces\": None,\n", " \"WoodDeckSF\": None,\n", " \"OpenPorchSF\": None,\n", " \"EnclosedPorch\": None,\n", " \"3SsnPorch\": None,\n", " \"ScreenPorch\": None,\n", " \"PoolArea\": None,\n", " \"MiscVal\": None,\n", " \"MoSold\": None,\n", " \"YrSold\": None\n", " })\n", "\n", "\n", "preprocessor.process()\n", "dataset_name = \"../../dataset/houses-prices.csv\".split(\"/\")[-1].split(\".\")[0]+\"-converted\" \n", "preprocessor.export(dataset_name, 
output_directory=\"../../dataset\")" ] }, { "cell_type": "markdown", "id": "3355045d", "metadata": {}, "source": [ "```console\n", "Index(['Id', 'MSSubClass', 'LotArea', 'Street', 'LotShape', 'LandContour',\n", " 'LotConfig', 'LandSlope', 'Neighborhood', 'Condition1', 'Condition2',\n", " 'BldgType', 'HouseStyle', 'OverallQual', 'OverallCond', 'YearBuilt',\n", " 'YearRemodAdd', 'RoofStyle', 'RoofMatl', 'ExterQual', 'ExterCond',\n", " 'Foundation', 'Heating', 'HeatingQC', 'CentralAir', '1stFlrSF',\n", " '2ndFlrSF', 'LowQualFinSF', 'GrLivArea', 'FullBath', 'HalfBath',\n", " 'BedroomAbvGr', 'KitchenAbvGr', 'TotRmsAbvGrd', 'Fireplaces',\n", " 'PavedDrive', 'WoodDeckSF', 'OpenPorchSF', 'EnclosedPorch', '3SsnPorch',\n", " 'ScreenPorch', 'PoolArea', 'MiscVal', 'MoSold', 'YrSold',\n", " 'SaleCondition', 'SalePrice'],\n", " dtype='object')\n", "--------------- Converter ---------------\n", "Feature deleted: Id\n", "One hot encoding new features for MSSubClass: 16\n", "-> The feature Street is boolean! No One Hot Encoding for this features.\n", "-> However, the boolean feature Street contains strings. 
A ordinal encoding must be performed.\n", "One hot encoding new features for LotShape: 4\n", "One hot encoding new features for LandContour: 4\n", "One hot encoding new features for LotConfig: 5\n", "One hot encoding new features for LandSlope: 3\n", "One hot encoding new features for Neighborhood: 25\n", "One hot encoding new features for Condition1: 9\n", "One hot encoding new features for Condition2: 8\n", "One hot encoding new features for BldgType: 5\n", "One hot encoding new features for HouseStyle: 8\n", "One hot encoding new features for OverallQual: 10\n", "One hot encoding new features for OverallCond: 9\n", "One hot encoding new features for RoofStyle: 6\n", "One hot encoding new features for RoofMatl: 8\n", "One hot encoding new features for ExterQual: 4\n", "One hot encoding new features for ExterCond: 5\n", "One hot encoding new features for Foundation: 6\n", "One hot encoding new features for Heating: 6\n", "One hot encoding new features for HeatingQC: 5\n", "-> The feature CentralAir is boolean! No One Hot Encoding for this features.\n", "-> However, the boolean feature CentralAir contains strings. A ordinal encoding must be performed.\n", "One hot encoding new features for PavedDrive: 3\n", "One hot encoding new features for SaleCondition: 6\n", "Dataset saved: ../../dataset/houses-prices-converted_0.csv\n", "Types saved: ../../dataset/houses-prices-converted_0.types\n", "-----------------------------------------------\n", "```" ] }, { "cell_type": "markdown", "id": "95d7b88c", "metadata": {}, "source": [ "Now we produce a model and pick up an instance: " ] }, { "cell_type": "code", "execution_count": 4, "id": "a1479f1b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "data:\n", " MSSubClass_20 MSSubClass_30 MSSubClass_40 MSSubClass_45 \\\n", "0 0 0 0 0 \n", "1 1 0 0 0 \n", "2 0 0 0 0 \n", "3 0 0 0 0 \n", "4 0 0 0 0 \n", "... ... ... ... ... 
\n", "2914 0 0 0 0 \n", "2915 0 0 0 0 \n", "2916 1 0 0 0 \n", "2917 0 0 0 0 \n", "2918 0 0 0 0 \n", "\n", " MSSubClass_50 MSSubClass_60 MSSubClass_70 MSSubClass_75 \\\n", "0 0 1 0 0 \n", "1 0 0 0 0 \n", "2 0 1 0 0 \n", "3 0 0 1 0 \n", "4 0 1 0 0 \n", "... ... ... ... ... \n", "2914 0 0 0 0 \n", "2915 0 0 0 0 \n", "2916 0 0 0 0 \n", "2917 0 0 0 0 \n", "2918 0 1 0 0 \n", "\n", " MSSubClass_80 MSSubClass_85 ... MiscVal MoSold YrSold \\\n", "0 0 0 ... 0 2 2008 \n", "1 0 0 ... 0 5 2007 \n", "2 0 0 ... 0 9 2008 \n", "3 0 0 ... 0 2 2006 \n", "4 0 0 ... 0 12 2008 \n", "... ... ... ... ... ... ... \n", "2914 0 0 ... 0 6 2006 \n", "2915 0 0 ... 0 4 2006 \n", "2916 0 0 ... 0 9 2006 \n", "2917 0 1 ... 700 7 2006 \n", "2918 0 0 ... 0 11 2006 \n", "\n", " SaleCondition_Abnorml SaleCondition_AdjLand SaleCondition_Alloca \\\n", "0 0 0 0 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 1 0 0 \n", "4 0 0 0 \n", "... ... ... ... \n", "2914 0 0 0 \n", "2915 1 0 0 \n", "2916 1 0 0 \n", "2917 0 0 0 \n", "2918 0 0 0 \n", "\n", " SaleCondition_Family SaleCondition_Normal SaleCondition_Partial \\\n", "0 0 1 0 \n", "1 0 1 0 \n", "2 0 1 0 \n", "3 0 0 0 \n", "4 0 1 0 \n", "... ... ... ... \n", "2914 0 1 0 \n", "2915 0 0 0 \n", "2916 0 0 0 \n", "2917 0 1 0 \n", "2918 0 1 0 \n", "\n", " SalePrice \n", "0 208500.000000 \n", "1 181500.000000 \n", "2 223500.000000 \n", "3 140000.000000 \n", "4 250000.000000 \n", "... ... 
\n", "2914 167081.220949 \n", "2915 164788.778231 \n", "2916 219222.423400 \n", "2917 184924.279659 \n", "2918 187741.866657 \n", "\n", "[2919 rows x 180 columns]\n", "-------------- Information ---------------\n", "Dataset name: ../../dataset/houses-prices-converted_0.csv\n", "nFeatures (nAttributes, with the labels): 180\n", "nInstances (nObservations): 2919\n", "nLabels: None\n", "--------------- Evaluation ---------------\n", "method: HoldOut\n", "output: BT\n", "learner_type: Regression\n", "learner_options: {'seed': 0, 'max_depth': None}\n", "--------- Evaluation Information ---------\n", "For the evaluation number 0:\n", "metrics:\n", " mean_squared_error: 1997310553.8387074\n", " root_mean_squared_error: 44691.28051240765\n", " mean_absolute_error: 29588.51328599622\n", "nTraining instances: 2043\n", "nTest instances: 876\n", "\n", "--------------- Explainer ----------------\n", "For the evaluation number 0:\n", "**Boosted Tree model**\n", "NClasses: None\n", "nTrees: 100\n", "nVariables: 1696\n", "\n", "--------------- Instances ----------------\n", "number of instances selected: 1\n", "----------------------------------------------\n" ] } ], "source": [ "from pyxai import Learning, Explainer\n", "\n", "learner = Learning.Xgboost(\"../../dataset/houses-prices-converted_0.csv\", learner_type=Learning.REGRESSION)\n", "model = learner.evaluate(method=Learning.HOLD_OUT, output=Learning.BT)\n", "instance, prediction = learner.get_instances(model, n=1)" ] }, { "cell_type": "markdown", "id": "4d0e2ec8", "metadata": {}, "source": [ "Finally, we display a tree-specific explanation for this instance. Note that we activate the theory created by PyXAI's Preprocessor by passing the parameter ```features_type=\"../../dataset/houses-prices-converted_0.types\"``` to the ```initialize``` method. More information about theories is available on this [page](/documentation/explainer/theories/)." 
] }, { "cell_type": "code", "execution_count": 5, "id": "29a4effc", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--------- Theory Feature Types -----------\n", "Before the encoding (without one hot encoded features), we have:\n", "Numerical features: 22\n", "Categorical features: 21\n", "Binary features: 2\n", "Number of features: 45\n", "Values of categorical features: {'MSSubClass_20': ['MSSubClass', 20, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_30': ['MSSubClass', 30, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_40': ['MSSubClass', 40, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_45': ['MSSubClass', 45, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_50': ['MSSubClass', 50, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_60': ['MSSubClass', 60, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_70': ['MSSubClass', 70, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_75': ['MSSubClass', 75, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_80': ['MSSubClass', 80, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_85': ['MSSubClass', 85, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_90': ['MSSubClass', 90, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_120': ['MSSubClass', 120, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_150': ['MSSubClass', 150, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_160': ['MSSubClass', 160, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_180': ['MSSubClass', 180, [20, 30, 40, 45, 50, 60, 70, 
75, 80, 85, 90, 120, 150, 160, 180, 190]], 'MSSubClass_190': ['MSSubClass', 190, [20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190]], 'LotShape_IR1': ['LotShape', 'IR1', ['IR1', 'IR2', 'IR3', 'Reg']], 'LotShape_IR2': ['LotShape', 'IR2', ['IR1', 'IR2', 'IR3', 'Reg']], 'LotShape_IR3': ['LotShape', 'IR3', ['IR1', 'IR2', 'IR3', 'Reg']], 'LotShape_Reg': ['LotShape', 'Reg', ['IR1', 'IR2', 'IR3', 'Reg']], 'LandContour_Bnk': ['LandContour', 'Bnk', ['Bnk', 'HLS', 'Low', 'Lvl']], 'LandContour_HLS': ['LandContour', 'HLS', ['Bnk', 'HLS', 'Low', 'Lvl']], 'LandContour_Low': ['LandContour', 'Low', ['Bnk', 'HLS', 'Low', 'Lvl']], 'LandContour_Lvl': ['LandContour', 'Lvl', ['Bnk', 'HLS', 'Low', 'Lvl']], 'LotConfig_Corner': ['LotConfig', 'Corner', ['Corner', 'CulDSac', 'FR2', 'FR3', 'Inside']], 'LotConfig_CulDSac': ['LotConfig', 'CulDSac', ['Corner', 'CulDSac', 'FR2', 'FR3', 'Inside']], 'LotConfig_FR2': ['LotConfig', 'FR2', ['Corner', 'CulDSac', 'FR2', 'FR3', 'Inside']], 'LotConfig_FR3': ['LotConfig', 'FR3', ['Corner', 'CulDSac', 'FR2', 'FR3', 'Inside']], 'LotConfig_Inside': ['LotConfig', 'Inside', ['Corner', 'CulDSac', 'FR2', 'FR3', 'Inside']], 'LandSlope_Gtl': ['LandSlope', 'Gtl', ['Gtl', 'Mod', 'Sev']], 'LandSlope_Mod': ['LandSlope', 'Mod', ['Gtl', 'Mod', 'Sev']], 'LandSlope_Sev': ['LandSlope', 'Sev', ['Gtl', 'Mod', 'Sev']], 'Neighborhood_Blmngtn': ['Neighborhood', 'Blmngtn', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_Blueste': ['Neighborhood', 'Blueste', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 
'Neighborhood_BrDale': ['Neighborhood', 'BrDale', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_BrkSide': ['Neighborhood', 'BrkSide', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_ClearCr': ['Neighborhood', 'ClearCr', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_CollgCr': ['Neighborhood', 'CollgCr', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_Crawfor': ['Neighborhood', 'Crawfor', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_Edwards': ['Neighborhood', 'Edwards', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_Gilbert': ['Neighborhood', 'Gilbert', ['Blmngtn', 'Blueste', 
'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_IDOTRR': ['Neighborhood', 'IDOTRR', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_MeadowV': ['Neighborhood', 'MeadowV', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_Mitchel': ['Neighborhood', 'Mitchel', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_NAmes': ['Neighborhood', 'NAmes', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_NPkVill': ['Neighborhood', 'NPkVill', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_NWAmes': ['Neighborhood', 'NWAmes', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 
'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_NoRidge': ['Neighborhood', 'NoRidge', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_NridgHt': ['Neighborhood', 'NridgHt', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_OldTown': ['Neighborhood', 'OldTown', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_SWISU': ['Neighborhood', 'SWISU', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_Sawyer': ['Neighborhood', 'Sawyer', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_SawyerW': ['Neighborhood', 'SawyerW', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 
'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_Somerst': ['Neighborhood', 'Somerst', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_StoneBr': ['Neighborhood', 'StoneBr', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_Timber': ['Neighborhood', 'Timber', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Neighborhood_Veenker': ['Neighborhood', 'Veenker', ['Blmngtn', 'Blueste', 'BrDale', 'BrkSide', 'ClearCr', 'CollgCr', 'Crawfor', 'Edwards', 'Gilbert', 'IDOTRR', 'MeadowV', 'Mitchel', 'NAmes', 'NPkVill', 'NWAmes', 'NoRidge', 'NridgHt', 'OldTown', 'SWISU', 'Sawyer', 'SawyerW', 'Somerst', 'StoneBr', 'Timber', 'Veenker']], 'Condition1_Artery': ['Condition1', 'Artery', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNe', 'RRNn']], 'Condition1_Feedr': ['Condition1', 'Feedr', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNe', 'RRNn']], 'Condition1_Norm': ['Condition1', 'Norm', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNe', 'RRNn']], 'Condition1_PosA': ['Condition1', 'PosA', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNe', 'RRNn']], 'Condition1_PosN': ['Condition1', 'PosN', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNe', 'RRNn']], 
'Condition1_RRAe': ['Condition1', 'RRAe', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNe', 'RRNn']], 'Condition1_RRAn': ['Condition1', 'RRAn', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNe', 'RRNn']], 'Condition1_RRNe': ['Condition1', 'RRNe', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNe', 'RRNn']], 'Condition1_RRNn': ['Condition1', 'RRNn', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNe', 'RRNn']], 'Condition2_Artery': ['Condition2', 'Artery', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNn']], 'Condition2_Feedr': ['Condition2', 'Feedr', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNn']], 'Condition2_Norm': ['Condition2', 'Norm', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNn']], 'Condition2_PosA': ['Condition2', 'PosA', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNn']], 'Condition2_PosN': ['Condition2', 'PosN', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNn']], 'Condition2_RRAe': ['Condition2', 'RRAe', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNn']], 'Condition2_RRAn': ['Condition2', 'RRAn', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNn']], 'Condition2_RRNn': ['Condition2', 'RRNn', ['Artery', 'Feedr', 'Norm', 'PosA', 'PosN', 'RRAe', 'RRAn', 'RRNn']], 'BldgType_1Fam': ['BldgType', '1Fam', ['1Fam', '2fmCon', 'Duplex', 'Twnhs', 'TwnhsE']], 'BldgType_2fmCon': ['BldgType', '2fmCon', ['1Fam', '2fmCon', 'Duplex', 'Twnhs', 'TwnhsE']], 'BldgType_Duplex': ['BldgType', 'Duplex', ['1Fam', '2fmCon', 'Duplex', 'Twnhs', 'TwnhsE']], 'BldgType_Twnhs': ['BldgType', 'Twnhs', ['1Fam', '2fmCon', 'Duplex', 'Twnhs', 'TwnhsE']], 'BldgType_TwnhsE': ['BldgType', 'TwnhsE', ['1Fam', '2fmCon', 'Duplex', 'Twnhs', 'TwnhsE']], 'HouseStyle_1.5Fin': ['HouseStyle', '1.5Fin', ['1.5Fin', '1.5Unf', '1Story', '2.5Fin', '2.5Unf', '2Story', 'SFoyer', 'SLvl']], 'HouseStyle_1.5Unf': ['HouseStyle', 
'1.5Unf', ['1.5Fin', '1.5Unf', '1Story', '2.5Fin', '2.5Unf', '2Story', 'SFoyer', 'SLvl']], 'HouseStyle_1Story': ['HouseStyle', '1Story', ['1.5Fin', '1.5Unf', '1Story', '2.5Fin', '2.5Unf', '2Story', 'SFoyer', 'SLvl']], 'HouseStyle_2.5Fin': ['HouseStyle', '2.5Fin', ['1.5Fin', '1.5Unf', '1Story', '2.5Fin', '2.5Unf', '2Story', 'SFoyer', 'SLvl']], 'HouseStyle_2.5Unf': ['HouseStyle', '2.5Unf', ['1.5Fin', '1.5Unf', '1Story', '2.5Fin', '2.5Unf', '2Story', 'SFoyer', 'SLvl']], 'HouseStyle_2Story': ['HouseStyle', '2Story', ['1.5Fin', '1.5Unf', '1Story', '2.5Fin', '2.5Unf', '2Story', 'SFoyer', 'SLvl']], 'HouseStyle_SFoyer': ['HouseStyle', 'SFoyer', ['1.5Fin', '1.5Unf', '1Story', '2.5Fin', '2.5Unf', '2Story', 'SFoyer', 'SLvl']], 'HouseStyle_SLvl': ['HouseStyle', 'SLvl', ['1.5Fin', '1.5Unf', '1Story', '2.5Fin', '2.5Unf', '2Story', 'SFoyer', 'SLvl']], 'OverallQual_1': ['OverallQual', 1, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], 'OverallQual_2': ['OverallQual', 2, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], 'OverallQual_3': ['OverallQual', 3, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], 'OverallQual_4': ['OverallQual', 4, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], 'OverallQual_5': ['OverallQual', 5, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], 'OverallQual_6': ['OverallQual', 6, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], 'OverallQual_7': ['OverallQual', 7, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], 'OverallQual_8': ['OverallQual', 8, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], 'OverallQual_9': ['OverallQual', 9, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], 'OverallQual_10': ['OverallQual', 10, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], 'OverallCond_1': ['OverallCond', 1, [1, 2, 3, 4, 5, 6, 7, 8, 9]], 'OverallCond_2': ['OverallCond', 2, [1, 2, 3, 4, 5, 6, 7, 8, 9]], 'OverallCond_3': ['OverallCond', 3, [1, 2, 3, 4, 5, 6, 7, 8, 9]], 'OverallCond_4': ['OverallCond', 4, [1, 2, 3, 4, 5, 6, 7, 8, 9]], 'OverallCond_5': ['OverallCond', 5, [1, 2, 3, 4, 5, 6, 7, 8, 9]], 'OverallCond_6': ['OverallCond', 6, [1, 2, 3, 4, 5, 6, 7, 8, 9]], 'OverallCond_7': ['OverallCond', 7, [1, 2, 3, 4, 5, 
6, 7, 8, 9]], 'OverallCond_8': ['OverallCond', 8, [1, 2, 3, 4, 5, 6, 7, 8, 9]], 'OverallCond_9': ['OverallCond', 9, [1, 2, 3, 4, 5, 6, 7, 8, 9]], 'RoofStyle_Flat': ['RoofStyle', 'Flat', ['Flat', 'Gable', 'Gambrel', 'Hip', 'Mansard', 'Shed']], 'RoofStyle_Gable': ['RoofStyle', 'Gable', ['Flat', 'Gable', 'Gambrel', 'Hip', 'Mansard', 'Shed']], 'RoofStyle_Gambrel': ['RoofStyle', 'Gambrel', ['Flat', 'Gable', 'Gambrel', 'Hip', 'Mansard', 'Shed']], 'RoofStyle_Hip': ['RoofStyle', 'Hip', ['Flat', 'Gable', 'Gambrel', 'Hip', 'Mansard', 'Shed']], 'RoofStyle_Mansard': ['RoofStyle', 'Mansard', ['Flat', 'Gable', 'Gambrel', 'Hip', 'Mansard', 'Shed']], 'RoofStyle_Shed': ['RoofStyle', 'Shed', ['Flat', 'Gable', 'Gambrel', 'Hip', 'Mansard', 'Shed']], 'RoofMatl_ClyTile': ['RoofMatl', 'ClyTile', ['ClyTile', 'CompShg', 'Membran', 'Metal', 'Roll', 'Tar&Grv', 'WdShake', 'WdShngl']], 'RoofMatl_CompShg': ['RoofMatl', 'CompShg', ['ClyTile', 'CompShg', 'Membran', 'Metal', 'Roll', 'Tar&Grv', 'WdShake', 'WdShngl']], 'RoofMatl_Membran': ['RoofMatl', 'Membran', ['ClyTile', 'CompShg', 'Membran', 'Metal', 'Roll', 'Tar&Grv', 'WdShake', 'WdShngl']], 'RoofMatl_Metal': ['RoofMatl', 'Metal', ['ClyTile', 'CompShg', 'Membran', 'Metal', 'Roll', 'Tar&Grv', 'WdShake', 'WdShngl']], 'RoofMatl_Roll': ['RoofMatl', 'Roll', ['ClyTile', 'CompShg', 'Membran', 'Metal', 'Roll', 'Tar&Grv', 'WdShake', 'WdShngl']], 'RoofMatl_Tar&Grv': ['RoofMatl', 'Tar&Grv', ['ClyTile', 'CompShg', 'Membran', 'Metal', 'Roll', 'Tar&Grv', 'WdShake', 'WdShngl']], 'RoofMatl_WdShake': ['RoofMatl', 'WdShake', ['ClyTile', 'CompShg', 'Membran', 'Metal', 'Roll', 'Tar&Grv', 'WdShake', 'WdShngl']], 'RoofMatl_WdShngl': ['RoofMatl', 'WdShngl', ['ClyTile', 'CompShg', 'Membran', 'Metal', 'Roll', 'Tar&Grv', 'WdShake', 'WdShngl']], 'ExterQual_Ex': ['ExterQual', 'Ex', ['Ex', 'Fa', 'Gd', 'TA']], 'ExterQual_Fa': ['ExterQual', 'Fa', ['Ex', 'Fa', 'Gd', 'TA']], 'ExterQual_Gd': ['ExterQual', 'Gd', ['Ex', 'Fa', 'Gd', 'TA']], 'ExterQual_TA': ['ExterQual', 'TA', 
['Ex', 'Fa', 'Gd', 'TA']], 'ExterCond_Ex': ['ExterCond', 'Ex', ['Ex', 'Fa', 'Gd', 'Po', 'TA']], 'ExterCond_Fa': ['ExterCond', 'Fa', ['Ex', 'Fa', 'Gd', 'Po', 'TA']], 'ExterCond_Gd': ['ExterCond', 'Gd', ['Ex', 'Fa', 'Gd', 'Po', 'TA']], 'ExterCond_Po': ['ExterCond', 'Po', ['Ex', 'Fa', 'Gd', 'Po', 'TA']], 'ExterCond_TA': ['ExterCond', 'TA', ['Ex', 'Fa', 'Gd', 'Po', 'TA']], 'Foundation_BrkTil': ['Foundation', 'BrkTil', ['BrkTil', 'CBlock', 'PConc', 'Slab', 'Stone', 'Wood']], 'Foundation_CBlock': ['Foundation', 'CBlock', ['BrkTil', 'CBlock', 'PConc', 'Slab', 'Stone', 'Wood']], 'Foundation_PConc': ['Foundation', 'PConc', ['BrkTil', 'CBlock', 'PConc', 'Slab', 'Stone', 'Wood']], 'Foundation_Slab': ['Foundation', 'Slab', ['BrkTil', 'CBlock', 'PConc', 'Slab', 'Stone', 'Wood']], 'Foundation_Stone': ['Foundation', 'Stone', ['BrkTil', 'CBlock', 'PConc', 'Slab', 'Stone', 'Wood']], 'Foundation_Wood': ['Foundation', 'Wood', ['BrkTil', 'CBlock', 'PConc', 'Slab', 'Stone', 'Wood']], 'Heating_Floor': ['Heating', 'Floor', ['Floor', 'GasA', 'GasW', 'Grav', 'OthW', 'Wall']], 'Heating_GasA': ['Heating', 'GasA', ['Floor', 'GasA', 'GasW', 'Grav', 'OthW', 'Wall']], 'Heating_GasW': ['Heating', 'GasW', ['Floor', 'GasA', 'GasW', 'Grav', 'OthW', 'Wall']], 'Heating_Grav': ['Heating', 'Grav', ['Floor', 'GasA', 'GasW', 'Grav', 'OthW', 'Wall']], 'Heating_OthW': ['Heating', 'OthW', ['Floor', 'GasA', 'GasW', 'Grav', 'OthW', 'Wall']], 'Heating_Wall': ['Heating', 'Wall', ['Floor', 'GasA', 'GasW', 'Grav', 'OthW', 'Wall']], 'HeatingQC_Ex': ['HeatingQC', 'Ex', ['Ex', 'Fa', 'Gd', 'Po', 'TA']], 'HeatingQC_Fa': ['HeatingQC', 'Fa', ['Ex', 'Fa', 'Gd', 'Po', 'TA']], 'HeatingQC_Gd': ['HeatingQC', 'Gd', ['Ex', 'Fa', 'Gd', 'Po', 'TA']], 'HeatingQC_Po': ['HeatingQC', 'Po', ['Ex', 'Fa', 'Gd', 'Po', 'TA']], 'HeatingQC_TA': ['HeatingQC', 'TA', ['Ex', 'Fa', 'Gd', 'Po', 'TA']], 'PavedDrive_N': ['PavedDrive', 'N', ['N', 'P', 'Y']], 'PavedDrive_P': ['PavedDrive', 'P', ['N', 'P', 'Y']], 'PavedDrive_Y': ['PavedDrive', 'Y', 
['N', 'P', 'Y']], 'SaleCondition_Abnorml': ['SaleCondition', 'Abnorml', ['Abnorml', 'AdjLand', 'Alloca', 'Family', 'Normal', 'Partial']], 'SaleCondition_AdjLand': ['SaleCondition', 'AdjLand', ['Abnorml', 'AdjLand', 'Alloca', 'Family', 'Normal', 'Partial']], 'SaleCondition_Alloca': ['SaleCondition', 'Alloca', ['Abnorml', 'AdjLand', 'Alloca', 'Family', 'Normal', 'Partial']], 'SaleCondition_Family': ['SaleCondition', 'Family', ['Abnorml', 'AdjLand', 'Alloca', 'Family', 'Normal', 'Partial']], 'SaleCondition_Normal': ['SaleCondition', 'Normal', ['Abnorml', 'AdjLand', 'Alloca', 'Family', 'Normal', 'Partial']], 'SaleCondition_Partial': ['SaleCondition', 'Partial', ['Abnorml', 'AdjLand', 'Alloca', 'Family', 'Normal', 'Partial']]}\n", "\n", "Number of used features in the model (before the encoding): 44\n", "Number of used features in the model (after the encoding): 153\n", "----------------------------------------------\n", "instance: [ 0 0 0 0 0 1 0 0 0 0 0 0 0 0\n", " 0 0 8450 1 0 0 0 1 0 0 0 1 0 0\n", " 0 0 1 1 0 0 0 0 0 0 0 1 0 0\n", " 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n", " 0 0 0 0 0 1 0 0 0 0 0 0 0 0\n", " 1 0 0 0 0 0 1 0 0 0 0 0 0 0\n", " 0 0 1 0 0 0 0 0 0 0 0 1 0 0\n", " 0 0 0 0 0 1 0 0 0 0 2003 2003 0 1\n", " 0 0 0 0 0 1 0 0 0 0 0 0 0 0\n", " 1 0 0 0 0 0 1 0 0 1 0 0 0 0\n", " 1 0 0 0 0 1 0 0 0 0 1 856 854 0\n", " 1710 2 1 3 1 8 0 0 0 1 0 61 0 0\n", " 0 0 0 2 2008 0 0 0 0 1 0]\n", "prediction: 199248.22\n", "delta1: 46155.7895795875\n", "delta2: 92311.579159175\n", "delta3: 184623.15831835\n", "delta4: 369246.3166367\n" ] }, { "ename": "SystemExit", "evalue": "0", "output_type": "error", "traceback": [ "An exception has occurred, use %tb to see the full traceback.\n", "\u001b[0;31mSystemExit\u001b[0m\u001b[0;31m:\u001b[0m 0\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "To exit: use 'exit', 'quit', or Ctrl-D.\n" ] } ], "source": [ "explainer = Explainer.initialize(model, instance, features_type=\"../../dataset/houses-prices-converted_0.types\")\n", 
"print(\"instance:\", instance)\n", "print(\"prediction:\", prediction)\n", "\n", "def compute_delta(percent):\n", " extremum_range = explainer.extremum_range()\n", " difference = extremum_range[1] - extremum_range[0]\n", " return (percent/100)*difference \n", "\n", "delta1 = compute_delta(2.5)\n", "explainer.set_interval(prediction - delta1, prediction + delta1)\n", "print(\"delta1:\", delta1)\n", "tree_specific_reason = explainer.tree_specific_reason()\n", "\n", "delta2 = compute_delta(5)\n", "explainer.set_interval(prediction - delta2, prediction + delta2)\n", "print(\"delta2:\", delta2)\n", "tree_specific_reason = explainer.tree_specific_reason()\n", "\n", "delta3 = compute_delta(10)\n", "explainer.set_interval(prediction - delta3, prediction + delta3)\n", "print(\"delta3:\", delta3)\n", "tree_specific_reason = explainer.tree_specific_reason()\n", "\n", "delta4 = compute_delta(20)\n", "explainer.set_interval(prediction - delta4, prediction + delta4)\n", "print(\"delta4:\", delta4)\n", "tree_specific_reason = explainer.tree_specific_reason()\n", "\n", "explainer.visualisation.gui()" ] }, { "cell_type": "markdown", "id": "e5e77f26", "metadata": {}, "source": [ "In order to set correctly the intervals, we use the ```compute_delta``` function in order to compute the delta values. A delta value is the quantity to be removed or added to calculate the interval $[a,b]$ as a percentage of the model extreme regression values (minimal and maximal values of possible regression values). For example, the prediction for this instance is $199248.22$, delta1 $\\approx 46155$ and the interval for the first tree-specific explanation reported above is $[a,b] = [199248.22 - 46155, 199248.22 + 46155]$. 
Here, delta1 is $2.5\\%$ of the gap between the model's minimal and maximal regression values, so the interval extends $2.5\\%$ of that gap on each side of the prediction.\n", "\n", "The results are displayed in PyXAI's GUI thanks to the last line of the code, ```explainer.visualisation.gui()```.\n", "\n", "\"BTTS2\"" ] }, { "cell_type": "markdown", "id": "deb19745", "metadata": {}, "source": [ "We can observe that the larger the interval $I$ (that is, the larger the percentage and the delta value), the smaller the reason, both in terms of the number of binary variables and the number of features:\n", "- For $2.5\\%$, the tree-specific explanation has $45$ binary variables and $34$ features.\n", "- For $5\\%$, the tree-specific explanation has $44$ binary variables and $32$ features.\n", "- For $10\\%$, the tree-specific explanation has $35$ binary variables and $27$ features.\n", "- For $20\\%$, the tree-specific explanation has $20$ binary variables and $15$ features." ] }, { "cell_type": "markdown", "id": "3c79111f", "metadata": {}, "source": [ "This observation applies to all datasets: the larger the interval, the smaller the explanation.\n", "We therefore recommend trying several intervals, chosen according to the problem and the range of possible regression values." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.6" } }, "nbformat": 4, "nbformat_minor": 5 }