{ "cells": [ { "cell_type": "markdown", "id": "95a492d4", "metadata": {}, "source": [ "# Quick Start" ] }, { "cell_type": "markdown", "id": "80685d54", "metadata": {}, "source": [ "{: .attention }\n", "> Once PyXAI has been installed, you can use these commands:\n", "> \n", "> ```python3 ```: Execute a Python file that uses PyXAI\\\\\n", "> ```python3 -m pyxai -gui```: Open PyXAI's graphical user interface\\\\\n", "> ```python3 -m pyxai -explanations```: Copy the GUI's explanation backups into your current directory\\\\ \n", "> ```python3 -m pyxai -examples```: Copy the examples into your current directory \n", "\n", "Let us give a quick illustration of PyXAI, showing how to compute explanations given an ML model. \n", "\"Iris\"" ] }, { "cell_type": "markdown", "id": "f898fcd6", "metadata": {}, "source": [ "The first thing to do is to import the components of PyXAI. So that a project can import only what it needs, PyXAI is organized into three distinct modules: ```Learning```, ```Explaining```, and ```Tools```." ] }, { "cell_type": "code", "execution_count": 1, "id": "1500590b", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:25.627168Z", "iopub.status.busy": "2026-05-12T09:52:25.627078Z", "iopub.status.idle": "2026-05-12T09:52:27.846564Z", "shell.execute_reply": "2026-05-12T09:52:27.846039Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Warning: the --f=/run/user/1000/jupyter/runtime/kernel-v3aa3658209bc0c1688c02ec1e5c1930e136688b36.json option is not a PyXAI option.\n" ] } ], "source": [ "from pyxai import Learning, Explaining, Tools" ] }, { "cell_type": "markdown", "id": "ca4d6513", "metadata": {}, "source": [ "If this import fails, the most likely cause is that the Python package PyXAI is not installed on your system. You can install it with a command like ```python3 -m pip install pyxai```. See the [Installation](/documentation/installation) page for details." 
] }, { "cell_type": "markdown", "id": "8dff254a", "metadata": {}, "source": [ "In most situations, using the PyXAI library requires two successive steps: first, the generation of an ML model from a dataset (with the ```Learning``` module) and second, given the learned model, the computation of explanations for some instances (with the ```Explaining``` module). " ] }, { "cell_type": "markdown", "id": "a5714f78", "metadata": {}, "source": [ "## Machine Learning" ] }, { "cell_type": "markdown", "id": "aa225621", "metadata": {}, "source": [ "For this example, we want to create a decision tree classifier for the iris dataset using [Scikit-learn](https://scikit-learn.org/stable/)." ] }, { "cell_type": "code", "execution_count": 2, "id": "bc48a391", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.848493Z", "iopub.status.busy": "2026-05-12T09:52:27.848293Z", "iopub.status.idle": "2026-05-12T09:52:27.852587Z", "shell.execute_reply": "2026-05-12T09:52:27.852265Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------- Information ---------------\n", "Problem type: classification\n", "Instances type: tabular\n", "Labels type: classes\n", "\n", "Dataset path: ../dataset/iris.csv\n", "nFeatures (nAttributes, with the labels): 4\n", "nInstances (nObservations): 150\n", "nLabels: 3\n" ] } ], "source": [ "learner = Learning.Scikitlearn(\"../dataset/iris.csv\", problem_type='classification')" ] }, { "cell_type": "markdown", "id": "d12124c7", "metadata": {}, "source": [ "This dataset can be downloaded from the [UCI Machine Learning Repository -- Iris Data Set](http://archive.ics.uci.edu/ml/datasets/Iris) or [here](/assets/notebooks/dataset/iris.csv). In our case, it is located in the directory ```../dataset```. The parameter ```problem_type='classification'``` specifies a classification task. 
The Iris dataset contains four features (length and width of sepals and petals) for 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). The goal of the classifier is to predict the right class for an instance among three: setosa, virginica, versicolor. " ] }, { "cell_type": "markdown", "id": "d54c678e", "metadata": {}, "source": [ "To create models, PyXAI implements methods to directly run an ML experimental protocol (with the train-test split technique). Several cross-validation methods (```Learning.HOLD_OUT```, ```Learning.K_FOLDS```, ```Learning.LEAVE_ONE_GROUP_OUT```) and models (```Learning.DT```, ```Learning.RF```, ```Learning.BT```) are available. \n", "\n", "In this example, we train a decision tree (see the parameter ```model_type=Learning.DT```)." ] }, { "cell_type": "code", "execution_count": 3, "id": "0c692233", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.853635Z", "iopub.status.busy": "2026-05-12T09:52:27.853544Z", "iopub.status.idle": "2026-05-12T09:52:27.866820Z", "shell.execute_reply": "2026-05-12T09:52:27.866449Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--------------- Model creation, fitting and evaluation ---------------\n", "Splitting method: hold-out\n", "Problem type: classification\n", "Models type: decision-tree\n", "model_parameters: {'random_state': 0}\n", "--------- Evaluation Information ---------\n", "For the evaluation number 0:\n", "Metrics:\n", " micro_averaging_accuracy: 98.24561403508771\n", " micro_averaging_precision: 97.36842105263158\n", " micro_averaging_recall: 97.36842105263158\n", " macro_averaging_accuracy: 98.24561403508773\n", " macro_averaging_precision: 96.66666666666667\n", " macro_averaging_recall: 97.91666666666666\n", " true_positives: {'Iris-setosa': 13, 'Iris-versicolor': 15, 'Iris-virginica': 9}\n", " true_negatives: {'Iris-setosa': 25, 'Iris-versicolor': 22, 'Iris-virginica': 28}\n", " false_positives: 
{'Iris-setosa': 0, 'Iris-versicolor': 0, 'Iris-virginica': 1}\n", " false_negatives: {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 0}\n", " accuracy: 97.36842105263158\n", " sklearn_confusion_matrix: [[13, 0, 0], [0, 15, 1], [0, 0, 9]]\n", "Number of Training instances: 112\n", "Number of Testing instances: 38\n", "\n", "--------------- Explainer ----------------\n", "For the split number 0:\n", "**Decision Tree Model**\n", "nFeatures: 4\n", "nNodes: 6\n", "nVariables: 5\n", "\n" ] } ], "source": [ "model = learner.evaluate(splitting_method=Learning.HOLD_OUT, model_type=Learning.DT, model_parameters={'random_state': 0}, splitting_parameters={'random_state': 0})" ] }, { "cell_type": "markdown", "id": "1e45e421", "metadata": {}, "source": [ "Once the model is created, we select an instance in order to be able to derive explanations. Here, a well-classified instance is chosen: the model predicts the first class ```0``` (i.e. the Iris setosa class) thanks to the ```is_correct=True``` and the ```subset_predicted_classes=[\"Iris-setosa\"]``` parameters. " ] }, { "cell_type": "code", "execution_count": 4, "id": "2577c3ec", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.868021Z", "iopub.status.busy": "2026-05-12T09:52:27.867924Z", "iopub.status.idle": "2026-05-12T09:52:27.871854Z", "shell.execute_reply": "2026-05-12T09:52:27.871397Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--------------- Instances ----------------\n", "Number of instances selected: 1\n", "----------------------------------------------\n" ] } ], "source": [ "instance, prediction = learner.get_instances(model, n=1, is_correct=True, subset_predicted_classes=[\"Iris-setosa\"], seed=2)" ] }, { "cell_type": "markdown", "id": "5372cec4", "metadata": {}, "source": [ "Please consult the [Learning](/documentation/learning/generating) page for more details about this ML part. 
" ] }, { "cell_type": "markdown", "id": "23b4ec8b", "metadata": {}, "source": [ "## Explainer" ] }, { "cell_type": "markdown", "id": "5b4a98b8", "metadata": {}, "source": [ "The ```Explainer``` module contains different methods to generate explanations. For this purpose, the model and the target instance are defined as parameters of the ```initialize``` function of this module. " ] }, { "cell_type": "code", "execution_count": 5, "id": "7d7e859c", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.872996Z", "iopub.status.busy": "2026-05-12T09:52:27.872902Z", "iopub.status.idle": "2026-05-12T09:52:27.875109Z", "shell.execute_reply": "2026-05-12T09:52:27.874770Z" } }, "outputs": [], "source": [ "explainer = Explaining.initialize(model, instance)" ] }, { "cell_type": "markdown", "id": "5257e4b0", "metadata": {}, "source": [ "The ```initialize``` function converts the instance into binary variables (called a binary representation) coding the associated model. More precisely, each of these binary variables represents a condition (feature $op$ value ?) in the model where $op$ is a standard comparison operator. [Scikit-learn](https://scikit-learn.org/stable/) and [XGBoost](https://xgboost.readthedocs.io/en/stable/) use the operator $\\ge$. With respect to the instance, the sign of a binary variable indicates whether the condition is true or not in the model. Here, we can see the instance and its binary representation. We can see the conditions related to the binary representation using the function ```to_features``` which is explained below." 
] }, { "cell_type": "code", "execution_count": 6, "id": "4ce31e13", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.876137Z", "iopub.status.busy": "2026-05-12T09:52:27.876049Z", "iopub.status.idle": "2026-05-12T09:52:27.878531Z", "shell.execute_reply": "2026-05-12T09:52:27.878181Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "instance: Sepal.Length 5.0\n", "Sepal.Width 3.3\n", "Petal.Length 1.4\n", "Petal.Width 0.2\n", "Name: 49, dtype: float64\n", "binary representation: (-1, -2, -3, 4, -5)\n", "conditions related to the binary representation: ['Sepal.Width > 3.100000023841858', 'Petal.Length <= 4.950000047683716', 'Petal.Width <= 0.800000011920929', 'Petal.Width <= 1.6500000357627869', 'Petal.Width <= 1.75']\n" ] } ], "source": [ "print(\"instance:\", instance)\n", "print(\"binary representation:\", explainer.binary_representation)\n", "print(\"conditions related to the binary representation:\", explainer.to_features(explainer.binary_representation, eliminate_redundant_features=False))" ] }, { "cell_type": "markdown", "id": "28df2c0f", "metadata": {}, "source": [ "We notice that the binary representation of this instance contains more than 4 variables: the decision tree of the model involves five binary variables, one per condition. Indeed, the feature Petal.Width appears 3 times whereas Sepal.Length is not used by the tree. Please see the [concepts](/documentation/explainer/concepts/) page for more information on binary representations." 
] }, { "cell_type": "markdown", "id": "5cc902e6", "metadata": {}, "source": [ "It is also possible to display a more compact representation by setting ```eliminate_redundant_features=True``` (the default value), which removes redundant conditions on the same feature:" ] }, { "cell_type": "code", "execution_count": 7, "id": "8f0df423", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.879549Z", "iopub.status.busy": "2026-05-12T09:52:27.879451Z", "iopub.status.idle": "2026-05-12T09:52:27.881288Z", "shell.execute_reply": "2026-05-12T09:52:27.880978Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "compact representation: ['Sepal.Width > 3.100000023841858', 'Petal.Length <= 4.950000047683716', 'Petal.Width <= 0.800000011920929']\n" ] } ], "source": [ "print(\"compact representation:\", explainer.to_features(explainer.binary_representation, eliminate_redundant_features=True))" ] }, { "cell_type": "markdown", "id": "62766926", "metadata": {}, "source": [ "### Abductive explanations" ] }, { "cell_type": "markdown", "id": "ea0e27bc", "metadata": {}, "source": [ "In PyXAI, several types of explanation are available. In their binary forms representing conditions, these are called reasons. In our example, we choose to compute one of the most popular types of explanation: a sufficient reason. A sufficient reason for an instance X is an abductive explanation (any other instance X' sharing the conditions of this reason is classified by the model as X is) for which no proper subset of this reason is a sufficient reason (i.e., the explanation is minimal with respect to set inclusion). 
" ] }, { "cell_type": "code", "execution_count": 8, "id": "3929b134", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.882294Z", "iopub.status.busy": "2026-05-12T09:52:27.882208Z", "iopub.status.idle": "2026-05-12T09:52:27.884232Z", "shell.execute_reply": "2026-05-12T09:52:27.883911Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "sufficient_reason: (-1,)\n" ] } ], "source": [ "sufficient_reason = explainer.sufficient_reason(n=1)\n", "print(\"sufficient_reason:\", sufficient_reason)" ] }, { "cell_type": "markdown", "id": "b8e5f6bc", "metadata": {}, "source": [ "We can get the features involved in the reason thanks to the method ```to_features```:" ] }, { "cell_type": "code", "execution_count": 9, "id": "b297f93b", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.885187Z", "iopub.status.busy": "2026-05-12T09:52:27.885102Z", "iopub.status.idle": "2026-05-12T09:52:27.886784Z", "shell.execute_reply": "2026-05-12T09:52:27.886480Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "to_features: ['Petal.Width <= 0.800000011920929']\n" ] } ], "source": [ "print(\"to_features:\", explainer.to_features(sufficient_reason))" ] }, { "cell_type": "markdown", "id": "337056c0", "metadata": {}, "source": [ "The ```to_features``` method eliminates redundant features by default and is also able to return more information about the features using the ```details``` parameter. This method is described in the [concepts](/documentation/explainer/concepts/) page. " ] }, { "cell_type": "markdown", "id": "d524a051", "metadata": {}, "source": [ "We can check whether the derived explanation actually is a reason." 
] }, { "cell_type": "code", "execution_count": 10, "id": "21a1bd99", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.887662Z", "iopub.status.busy": "2026-05-12T09:52:27.887573Z", "iopub.status.idle": "2026-05-12T09:52:27.889264Z", "shell.execute_reply": "2026-05-12T09:52:27.888949Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "is sufficient: True\n" ] } ], "source": [ "print(\"is sufficient: \", explainer.is_sufficient_reason(sufficient_reason))" ] }, { "cell_type": "markdown", "id": "43404046", "metadata": {}, "source": [ "{: .attention }\n", "\n", "> It is important to note that computing and checking reasons are done independently." ] }, { "cell_type": "markdown", "id": "28f5daf6", "metadata": {}, "source": [ "To conclude, the sufficient reason (```('Petal.Width <= 0.8',)```) explains why the instance ```[5.0 3.3 1.4 0.2]``` is well classified by the model (the prediction was Iris-setosa). It is because the fourth feature (the petal width in cm), set to 0.2 cm, is less than or equal to 0.8 cm (see the attached image). \n", "\"Iris\"" ] }, { "cell_type": "markdown", "id": "82f1f716", "metadata": {}, "source": [ "### Contrastive explanations" ] }, { "cell_type": "markdown", "id": "25efc49d", "metadata": {}, "source": [ "Now, let us consider another instance, this time a misclassified one, selected with the parameter ```is_correct=False``` of the function ```get_instances```. We pass this instance to the explainer with the ```set_instance``` method." 
] }, { "cell_type": "code", "execution_count": 11, "id": "c3887aae", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.890178Z", "iopub.status.busy": "2026-05-12T09:52:27.890093Z", "iopub.status.idle": "2026-05-12T09:52:27.966256Z", "shell.execute_reply": "2026-05-12T09:52:27.965978Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--------------- Instances ----------------\n", "Number of instances selected: 1\n", "----------------------------------------------\n", "The prediction: Iris-virginica\n" ] } ], "source": [ "instance, prediction = learner.get_instances(model, n=1, is_correct=False, seed=2)\n", "explainer.set_instance(instance)\n", "print(\"The prediction: \", explainer.target_prediction)" ] }, { "cell_type": "markdown", "id": "9e38acce", "metadata": {}, "source": [ "We can explain why this instance is **not** classified differently by providing a contrastive explanation." ] }, { "cell_type": "code", "execution_count": 12, "id": "6c0ad185", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.967638Z", "iopub.status.busy": "2026-05-12T09:52:27.967545Z", "iopub.status.idle": "2026-05-12T09:52:27.969555Z", "shell.execute_reply": "2026-05-12T09:52:27.969285Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "contrastive reason (1,)\n", "to_features: ['Petal.Width > 0.800000011920929']\n" ] } ], "source": [ "contrastive_reason = explainer.contrastive_reason()\n", "print(\"contrastive reason\", contrastive_reason)\n", "print(\"to_features:\", explainer.to_features(contrastive_reason, contrastive=True))" ] }, { "cell_type": "code", "execution_count": 13, "id": "e1f1f793-b4aa-40b1-9514-ec73de165e48", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.970745Z", "iopub.status.busy": "2026-05-12T09:52:27.970655Z", "iopub.status.idle": "2026-05-12T09:52:27.972568Z", "shell.execute_reply": "2026-05-12T09:52:27.972300Z" } }, "outputs": [ { "name": 
"stdout", "output_type": "stream", "text": [ "Sepal.Length 6.0\n", "Sepal.Width 2.7\n", "Petal.Length 5.1\n", "Petal.Width 1.6\n", "Name: 83, dtype: float64\n" ] } ], "source": [ "print(instance)" ] }, { "cell_type": "code", "execution_count": 14, "id": "df9ec991-06a9-4ec0-9746-88850d7ab0b7", "metadata": { "execution": { "iopub.execute_input": "2026-05-12T09:52:27.973721Z", "iopub.status.busy": "2026-05-12T09:52:27.973638Z", "iopub.status.idle": "2026-05-12T09:52:27.975792Z", "shell.execute_reply": "2026-05-12T09:52:27.975515Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sepal.Length 6.0\n", "Sepal.Width 2.7\n", "Petal.Length 5.1\n", "Petal.Width 0.5\n", "Name: 83, dtype: float64\n", "Prediction: Iris-setosa\n" ] } ], "source": [ "# By changing the feature Petal.Width to less than or equal to 0.8 we obtain a different classification\n", "instance['Petal.Width'] = 0.5\n", "print(instance)\n", "explainer.set_instance(instance)\n", "print(\"Prediction: \", explainer.target_prediction)" ] }, { "cell_type": "markdown", "id": "d1b16200", "metadata": {}, "source": [ "More information about explanations can be found in the [Explainer Principles](/documentation/explainer/) page, the [Explaining Classification](/documentation/classification/) page and the [Explaining Regression](/documentation/regression/) page." 
] } ], "metadata": { "celltoolbar": "Pièces jointes", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.7" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 5 }