{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "bc80c219",
   "metadata": {},
   "source": [
    "# Saving/Loading Models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": "The PyXAI library provides functions to save and load models and their hyper-parameters, as well as preselected instances. PyXAI can save several models from an experimental protocol in a directory chosen by the user (named ```<save_directory>``` in this example). Each model is associated with an index ```<i>``` and two files:\n\n* ```<save_directory>/<dataset>.<i>.map```: JSON file containing training and test indexes, accuracy, solver name, etc.\n* ```<save_directory>/<dataset>.<i>.model```: Raw model saved as a Pickle file.\n\nYou can also save preselected instances, which requires an additional file:\n\n* ```<save_directory>/<dataset>.<i>.instances``` (optional): JSON file containing the indexes of preselected instances.\n"
  },
  {
   "cell_type": "markdown",
   "id": "89054ae8",
   "metadata": {},
   "source": [
    "## Saving Models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": "As an illustration, we use the ```compas``` dataset. We start by training two Random Forests using a leave-one-group-out cross-validation protocol and selecting one instance:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "data:\n",
      "      Number_of_Priors  score_factor  Age_Above_FourtyFive   \n",
      "0                    0             0                     1  \\\n",
      "1                    0             0                     0   \n",
      "2                    4             0                     0   \n",
      "3                    0             0                     0   \n",
      "4                   14             1                     0   \n",
      "...                ...           ...                   ...   \n",
      "6167                 0             1                     0   \n",
      "6168                 0             0                     0   \n",
      "6169                 0             0                     1   \n",
      "6170                 3             0                     0   \n",
      "6171                 2             0                     0   \n",
      "\n",
      "      Age_Below_TwentyFive  African_American  Asian  Hispanic   \n",
      "0                        0                 0      0         0  \\\n",
      "1                        0                 1      0         0   \n",
      "2                        1                 1      0         0   \n",
      "3                        0                 0      0         0   \n",
      "4                        0                 0      0         0   \n",
      "...                    ...               ...    ...       ...   \n",
      "6167                     1                 1      0         0   \n",
      "6168                     1                 1      0         0   \n",
      "6169                     0                 0      0         0   \n",
      "6170                     0                 1      0         0   \n",
      "6171                     1                 0      0         1   \n",
      "\n",
      "      Native_American  Other  Female  Misdemeanor  Two_yr_Recidivism  \n",
      "0                   0      1       0            0                  0  \n",
      "1                   0      0       0            0                  1  \n",
      "2                   0      0       0            0                  1  \n",
      "3                   0      1       0            1                  0  \n",
      "4                   0      0       0            0                  1  \n",
      "...               ...    ...     ...          ...                ...  \n",
      "6167                0      0       0            0                  0  \n",
      "6168                0      0       0            0                  0  \n",
      "6169                0      1       0            0                  0  \n",
      "6170                0      0       1            1                  0  \n",
      "6171                0      0       1            0                  1  \n",
      "\n",
      "[6172 rows x 12 columns]\n",
      "--------------   Information   ---------------\n",
      "Dataset name: ../dataset/compas.csv\n",
      "nFeatures (nAttributes, with the labels): 12\n",
      "nInstances (nObservations): 6172\n",
      "nLabels: 2\n",
      "---------------   Evaluation   ---------------\n",
      "method: LeaveOneGroupOut\n",
      "output: RF\n",
      "learner_type: Classification\n",
      "learner_options: {'max_depth': None, 'random_state': 0}\n",
      "---------   Evaluation Information   ---------\n",
      "For the evaluation number 0:\n",
      "metrics:\n",
      "   accuracy: 66.42903434867142\n",
      "nTraining instances: 3086\n",
      "nTest instances: 3086\n",
      "\n",
      "For the evaluation number 1:\n",
      "metrics:\n",
      "   accuracy: 64.45236552171096\n",
      "nTraining instances: 3086\n",
      "nTest instances: 3086\n",
      "\n",
      "---------------   Explainer   ----------------\n",
      "For the evaluation number 0:\n",
      "**Random Forest Model**\n",
      "nClasses: 2\n",
      "nTrees: 100\n",
      "nVariables: 63\n",
      "\n",
      "For the evaluation number 1:\n",
      "**Random Forest Model**\n",
      "nClasses: 2\n",
      "nTrees: 100\n",
      "nVariables: 69\n",
      "\n",
      "---------------   Instances   ----------------\n",
      "number of instances selected: 1\n",
      "----------------------------------------------\n"
     ]
    }
   ],
   "source": "from pyxai import Learning, Explainer, Tools\n\nlearner = Learning.Scikitlearn(\"../dataset/compas.csv\", problem_type=Learning.CLASSIFICATION)\nmodels = learner.evaluate(splitting_method=Learning.LEAVE_ONE_GROUP_OUT, model_type=Learning.RF, splitting_parameters={'n_models': 2})\ninstance, prediction = learner.get_instances(n=1)"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": "The ```save``` method of ```ModelIO``` allows saving the models:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Model saved: (try_save/compas.0.model, try_save/compas.0.map)\n",
      "Model saved: (try_save/compas.1.model, try_save/compas.1.map)\n"
     ]
    }
   ],
   "source": "Learning.ModelIO.save(models, \"try_save\")"
  },
  {
   "cell_type": "markdown",
   "id": "d82b9196",
   "metadata": {},
   "source": [
    "{: .warning }\n",
    "> If models based on the same dataset already exist in this folder, the method overwrites them."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9041d85f",
   "metadata": {},
   "source": [
    "## Loading Models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": "After saving the models, you can reload them in another program using ```load```:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "----------   Loading Information   -----------\n",
      "mapping file: try_save/compas.0.map\n",
      "nFeatures (nAttributes, with the labels): 12\n",
      "nInstances (nObservations): 6172\n",
      "nLabels: 2\n",
      "----------   Loading Information   -----------\n",
      "mapping file: try_save/compas.1.map\n",
      "nFeatures (nAttributes, with the labels): 12\n",
      "nInstances (nObservations): 6172\n",
      "nLabels: 2\n",
      "---------   Evaluation Information   ---------\n",
      "For the evaluation number 0:\n",
      "metrics: {'accuracy': 66.42903434867142}\n",
      "nTraining instances: 3086\n",
      "nTest instances: 3086\n",
      "\n",
      "For the evaluation number 1:\n",
      "metrics: {'accuracy': 64.45236552171096}\n",
      "nTraining instances: 3086\n",
      "nTest instances: 3086\n",
      "\n",
      "---------------   Explainer   ----------------\n",
      "For the evaluation number 0:\n",
      "**Random Forest Model**\n",
      "nClasses: 2\n",
      "nTrees: 100\n",
      "nVariables: 63\n",
      "\n",
      "For the evaluation number 1:\n",
      "**Random Forest Model**\n",
      "nClasses: 2\n",
      "nTrees: 100\n",
      "nVariables: 69\n",
      "\n",
      "sufficient_reason: (-1, -2, -3, -4, 5, -6, -9, -11, -13)\n",
      "sufficient_reason: (-1, -2, -3, -4, -6, 8, -13)\n"
     ]
    }
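   ],
   "source": "from pyxai import Learning, Explainer\n\n# Reload the learner and every model stored in the save directory.\nlearner, models = Learning.ModelIO.load(\"try_save\")\n\n# NOTE: `instance` is the one selected in the first code cell of this notebook.\n# In a separate program, select an instance with learner.get_instances(...) first.\nfor model in models:\n    explainer = Explainer.initialize(model, instance)\n    print(\"sufficient_reason:\", explainer.sufficient_reason())"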
  },
  {
   "cell_type": "markdown",
   "id": "57a62bac",
   "metadata": {},
   "source": [
    "## Saving/Loading Instances"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": "PyXAI can also save and load instances through the ```get_instances``` method."
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": "To save instances (more precisely, their indexes), pass the ```save_directory``` and ```instances_id``` parameters. To reload them, pass the save directory as the ```indexes``` parameter, together with the same ```instances_id```."
  },
  {
   "cell_type": "markdown",
   "id": "3ae1f652",
   "metadata": {},
   "source": [
    "In this example, for each of the two models, the indexes of 10 instances of the test set are saved in the ```try_save``` directory:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "885c8fa0",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---------------   Instances   ----------------\n",
      "Indexes of selected instances saved in: try_save/compas.0.instances\n",
      "number of instances selected: 10\n",
      "----------------------------------------------\n",
      "---------------   Instances   ----------------\n",
      "Indexes of selected instances saved in: try_save/compas.1.instances\n",
      "number of instances selected: 10\n",
      "----------------------------------------------\n"
     ]
    }
   ],
   "source": [
     "for i, model in enumerate(models):\n",
     "    instances = learner.get_instances(\n",
     "        dataset=\"../dataset/compas.csv\",\n",
     "        indexes=Learning.TEST,\n",
     "        n=10,\n",
     "        model=model,\n",
     "        save_directory=\"try_save\",\n",
     "        instances_id=i)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "460bbad2",
   "metadata": {},
   "source": [
    "{: .attention }\n",
    "> If the dataset has not been loaded yet, ```get_instances``` does not load it entirely: it reads only the rows at the required indexes."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "312d0f71",
   "metadata": {},
   "source": [
    "Later, in another program, you can load the same instances using these instructions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "31c6dc91",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---------------   Instances   ----------------\n",
      "Loading instances file: try_save/compas.0.instances\n",
      "number of instances selected: 10\n",
      "----------------------------------------------\n",
      "---------------   Instances   ----------------\n",
      "Loading instances file: try_save/compas.1.instances\n",
      "number of instances selected: 10\n",
      "----------------------------------------------\n"
     ]
    }
   ],
   "source": [
     "for i, model in enumerate(models):\n",
     "    instances = learner.get_instances(\n",
     "        dataset=\"../dataset/compas.csv\",\n",
     "        indexes=\"try_save\",\n",
     "        model=model,\n",
     "        instances_id=i)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": "More information about the ```get_instances``` method is available on the [Generating Models](/documentation/learning/generating/) page."
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}