Importing Models From Libraries

PyXAI can generate models for you. Indeed, it provides dedicated functions that simplify this task. However, if your model has already been trained, you may want to import it into PyXAI in order to extract explanations. This page explains how to perform such a task.

Procedure

Consider the following source code to create a RandomForestClassifier using Scikit-learn:

from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

model_rf = RandomForestClassifier(random_state=0)
data = datasets.load_breast_cancer(as_frame=True)
X = data.data.to_numpy()
Y = data.target.to_numpy()

feature_names = data.feature_names
model_rf.fit(X, Y)

You can import this ML model using the import_models method of the ModelIO class:

from pyxai import Learning, Explainer

learner, model = Learning.ModelIO.import_models(model_rf, instances_type='tabular')
learner.feature_names = feature_names

Here is a table summarizing the library compatibility of import_models:

Type           Scikit-learn            XGBoost                      LightGBM
Decision Tree  DecisionTreeClassifier  -                            -
Random Forest  RandomForestClassifier  -                            -
Boosted Tree   -                       XGBClassifier, XGBRegressor  LGBMRegressor
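
The compatibility table above can be mirrored as a small lookup, which may be handy for rejecting an unsupported model before calling import_models. This is only a sketch: the SUPPORTED dictionary and the model_family helper are hypothetical names introduced here, not part of PyXAI, and the check relies solely on the class name of the model.

```python
# Hypothetical lookup mirroring the compatibility table (not a PyXAI API).
SUPPORTED = {
    "DecisionTreeClassifier": "Decision Tree",
    "RandomForestClassifier": "Random Forest",
    "XGBClassifier": "Boosted Tree",
    "XGBRegressor": "Boosted Tree",
    "LGBMRegressor": "Boosted Tree",
}


def model_family(model):
    """Return the table row for this model's class name, or None if it is not listed."""
    return SUPPORTED.get(type(model).__name__)


def check_importable(model):
    """Raise a TypeError when the model's class name is not in the table."""
    family = model_family(model)
    if family is None:
        raise TypeError(f"{type(model).__name__} is not listed in the compatibility table")
    return family
```

For instance, check_importable(model_rf) would return "Random Forest" for the model trained above, while a class name absent from the table raises a TypeError.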

Then, you can select an instance and obtain the model's prediction for it:

instance, prediction = learner.get_instances(dataset=data.frame, model=model, n=1)
print("instance:", instance)
print("prediction:", prediction)

---------------   Instances   ----------------
data:
     mean radius  mean texture  mean perimeter  mean area  mean smoothness   
        17.99         10.38          122.80     1001.0          0.11840  \
        20.57         17.77          132.90     1326.0          0.08474   
        19.69         21.25          130.00     1203.0          0.10960   
        11.42         20.38           77.58      386.1          0.14250   
        20.29         14.34          135.10     1297.0          0.10030   
..           ...           ...             ...        ...              ...   
      21.56         22.39          142.00     1479.0          0.11100   
      20.13         28.25          131.20     1261.0          0.09780   
      16.60         28.08          108.30      858.1          0.08455   
      20.60         29.33          140.10     1265.0          0.11780   
       7.76         24.54           47.92      181.0          0.05263   

     mean compactness  mean concavity  mean concave points  mean symmetry   
           0.27760         0.30010              0.14710         0.2419  \
           0.07864         0.08690              0.07017         0.1812   
           0.15990         0.19740              0.12790         0.2069   
           0.28390         0.24140              0.10520         0.2597   
           0.13280         0.19800              0.10430         0.1809   
..                ...             ...                  ...            ...   
         0.11590         0.24390              0.13890         0.1726   
         0.10340         0.14400              0.09791         0.1752   
         0.10230         0.09251              0.05302         0.1590   
         0.27700         0.35140              0.15200         0.2397   
         0.04362         0.00000              0.00000         0.1587   

     mean fractal dimension  ...  worst texture  worst perimeter  worst area   
                 0.07871  ...          17.33           184.60      2019.0  \
                 0.05667  ...          23.41           158.80      1956.0   
                 0.05999  ...          25.53           152.50      1709.0   
                 0.09744  ...          26.50            98.87       567.7   
                 0.05883  ...          16.67           152.20      1575.0   
..                      ...  ...            ...              ...         ...   
               0.05623  ...          26.40           166.10      2027.0   
               0.05533  ...          38.25           155.00      1731.0   
               0.05648  ...          34.12           126.70      1124.0   
               0.07016  ...          39.42           184.60      1821.0   
               0.05884  ...          30.37            59.16       268.6   

     worst smoothness  worst compactness  worst concavity   
           0.16220            0.66560           0.7119  \
           0.12380            0.18660           0.2416   
           0.14440            0.42450           0.4504   
           0.20980            0.86630           0.6869   
           0.13740            0.20500           0.4000   
..                ...                ...              ...   
         0.14100            0.21130           0.4107   
         0.11660            0.19220           0.3215   
         0.11390            0.30940           0.3403   
         0.16500            0.86810           0.9387   
         0.08996            0.06444           0.0000   

     worst concave points  worst symmetry  worst fractal dimension  target  
                0.2654          0.4601                  0.11890       0  
                0.1860          0.2750                  0.08902       0  
                0.2430          0.3613                  0.08758       0  
                0.2575          0.6638                  0.17300       0  
                0.1625          0.2364                  0.07678       0  
..                    ...             ...                      ...     ...  
              0.2216          0.2060                  0.07115       0  
              0.1628          0.2572                  0.06637       0  
              0.1418          0.2218                  0.07820       0  
              0.2650          0.4087                  0.12400       0  
              0.0000          0.2871                  0.07039       1  

[569 rows x 31 columns]
--------------   Information   ---------------
Dataset name: pandas.core.frame.DataFrame
nFeatures (nAttributes, with the labels): 31
nInstances (nObservations): 569
nLabels: 2
number of instances selected: 1
----------------------------------------------
instance: [1.799e+01 1.038e+01 1.228e+02 1.001e+03 1.184e-01 2.776e-01 3.001e-01
 1.471e-01 2.419e-01 7.871e-02 1.095e+00 9.053e-01 8.589e+00 1.534e+02
 6.399e-03 4.904e-02 5.373e-02 1.587e-02 3.003e-02 6.193e-03 2.538e+01
 1.733e+01 1.846e+02 2.019e+03 1.622e-01 6.656e-01 7.119e-01 2.654e-01
 4.601e-01 1.189e-01]
prediction: 0

An explainer initialized with this instance then computes the explanations:

explainer = Explainer.initialize(model, instance=instance)

direct = explainer.direct_reason()
print("len direct reason:", len(direct))

sufficient = explainer.sufficient_reason()
print("len sufficient reason:", len(sufficient))

print("to_features:", explainer.to_features(sufficient))

len direct reason: 294
len sufficient reason: 159
to_features: ('mean radius > 15.045000076293945', 'mean texture <= 11.585000038146973', 'mean perimeter > 96.57999801635742', 'mean area > 694.5', 'mean smoothness > 0.09075499698519707', 'mean compactness > 0.09524999931454659', 'mean concavity > 0.17409999668598175', 'mean concave points > 0.07939000055193901', 'mean symmetry > 0.12639999762177467', 'radius error > 0.7730999886989594', 'texture error > 0.7377500236034393', 'perimeter error > 2.76200008392334', 'area error > 33.064998626708984', 'smoothness error in ]0.005567499902099371, 0.009928999934345484]', 'compactness error > 0.00834800023585558', 'concavity error in ]0.018459999933838844, 0.2157999947667122]', 'fractal dimension error in ]0.0030724999960511923, 0.012140000239014626]', 'worst radius > 17.72499942779541', 'worst texture in ]15.434999942779541, 18.289999961853027]', 'worst perimeter > 120.70000076293945', 'worst area > 953.7000122070312', 'worst smoothness > 0.1363999992609024', 'worst concavity > 0.4524500072002411', 'worst concave points > 0.16029999405145645', 'worst symmetry > 0.37139999866485596', 'worst fractal dimension > 0.10035499930381775')

Setting learner.feature_names allows the to_features method to display the correct feature names. If not set, the feature names will be of the form f1, f2, f3, …, f30, where the numbers correspond to the rank of the feature in the dataset.
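
Each string returned by to_features is a condition on one feature, and the explained instance satisfies all of them. The sketch below checks this by hand. It assumes only the three textual forms that appear in the output above (>, <= and in ]a, b], the latter meaning a < x <= b). The holds helper and its regular expressions are illustrative and not part of PyXAI, and the values are copied from the instance printed earlier.

```python
import re

# Patterns for the condition forms seen in the output above (an assumption:
# other forms may exist and are not handled by this sketch).
_INTERVAL = re.compile(r"^(?P<name>.+) in \](?P<low>[^,]+), (?P<high>[^\]]+)\]$")
_THRESHOLD = re.compile(r"^(?P<name>.+?) (?P<op>>|<=) (?P<value>\S+)$")


def holds(condition, values):
    """Return True if the feature values satisfy one to_features condition."""
    match = _INTERVAL.match(condition)
    if match:
        x = values[match["name"]]
        # ]low, high] is open on the left and closed on the right
        return float(match["low"]) < x <= float(match["high"])
    match = _THRESHOLD.match(condition)
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    x = values[match["name"]]
    threshold = float(match["value"])
    return x > threshold if match["op"] == ">" else x <= threshold


# A few feature values of the explained instance, taken from the output above.
values = {"mean radius": 17.99, "mean texture": 10.38, "worst texture": 17.33}
conditions = (
    "mean radius > 15.045000076293945",
    "mean texture <= 11.585000038146973",
    "worst texture in ]15.434999942779541, 18.289999961853027]",
)
print(all(holds(c, values) for c in conditions))
```

Keying the values by feature name is the reason learner.feature_names matters here: with the default names, the dictionary would have to use f1, f2, and so on.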

Load/Save From Libraries

The creation of ML models and the computation of explanations can be done by two different programs: the first one saves the models and the second one loads them.

Scikit-learn

After importing a Scikit-learn model into PyXAI using import_models, you can save it with save and reload it later with load.

from pyxai import Learning
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier()
X, Y = datasets.load_breast_cancer(return_X_y=True)
rf.fit(X, Y)

learner, model = Learning.ModelIO.import_models(rf, instances_type='tabular')
Learning.ModelIO.save(model, "my_models/")

You can reload this model in another program using load:

from pyxai import Learning

learner, model = Learning.ModelIO.load("my_models/")
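
Since the two programs only share the directory given to save and load, the second program may want to verify that directory before loading. The following is a minimal sketch using the standard library only. The require_model_dir helper is hypothetical, and it makes no assumption about which files PyXAI writes: it only checks that the directory exists and is not empty.

```python
from pathlib import Path


def require_model_dir(path):
    """Fail early with a clear message if the model directory is missing or empty.

    Hypothetical helper, not part of PyXAI.
    """
    directory = Path(path)
    if not directory.is_dir():
        raise FileNotFoundError(f"model directory not found: {directory}")
    if not any(directory.iterdir()):
        raise FileNotFoundError(f"model directory is empty: {directory}")
    return directory
```

It would be called with the same string as load, for example require_model_dir("my_models/"), just before the call to Learning.ModelIO.load.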

XGBoost

After importing an XGBoost model into PyXAI using import_models, you can save it with save and reload it later with load. See the XGBoost documentation for the native format.

from pyxai import Learning
from sklearn import datasets
from xgboost import XGBClassifier

X, Y = datasets.load_iris(return_X_y=True)
bt = XGBClassifier(eval_metric="mlogloss")
bt.fit(X, Y)

learner, model = Learning.ModelIO.import_models(bt, instances_type='tabular')
Learning.ModelIO.save(model, "my_models/")

You can reload this model in another program using load:

from pyxai import Learning

learner, model = Learning.ModelIO.load("my_models/")

LightGBM

After importing a LightGBM model into PyXAI using import_models, you can save it with save and reload it later with load. See the LightGBM documentation for the native format.

from pyxai import Learning
from sklearn import datasets
import lightgbm

X, Y = datasets.load_iris(return_X_y=True)
lgbm = lightgbm.LGBMRegressor(n_estimators=5, random_state=0)
lgbm.fit(X, Y)

learner, model = Learning.ModelIO.import_models(lgbm, instances_type='tabular')
Learning.ModelIO.save(model, "my_models/")

You can reload this model in another program using load:

from pyxai import Learning

learner, model = Learning.ModelIO.load("my_models/")

Example with cross-validation

This example shows how to import models and compute explanations. We start by implementing a function to process the dataset:

import pandas
import numpy

def load_dataset(dataset):
    data = pandas.read_csv(dataset).copy()

    # extract labels
    labels = data[data.columns[-1]]
    labels = numpy.array(labels)

    # remove the label of each instance
    data = data.drop(columns=[data.columns[-1]])

    # extract the feature names
    feature_names = list(data.columns)

    return data.values, labels, feature_names

Then, we implement a function performing cross-validation. More precisely, we use the Leave One Group Out cross-validator of Scikit-learn and a lightgbm.LGBMRegressor from the LightGBM library:

import functools
import random
import operator
import lightgbm
from sklearn.model_selection import LeaveOneGroupOut

def cross_validation(X, Y, n_trees=100, n_forests=2):
    n_instances = len(Y)
    quotient = n_instances // n_forests
    remain = n_instances % n_forests

    # Groups creation: each instance receives a group label in 1..n_forests
    groups = [quotient*[i] for i in range(1, n_forests+1)]
    groups = functools.reduce(operator.iconcat, groups, [])
    groups += [i for i in range(1, remain+1)]
    random.shuffle(groups)

    # One model is trained per group, that group being held out as the test set
    loo = LeaveOneGroupOut()
    forests = []
    for index_training, index_test in loo.split(X, Y, groups=groups):
        # Creation of instances (X) and labels (Y) according to the indexes of loo.split()
        # for both training and test sets
        x_train = [X[x] for x in index_training]
        y_train = [Y[x] for x in index_training]
        x_test = [X[x] for x in index_test]
        y_test = [Y[x] for x in index_test]

        # Training phase (n_trees sets the number of boosted trees)
        learner = lightgbm.LGBMRegressor(n_estimators=n_trees, random_state=0)
        learner.fit(x_train, y_train)
        # Get the regressor predictions on the test set (not used further here)
        y_predict = learner.predict(x_test)

        forests.append((learner, index_training, index_test))
    return forests
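
The group creation step of cross_validation can be looked at in isolation: every instance receives a label between 1 and n_forests, the labels are spread as evenly as possible, and they are shuffled so that each held-out group is a random subset. The sketch below reproduces that step with the standard library only. The make_groups name and its seed parameter are introduced here for illustration and do not appear in the original function.

```python
import random
from collections import Counter


def make_groups(n_instances, n_groups, seed=None):
    """Assign a shuffled group label in 1..n_groups to each instance."""
    quotient, remainder = divmod(n_instances, n_groups)
    # quotient labels per group ...
    groups = [g for g in range(1, n_groups + 1) for _ in range(quotient)]
    # ... plus one extra instance for the first `remainder` groups
    groups += list(range(1, remainder + 1))
    # seed is a hypothetical addition that makes the shuffle reproducible
    random.Random(seed).shuffle(groups)
    return groups


groups = make_groups(11, 3, seed=0)
sizes = Counter(groups)
print(len(groups), sorted(sizes.items()))
```

With 11 instances and 3 groups, the first two groups hold 4 instances and the last one holds 3, so LeaveOneGroupOut yields three splits of nearly equal test size.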

Finally, we use the two previous functions and import the models into PyXAI to compute explanations.

from pyxai import Learning, Explainer

data, labels, feature_names = load_dataset("../dataset/winequality-red.csv")
results = cross_validation(data, labels, n_trees=5)

models = [result[0] for result in results]
training_indexes = [result[1] for result in results]
test_indexes = [result[2] for result in results]

learner, models = Learning.ModelIO.import_models(models, instances_type='tabular')

for i, model in enumerate(models):
    instances = learner.get_instances(dataset="../dataset/winequality-red.csv", model=model, n=2, indexes=Learning.TEST, test_indexes=test_indexes[i])

    for (instance, prediction_classifier) in instances:
        explainer = Explainer.initialize(model, instance=instance)
        prediction = model.predict_instance(instance)
        print("prediction:", prediction)
        direct = explainer.direct_reason()
        print("len direct reason:", len(direct))
        explainer.set_interval(prediction - 0.2, prediction + 0.2)
        ts = explainer.tree_specific_reason()
        print("len tree_specific_reason:", len(ts))
        print("---------------------------")

With PyXAI, you can also generate your own models. See the Generating Models page for more information.