# Rectification for Random Forests

To rectify a random forest, we simply rectify each of its trees.
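The idea of rectifying tree by tree can be illustrated with a small, self-contained sketch. This is not PyXAI's implementation: the trees here are plain Python functions, and the names `rectify_tree`, `forest_predict`, `rule`, and the toy features `priors` and `score` are all hypothetical. The sketch assumes a majority-vote forest and only shows that, once every tree returns the requested label on the instances covered by a rule, the forest does as well, while other instances are left untouched.

```python
from collections import Counter
from typing import Callable, Dict, List

Instance = Dict[str, float]
Tree = Callable[[Instance], int]


def rectify_tree(tree: Tree, condition: Callable[[Instance], bool], label: int) -> Tree:
    """Return a tree that outputs `label` whenever `condition` holds,
    and behaves like the original tree otherwise."""
    def rectified(x: Instance) -> int:
        return label if condition(x) else tree(x)
    return rectified


def forest_predict(trees: List[Tree], x: Instance) -> int:
    """Majority vote over the individual tree predictions."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Three toy "trees" (hypothetical, for illustration only).
trees: List[Tree] = [
    lambda x: 1 if x["priors"] > 2 else 0,
    lambda x: 1 if x["score"] == 1 else 0,
    lambda x: 0,
]

# Hypothetical classification rule: covered instances must be labelled 1.
rule = lambda x: x["priors"] == 0 and x["score"] == 0

# Rectifying the forest amounts to rectifying each tree with the same rule.
rectified_trees = [rectify_tree(tree, rule, label=1) for tree in trees]

covered = {"priors": 0, "score": 0}
uncovered = {"priors": 5, "score": 0}

print("before:", forest_predict(trees, covered), forest_predict(trees, uncovered))
print("after: ", forest_predict(rectified_trees, covered), forest_predict(rectified_trees, uncovered))
```

Because every rectified tree agrees on the covered instances, the vote there is unanimous; for instances outside the rule, each tree votes exactly as before.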

## Example from a Real Dataset

For this example, we use the compas.csv dataset. We train a model using the hold-out approach (by default, the test size is set to 30%) and select a misclassified instance.

In [1]:
from pyxai import Learning, Explainer

learner = Learning.Scikitlearn("../dataset/compas.csv", learner_type=Learning.CLASSIFICATION)
model = learner.evaluate(method=Learning.HOLD_OUT, output=Learning.RF)

dict_information = learner.get_instances(model, n=1, indexes=Learning.TEST, correct=False, details=True)

instance = dict_information["instance"]
label = dict_information["label"]
prediction = dict_information["prediction"]

print("prediction:", prediction)

data:
      Number_of_Priors  score_factor  Age_Above_FourtyFive  \
0                    0             0                     1   
1                    0             0                     0   
2                    4             0                     0   
3                    0             0                     0   
4                   14             1                     0   
...                ...           ...                   ...   
6167                 0             1                     0   
6168                 0             0                     0   
6169                 0             0                     1   
6170                 3             0                     0   
6171                 2             0                     0   

      Age_Below_TwentyFive  African_American  Asian  Hispanic  \
0                        0                 0      0         0   
1                        0                 1      0         0   
2                        1                 1      0   

We activate the explainer with the associated theory and the selected instance: 

In [2]:
compas_types = {
    "numerical": ["Number_of_Priors"],
    "binary": ["Misdemeanor", "score_factor", "Female"],
    "categorical": {"{African_American,Asian,Hispanic,Native_American,Other}": ["African_American", "Asian", "Hispanic", "Native_American", "Other"],
                    "Age*": ["Above_FourtyFive", "Below_TwentyFive"]}
}


explainer = Explainer.initialize(model, instance=instance, features_type=compas_types)

---------   Theory Feature Types   -----------
Before the one-hot encoding of categorical features:
Numerical features: 1
Categorical features: 2
Binary features: 3
Number of features: 6
Characteristics of categorical features: {'African_American': ['{African_American,Asian,Hispanic,Native_American,Other}', 'African_American', ['African_American', 'Asian', 'Hispanic', 'Native_American', 'Other']], 'Asian': ['{African_American,Asian,Hispanic,Native_American,Other}', 'Asian', ['African_American', 'Asian', 'Hispanic', 'Native_American', 'Other']], 'Hispanic': ['{African_American,Asian,Hispanic,Native_American,Other}', 'Hispanic', ['African_American', 'Asian', 'Hispanic', 'Native_American', 'Other']], 'Native_American': ['{African_American,Asian,Hispanic,Native_American,Other}', 'Native_American', ['African_American', 'Asian', 'Hispanic', 'Native_American', 'Other']], 'Other': ['{African_American,Asian,Hispanic,Native_American,Other}', 'Other', ['African_American', 'Asian', 'Hispanic', 'Na

We compute a majoritary reason explaining why the model predicts 0 for this instance:

In [5]:
reason = explainer.majoritary_reason(n=1)
print("explanation:", reason)
print("to_features:", explainer.to_features(reason))

explanation: (-2, -3, -6, 9)
to_features: ('Number_of_Priors <= 0.5', 'score_factor = 0', 'Age != Below_TwentyFive', 'Misdemeanor = 1')
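To make concrete what it means for an instance to be covered by this explanation, the following sketch checks the four feature conditions printed above against a dictionary of feature values. It is only an illustration under assumptions: `is_covered` is a hypothetical helper (not part of PyXAI), the sample values are made up, and `Age != Below_TwentyFive` is assumed to correspond to the one-hot column `Age_Below_TwentyFive` being 0.

```python
from typing import Dict


def is_covered(x: Dict[str, float]) -> bool:
    """Hypothetical helper: True if x satisfies all four conditions of the
    explanation ('Number_of_Priors <= 0.5', 'score_factor = 0',
    'Age != Below_TwentyFive', 'Misdemeanor = 1')."""
    return (
        x["Number_of_Priors"] <= 0.5
        and x["score_factor"] == 0
        and x["Age_Below_TwentyFive"] == 0  # assumed encoding of Age != Below_TwentyFive
        and x["Misdemeanor"] == 1
    )


# Made-up feature values, for illustration only.
sample_covered = {"Number_of_Priors": 0, "score_factor": 0,
                  "Age_Below_TwentyFive": 0, "Misdemeanor": 1}
sample_not_covered = {"Number_of_Priors": 4, "score_factor": 0,
                      "Age_Below_TwentyFive": 1, "Misdemeanor": 1}

print(is_covered(sample_covered))
print(is_covered(sample_not_covered))
```

Features that do not appear in the explanation play no role in the check: any instance agreeing on these four conditions is covered, whatever its other values.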


Suppose the user knows that every instance covered by the explanation (-2, -3, -6, 9) should be classified as positive (label 1). The model must then be rectified with the corresponding classification rule.
Once the model has been rectified, the instance is classified as the user expects:

In [6]:
model = explainer.rectify(conditions=reason, label=1)
print("new prediction:", model.predict_instance(instance))


-------------- Rectification information:
Classification Rule - Number of nodes: 9
Model - Number of nodes: 89814
Model - Number of nodes (after rectification): 290854
Model - Number of nodes (after simplification using the theory): 93768
Model - Number of nodes (after elimination of redundant nodes): 60176
--------------
new prediction: 1
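
The node counts in the rectification information can be read as sizes relative to the original model. The short computation below uses only the numbers printed above; the dictionary keys and variable names are illustrative choices, not PyXAI output.

```python
# Node counts copied from the rectification information above.
node_counts = {
    "original model": 89814,
    "after rectification": 290854,
    "after simplification using the theory": 93768,
    "after elimination of redundant nodes": 60176,
}

original = node_counts["original model"]

# Size of each stage relative to the original model.
ratios = {stage: count / original for stage, count in node_counts.items()}

for stage, ratio in ratios.items():
    print(f"{stage}: {node_counts[stage]} nodes ({ratio:.2f}x the original)")
```

In this run, rectification alone roughly triples the number of nodes, while the two simplification steps bring the final model below its original size.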
