Class LearnerInformation
This class stores information about a learner, a dataset and a model.
See the __init__ method for details.
def __init__(self,
# About the learner
learner_name=None,
problem_type=None,
model_type=None,
instances_type=None,
labels_type=None,
splitting_method=None,
# About the dataset
dataset_path=None,
n_features=None,
n_labels=None,
feature_names=None,
label_names=None,
instances_directory=None,
labels_directory=None,
get_item_function=None,
# About the model
raw_model=None,
metrics=None,
extras={},
train_index=None,
test_index=None,
groups=None,
):
Sets information about a learner, a dataset and a model.
Parameters
learner_name : str (optional, default=None)
The name of the library used for the evaluation.
problem_type : str | ProblemType (optional, default=None)
The type of problem (classification, regression, …)
Possible values are defined in the ProblemType enum.
model_type : str | ModelType (optional, default=None)
The type of model (linear, tree-based, neural network, …)
Possible values are defined in the ModelType enum.
instances_type : str | InstancesType (optional, default=None)
The type of instances (image, tabular, text, temporal, …)
Possible values are defined in the InstancesType enum.
labels_type : str | LabelsType (optional, default=None)
The type of labels (class, text, mask, contours, …)
Possible values are defined in the LabelsType enum.
splitting_method : str | SplittingMethod (optional, default=None)
The splitting method used for the evaluation (hold-out, k-folds, …)
Possible values are defined in the SplittingMethod enum.
dataset_path : str (optional, default=None)
The path and the filename of the dataset.
n_features : int (optional, default=None)
The number of features.
n_labels : int (optional, default=None)
The number of labels.
feature_names : list of str (optional, default=None)
The names of the features used in the model.
label_names : list of str (optional, default=None)
The names of the labels (without duplicates)
instances_directory : str (optional, default=None)
The directory of instances.
labels_directory : str (optional, default=None)
The directory of labels.
get_item_function : Callable (optional, default=None)
A function to get an instance from the dataset. This function is used to get an instance in the right format for the model.
If the dataset is a pandas DataFrame and the instances are tabular, this function is not necessary and can be set to None.
In other cases, this function should be defined by the user. It should take as input a row of the dataframe and return the corresponding instance in the right format for the model.
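As an illustration of the last case, here is a minimal sketch of a user-defined function for text instances. The function name `get_text_instance`, the `"text"` column and the token-list output format are assumptions made for this example; the format your model actually expects may differ.

```python
from typing import List, Mapping


def get_text_instance(row: Mapping[str, object]) -> List[str]:
    """Hypothetical get_item_function: turn one dataset row into a token list."""
    # Assumption: the row exposes a "text" column. A pandas Series supports
    # the same row["text"] lookup as the plain dict used below.
    text = str(row["text"])
    # Lowercase and split on whitespace so the model receives a list of tokens.
    return text.lower().split()


# A plain dict stands in for a DataFrame row to keep the sketch dependency-free.
example_row = {"text": "The quick brown Fox", "label": "animal"}
tokens = get_text_instance(example_row)
print(tokens)
```

Such a function would then be passed as `get_item_function=get_text_instance`.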
raw_model : DecisionTreeClassifier | RandomForestClassifier | XGBClassifier | XGBRegressor | LGBMRegressor (optional, default=None)
The raw model object from the library used to create the model (sklearn, xgboost, lightgbm, …)
metrics : dict (optional, default=None)
A dictionary containing some metrics about the evaluation (precision, recall, f1_score, specificity, …)
More information is given in the Metrics page.
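To make the shape of such a dictionary concrete, the sketch below builds one from binary confusion-matrix counts. The helper `build_metrics` is hypothetical, and the keys are only the examples named above; the exact keys the library expects are described in the Metrics page.

```python
def build_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Hypothetical helper: derive common binary metrics from confusion counts."""

    def ratio(numerator: float, denominator: float) -> float:
        # Return 0.0 instead of raising when a denominator is zero.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    # F1 is the harmonic mean of precision and recall.
    f1_score = ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "f1_score": f1_score,
        "specificity": specificity,
    }


metrics = build_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```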
extras : dict (optional, default={})
Extra information from the raw model (the type, model parameters, base_score, …)
train_index : list of int (optional, default=None)
A list of training indexes used during the evaluation (cross-validation)
test_index : list of int (optional, default=None)
A list of test indexes used during the evaluation (cross-validation)
groups : list of int (optional, default=None)
The groups used for the leave-one-group-out evaluation
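The relationship between `groups`, `train_index` and `test_index` can be sketched in plain Python. This is only an illustration of the leave-one-group-out idea, not the library's own splitting code; the function `leave_one_group_out` is a hypothetical name introduced here.

```python
from typing import Iterator, List, Tuple


def leave_one_group_out(
    groups: List[int],
) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (held_out_group, train_index, test_index) for each distinct group."""
    for held_out in sorted(set(groups)):
        # Instances of the held-out group form the test set ...
        test_index = [i for i, g in enumerate(groups) if g == held_out]
        # ... and every other instance is used for training.
        train_index = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_index, test_index


# One group id per instance, aligned with the rows of the dataset.
groups = [0, 0, 1, 1, 2]
splits = list(leave_one_group_out(groups))
for held_out, train_index, test_index in splits:
    print(held_out, train_index, test_index)
```

Each split yields one pair of index lists of the kind described by `train_index` and `test_index`, with `groups` giving the group of every instance.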