metasklearn.core package

metasklearn.core.problem module

class metasklearn.core.problem.HyperparameterProblem(bounds=None, minmax='max', X=None, y=None, estimator=None, metric_class=None, obj_name=None, sklearn_score=None, cv=None, n_jobs=None, shuffle=True, seed=None, **kwargs)

Bases: Problem

A class to define a hyperparameter optimization problem for machine learning models.

Inherits from the Problem class in the mealpy library and provides functionality to evaluate hyperparameter configurations using cross-validation.

estimator: The machine learning model to optimize.

X: The feature matrix.

y: The target vector.

metric_class: A custom metric class for evaluation.

obj_name: The name of the objective metric.

cv: The number of cross-validation folds.

n_jobs: The number of parallel jobs for cross-validation.

shuffle: Whether to shuffle the data before splitting into folds.

kf: The KFold cross-validator instance.

get_obj_score_: The scoring function to use (either sklearn or custom).

obj_func(x)

Objective function to evaluate a hyperparameter configuration.

Parameters: x – The encoded hyperparameter configuration.
Returns: The evaluation score for the given configuration.
Return type: float
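
The evaluation described above can be pictured with a small, library-free sketch. It is not metasklearn's implementation: the toy 1-D k-nearest-neighbour model, the helper names (`kfold_indices`, `knn_predict`, `obj_func_sketch`), and the decoding of `x[0]` into an integer `k` are all assumptions made for illustration.

```python
import random
from statistics import mean


def kfold_indices(n_samples, n_splits, shuffle=True, seed=None):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    start = 0
    for fold in range(n_splits):
        # Spread the remainder over the first folds so sizes differ by at most 1.
        size = n_samples // n_splits + (1 if fold < n_samples % n_splits else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def knn_predict(train_X, train_y, query, k):
    """Toy 1-D k-nearest-neighbour classifier (hypothetical stand-in estimator)."""
    nearest = sorted(range(len(train_X)), key=lambda i: abs(train_X[i] - query))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)


def obj_func_sketch(x, X, y, cv=5, shuffle=True, seed=None):
    """Decode an encoded configuration and return its mean cross-validated accuracy."""
    k = max(1, int(round(x[0])))  # decode: first position holds the neighbour count
    fold_scores = []
    for train_idx, test_idx in kfold_indices(len(X), cv, shuffle, seed):
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
        fold_scores.append(hits / len(test_idx))
    return mean(fold_scores)


# Two well-separated clusters, so a small k classifies every held-out point.
X = [i / 10 for i in range(10)] + [5 + i / 10 for i in range(10)]
y = [0] * 10 + [1] * 10
score = obj_func_sketch([3.2], X, y, cv=5, seed=42)
```

The optimizer only ever sees the encoded vector `x` and the returned float; the decoding and the fold loop stay inside the objective.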

metasklearn.core.search module

class metasklearn.core.search.MetaSearchCV(estimator, param_bounds, task_type='classification', optim='BaseGA', optim_params=None, cv=5, scoring=None, seed=None, n_jobs=1, verbose=True, mode='single', n_workers=None, termination=None, **kwargs)

Bases: BaseEstimator

A metaheuristic-powered hyperparameter optimization framework for scikit-learn models.

This class uses metaheuristic optimization algorithms from the Mealpy library to perform hyperparameter tuning for scikit-learn-compatible models via cross-validation.

Parameters:

estimator (BaseEstimator) – The machine learning model to optimize. Must implement scikit-learn’s fit/predict interface.
param_bounds (list, tuple, or dict) – The search bounds of the hyperparameters to be optimized.
task_type (str, default='classification') – The type of task: ‘classification’ or ‘regression’. Determines the evaluation metric used.
optim (str or Optimizer, default='BaseGA') – The name of the metaheuristic algorithm to use (from Mealpy), or an Optimizer instance.
optim_params (dict, optional) – Dictionary of additional parameters passed to the optimizer (e.g., pop_size, epoch, etc.).
cv (int, default=5) – Number of cross-validation folds.
scoring (str, optional) – Name of the scoring metric. Can be a scikit-learn metric or a custom metric supported by permetrics.
seed (int, optional) – Random seed for reproducibility.
n_jobs (int, default=1) – Number of jobs to run in parallel during cross-validation.
verbose (bool, default=True) – Whether to display logs and progress during optimization.
mode (str, default='single') – Execution mode for the optimizer: ‘single’, ‘swarm’, ‘thread’, or ‘process’.
n_workers (int, optional) – Number of parallel workers used by the optimizer in threaded or multiprocessing mode.
termination (dict, optional) – Dictionary defining custom termination conditions for the optimizer.
**kwargs – Additional keyword arguments passed to the internal problem definition.
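
As a rough illustration of the two dictionary arguments, the fragment below shows what `optim_params` and `termination` might contain. The keys `epoch` and `pop_size` come from the parameter description above. The `termination` keys are assumed from Mealpy's termination options and should be checked against the installed Mealpy version.

```python
# Illustrative values only; tune them for the problem at hand.
optim_params = {
    "epoch": 50,      # number of optimizer iterations
    "pop_size": 20,   # number of candidate configurations per iteration
}

# Assumed Mealpy-style termination keys (verify against your Mealpy version).
termination = {
    "max_epoch": 50,        # stop after this many iterations
    "max_time": 120,        # stop after this many seconds
    "max_early_stop": 10,   # stop after this many iterations without improvement
}
```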

best_params

The best hyperparameter configuration found during optimization.

Type: dict

best_estimator

A clone of the input estimator trained with the best-found parameters.

Type: BaseEstimator

best_score

The best evaluation score achieved during optimization.

Type: float

loss_train

The best score recorded at each iteration of the optimization.

Type: list

problem

Internal representation of the hyperparameter optimization problem.

Type: HyperparameterProblem

SUPPORTED_CLS_METRICS = {'AS': 'max', 'BSL': 'min', 'CEL': 'min', 'CKS': 'max', 'F1S': 'max', 'F2S': 'max', 'FBS': 'max', 'GINI': 'min', 'GMS': 'max', 'HL': 'min', 'HS': 'max', 'JSI': 'max', 'KLDL': 'min', 'LS': 'max', 'MCC': 'max', 'NPV': 'max', 'PS': 'max', 'ROC-AUC': 'max', 'RS': 'max', 'SS': 'max'}

SUPPORTED_REG_METRICS = {'A10': 'max', 'A20': 'max', 'A30': 'max', 'ACOD': 'max', 'APCC': 'max', 'AR': 'max', 'AR2': 'max', 'CI': 'max', 'COD': 'max', 'COR': 'max', 'COV': 'max', 'CRM': 'min', 'DRV': 'min', 'EC': 'max', 'EVS': 'max', 'GINI': 'min', 'GINI_WIKI': 'min', 'JSD': 'min', 'KGE': 'max', 'MAAPE': 'min', 'MAE': 'min', 'MAPE': 'min', 'MASE': 'min', 'ME': 'min', 'MRB': 'min', 'MRE': 'min', 'MSE': 'min', 'MSLE': 'min', 'MedAE': 'min', 'NNSE': 'max', 'NRMSE': 'min', 'NSE': 'max', 'OI': 'max', 'PCC': 'max', 'PCD': 'max', 'R': 'max', 'R2': 'max', 'R2S': 'max', 'RAE': 'min', 'RMSE': 'min', 'RSE': 'min', 'RSQ': 'max', 'SMAPE': 'min', 'VAF': 'max', 'WI': 'max'}
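
Each entry in these two dictionaries maps a metric name to the direction in which it is optimized. A minimal sketch of how such a table can drive a comparison follows; `METRIC_DIRECTION` is a hand-copied excerpt of the tables above and `is_improvement` is a hypothetical helper, not part of metasklearn.

```python
# Excerpt copied from SUPPORTED_CLS_METRICS and SUPPORTED_REG_METRICS above.
METRIC_DIRECTION = {"AS": "max", "F1S": "max", "HL": "min", "RMSE": "min", "R2": "max"}


def is_improvement(metric, candidate, incumbent):
    """Return True if `candidate` beats `incumbent` for the given metric."""
    direction = METRIC_DIRECTION[metric]  # raises KeyError for unknown metrics
    if direction == "max":
        return candidate > incumbent
    return candidate < incumbent


higher_accuracy_wins = is_improvement("AS", 0.93, 0.90)
lower_rmse_wins = is_improvement("RMSE", 1.8, 2.4)
```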

evaluate(y_true, y_pred, list_metrics=('AS', 'RS'))

Evaluates the model’s predictions using the specified metrics.

Parameters:

y_true – The ground truth target values.
y_pred – The predicted target values.
list_metrics – A list of metric names to evaluate.

Returns:

A dictionary of metric names and their corresponding values.

Return type:

dict

fit(X, y)

Fits the model using the provided data and performs hyperparameter optimization.

Parameters:

X – The feature matrix.
y – The target vector.

Returns:

The fitted instance.

Return type:

MetaSearchCV

static load_model(load_path='history', filename='network.pkl')

Load a saved model from a pickle file.

Parameters:

load_path (str, default="history") – Directory containing the saved file.
filename (str, default="network.pkl") – Name of the file (must end with .pkl).

Returns:

model – Loaded model instance.

Return type:

MetaSearchCV

predict(X)

Predicts the target values for the given feature matrix.

Parameters: X – The feature matrix.
Returns: The predicted target values.
Return type: np.ndarray
Raises: ValueError – If the model is not trained.

save_convergence(save_path='history', filename='convergence.csv')

Save the convergence history (the fitness value recorded during training) to a CSV file.

Parameters:

save_path (str, default="history") – Directory to save to, relative to the path of the currently executing script.
filename (str, default="convergence.csv") – Name of the file (must end with .csv).
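
A minimal sketch of writing a convergence history to CSV with only the standard library. The column names (`epoch`, `fitness`) and the helper `write_convergence` are assumptions for illustration; the file layout metasklearn actually produces may differ.

```python
import csv
import os
import tempfile


def write_convergence(loss_train, save_path="history", filename="convergence.csv"):
    """Write one row per iteration with the fitness recorded at that iteration."""
    if not filename.endswith(".csv"):
        raise ValueError("filename must have a '.csv' extension")
    os.makedirs(save_path, exist_ok=True)
    full_path = os.path.join(save_path, filename)
    with open(full_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["epoch", "fitness"])
        for epoch, fitness in enumerate(loss_train, start=1):
            writer.writerow([epoch, fitness])
    return full_path


# Hypothetical best-score history, written to a temporary directory.
out_dir = tempfile.mkdtemp()
csv_path = write_convergence([0.81, 0.86, 0.86, 0.9], save_path=out_dir)
with open(csv_path, newline="") as handle:
    rows = list(csv.reader(handle))
```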

save_model(save_path='history', filename='network.pkl')

Save the model to a pickle file.

Parameters:

save_path (str, default="history") – Directory to save to, relative to the path of the currently executing script.
filename (str, default="network.pkl") – Name of the file (must end with .pkl).
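
The pairing of save_model and load_model amounts to a pickle round trip. The sketch below shows that pattern with the standard library only; `save_object` and `load_object` are hypothetical helpers, and a plain dictionary stands in for a fitted model, so this is not the library's own code.

```python
import os
import pickle
import tempfile


def save_object(obj, save_path="history", filename="network.pkl"):
    """Pickle `obj` to save_path/filename, creating the directory if needed."""
    if not filename.endswith(".pkl"):
        raise ValueError("filename must have a '.pkl' extension")
    os.makedirs(save_path, exist_ok=True)
    full_path = os.path.join(save_path, filename)
    with open(full_path, "wb") as handle:
        pickle.dump(obj, handle)
    return full_path


def load_object(load_path="history", filename="network.pkl"):
    """Load a pickled object from load_path/filename."""
    with open(os.path.join(load_path, filename), "rb") as handle:
        return pickle.load(handle)


# A plain dict stands in for a fitted model in this sketch.
stand_in_model = {"best_params": {"C": 1.5}, "best_score": 0.93}
out_dir = tempfile.mkdtemp()
save_object(stand_in_model, save_path=out_dir)
restored = load_object(load_path=out_dir)
```

Unpickling executes code embedded in the file, so only load pickle files from sources you trust.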

save_performance_metrics(y_true, y_pred, list_metrics=('RMSE', 'MAE'), save_path='history', filename='metrics.csv')

Save evaluation metrics to a CSV file.

Parameters:

y_true – The ground truth target values.
y_pred – The predicted target values.
list_metrics – A list of metric names to evaluate.
save_path (str, default="history") – Directory to save to, relative to the path of the currently executing script.
filename (str, default="metrics.csv") – Name of the file (must end with .csv).
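
To make the default metric names concrete, the sketch below computes RMSE and MAE from their textbook definitions and writes them as a single CSV row. The helper names and the one-row file layout are assumptions for illustration; metasklearn computes its metrics through its own machinery and may lay the file out differently.

```python
import csv
import math
import os
import tempfile


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical lookup from metric name to implementation.
METRIC_FUNCS = {"RMSE": rmse, "MAE": mae}


def write_metrics(y_true, y_pred, list_metrics=("RMSE", "MAE"),
                  save_path="history", filename="metrics.csv"):
    """Compute the requested metrics, write them as one CSV row, and return them."""
    results = {name: METRIC_FUNCS[name](y_true, y_pred) for name in list_metrics}
    os.makedirs(save_path, exist_ok=True)
    with open(os.path.join(save_path, filename), "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(results))
        writer.writeheader()
        writer.writerow(results)
    return results


# Errors are 1, 0 and -2, so MAE is 1.0 and RMSE is sqrt(5/3).
results = write_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0], save_path=tempfile.mkdtemp())
```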

save_y_predicted(X, y_true, save_path='history', filename='y_predicted.csv')

Save the predicted results to a CSV file.

Parameters:

X (np.ndarray) – The feature matrix.
y_true – The ground truth target values.
save_path (str, default="history") – Directory to save to, relative to the path of the currently executing script.
filename (str, default="y_predicted.csv") – Name of the file (must end with .csv).

score(X, y)

Computes the score of the model on the given data.

Parameters:

X – The feature matrix.
y – The target vector.

Returns:

The score of the model.

Return type:

float

Raises:

ValueError – If the model is not trained.

scores(X, y, list_metrics=('AS', 'RS'))

Computes evaluation metrics for the model’s predictions.

Parameters:

X – The feature matrix.
y – The target vector.
list_metrics – A list of metric names to evaluate.

Returns:

A dictionary of metric names and their corresponding values.

Return type:

dict