============ Installation ============ * Install the `current PyPI release `_ .. code-block:: bash $ pip install metasklearn==0.3.0 * Install directly from source code. .. code-block:: bash $ git clone https://github.com/thieu1995/MetaSklearn.git $ cd MetaSklearn $ python setup.py install * In case, you want to install the development version from Github .. code-block:: bash $ pip install git+https://github.com/thieu1995/MetaSklearn After installation, you can check the version of installed MetaSklearn:: $ python >>> import metasklearn >>> metasklearn.__version__ ========= Tutorials ========= In this section, we will explore the usage of the MetaSklearn model with the assistance of a dataset. While all the preprocessing steps mentioned below can be replicated using Scikit-Learn, we have implemented some utility functions to provide users with convenience and faster usage. Provided classes ---------------- Classes that hold Searcher and Dataset .. code-block:: python from metasklearn import DataTransformer, Data from metasklearn import MetaSearchCV `DataTransformer` class ----------------------- We provide many scaler classes that you can select and make a combination of transforming your data via `DataTransformer` class. For example: scale data by `Loge` and then `Sqrt` and then `MinMax`. .. code-block:: python from metasklearn import DataTransformer import pandas as pd from sklearn.model_selection import train_test_split dataset = pd.read_csv('Position_Salaries.csv') X = dataset.iloc[:, 1:5].values y = dataset.iloc[:, 5].values X_train, y_train, X_test, y_test = train_test_split(X, y, test_size=0.2) dt = DataTransformer(scaling_methods=("loge", "sqrt", "minmax")) X_train_scaled = dt.fit_transform(X_train) X_test_scaled = dt.transform(X_test) `Data` class ------------ + You can load your dataset into Data class + You can split dataset to train and test set + You can scale dataset without using DataTransformer class + You can scale labels using LabelEncoder .. code-block:: python from metasklearn import Data import pandas as pd dataset = pd.read_csv('Position_Salaries.csv') X = dataset.iloc[:, 1:5].values y = dataset.iloc[:, 5].values data = Data(X, y, name="position_salaries") #### Split dataset into train and test set data.split_train_test(test_size=0.2, shuffle=True, random_state=100, inplace=True) #### Feature Scaling data.X_train, scaler_X = data.scale(data.X_train, scaling_methods=("standard", "sqrt", "minmax")) data.X_test = scaler_X.transform(data.X_test) data.y_train, scaler_y = data.encode_label(data.y_train) # This is for classification problem only data.y_test = scaler_y.transform(data.y_test) `Searcher` class ---------------- In this example, we will use `MetaSearchCV` to search for the best hyper-parameters of the SVM model. .. code-block:: python from sklearn.svm import SVC from sklearn.datasets import load_iris from sklearn.model_selection import train_test_split from metasklearn import MetaSearchCV, FloatVar, StringVar # Load dataset X, y = load_iris(return_X_y=True) X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42) # Define param bounds for SVC # param_bounds = { ==> This is for GridSearchCV, show you how to convert to our MetaSearchCV # "C": [0.1, 100], # "gamma": [1e-4, 1], # "kernel": ["linear", "rbf", "poly"] # } param_bounds = [ FloatVar(lb=0., ub=100., name="C"), FloatVar(lb=1e-4, ub=1., name="gamma"), StringVar(valid_sets=("linear", "rbf", "poly"), name="kernel") ] # Initialize and fit MetaSearchCV searcher = MetaSearchCV( estimator=SVC(), param_bounds=param_bounds, task_type="classification", optim="BaseGA", # Using Genetic Algorithm for hyper-parameter optimization optim_params={"epoch": 20, "pop_size": 30, "name": "GA"}, cv=3, scoring="AS", # or any custom scoring like "F1_macro" seed=42, n_jobs=2, verbose=True, mode='single', n_workers=None, termination=None ) searcher.fit(X_train, y_train) print("Best parameters (Classification):", searcher.best_params) print("Best model: ", searcher.best_estimator) print("Best score during searching: ", searcher.best_score) # Make prediction after re-fit y_pred = searcher.predict(X_test) print("Test Accuracy:", searcher.score(X_test, y_test)) print("Test Score: ", searcher.scores(X_test, y_test, list_metrics=("AS", "RS", "PS", "F1S"))) Please check out the examples for more details on how to use the `MetaSearchCV` class. .. toctree:: :maxdepth: 4 .. toctree:: :maxdepth: 4 .. toctree:: :maxdepth: 4