Pipeline Regression With GASearchCV

This notebook shows how to tune a scikit-learn Pipeline with GASearchCV. The goal is still to predict with a tuned pipeline, and the example also covers a fuller regression workflow: holdout metrics, optimizer telemetry, and advanced optimizer controls.

Problem Setup

We use the diabetes regression dataset and tune a pipeline containing StandardScaler and GradientBoostingRegressor. Pipeline parameters use the usual sklearn double-underscore syntax, such as regressor__max_depth.
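
As a minimal sketch of that naming convention (the step names match the pipeline built later in this notebook; the variable names are only illustrative), get_params lists every addressable parameter and set_params accepts the same step__parameter keys:

```python
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Same two steps as the pipeline tuned in this notebook.
pipeline = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("regressor", GradientBoostingRegressor(random_state=42)),
    ]
)

# Every parameter of the "regressor" step is exposed as "regressor__<name>".
regressor_params = sorted(
    name for name in pipeline.get_params() if name.startswith("regressor__")
)
print(regressor_params[:5])

# set_params routes the value to the named step.
pipeline.set_params(regressor__max_depth=2)
print(pipeline.named_steps["regressor"].max_depth)
```

Any key listed in regressor_params can be used as a key in the search space defined later.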

[1]:

import warnings
from pprint import pprint

import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import KFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

from sklearn_genetic import (
    EvolutionConfig,
    GASearchCV,
    OptimizationConfig,
    PopulationConfig,
    RuntimeConfig,
)
from sklearn_genetic.callbacks import ConsecutiveStopping, DeltaThreshold, TimerStopping
from sklearn_genetic.plots import plot_fitness_evolution, plot_search_space
from sklearn_genetic.schedules import ExponentialAdapter, InverseAdapter
from sklearn_genetic.space import Categorical, Continuous, Integer

warnings.filterwarnings("ignore", category=UserWarning)

RANDOM_STATE = 42

[2]:

data = load_diabetes(as_frame=True)
X = data.data
y = data.target

X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.30,
    random_state=RANDOM_STATE,
)

cv = KFold(n_splits=4, shuffle=True, random_state=RANDOM_STATE)

print(f"Training shape: {X_train.shape}")
print(f"Test shape: {X_test.shape}")

Training shape: (309, 10)
Test shape: (133, 10)

Baseline Pipeline

A baseline gives us a sanity check before optimizing. The helper below returns three common regression metrics: RMSE and MAE, where lower is better, and R2, where higher is better.

[3]:

def make_pipeline(**regressor_kwargs):
    return Pipeline(
        [
            ("scaler", StandardScaler()),
            (
                "regressor",
                GradientBoostingRegressor(random_state=RANDOM_STATE, **regressor_kwargs),
            ),
        ]
    )


def regression_metrics(estimator, X_eval, y_eval):
    predictions = estimator.predict(X_eval)
    rmse = mean_squared_error(y_eval, predictions) ** 0.5
    return {
        "r2": r2_score(y_eval, predictions),
        "rmse": rmse,
        "mae": mean_absolute_error(y_eval, predictions),
    }


baseline = make_pipeline()
baseline.fit(X_train, y_train)
baseline_metrics = regression_metrics(baseline, X_test, y_test)
baseline_metrics

[3]:

{'r2': 0.43031868253825245,
 'rmse': 55.45552342062193,
 'mae': 44.71796061792019}
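
To make the three metrics concrete, here is a small worked example on made-up numbers, using only the standard library. The toy values are an assumption for illustration and have nothing to do with the diabetes data.

```python
import math

# Toy targets and predictions (illustrative values only).
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]

residuals = [t - p for t, p in zip(y_true, y_pred)]  # [1.0, 0.0, -2.0]
n = len(y_true)

# RMSE: square root of the mean squared residual.
mse = sum(r**2 for r in residuals) / n
rmse = math.sqrt(mse)

# MAE: mean absolute residual.
mae = sum(abs(r) for r in residuals) / n

# R2: one minus the ratio of residual to total sum of squares.
mean_true = sum(y_true) / n
ss_res = sum(r**2 for r in residuals)
ss_tot = sum((t - mean_true) ** 2 for t in y_true)
r2 = 1 - ss_res / ss_tot

print(f"rmse={rmse:.4f} mae={mae:.4f} r2={r2:.4f}")
# prints: rmse=1.2910 mae=1.0000 r2=0.3750
```

RMSE exceeds MAE here because squaring weights the single large residual more heavily.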

Define Pipeline Search Space

GASearchCV receives the same parameter names you would use with sklearn grid search. The values are sklearn-genetic-opt space objects instead of fixed grids or scipy distributions.

[4]:

param_grid = {
    "regressor__n_estimators": Integer(40, 180),
    "regressor__learning_rate": Continuous(0.01, 0.20, distribution="log-uniform"),
    "regressor__max_depth": Integer(1, 4),
    "regressor__min_samples_leaf": Integer(1, 12),
    "regressor__subsample": Continuous(0.65, 1.0),
    "regressor__loss": Categorical(["squared_error", "absolute_error", "huber"]),
}
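
The learning rate above uses a log-uniform distribution. As a rough illustration of what that means, the sketch below draws values whose logarithm is uniform over the same bounds. The helper sample_log_uniform is hypothetical and is not the library's sampler; it only shows why log-uniform sampling spends more draws on small values.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(42)
low, high = 0.01, 0.20
draws = [sample_log_uniform(low, high, rng) for _ in range(10_000)]

# About half of the draws fall below the geometric midpoint (about 0.0447),
# whereas a plain uniform draw would put half of its samples below the
# arithmetic midpoint of 0.105.
geometric_mid = math.sqrt(low * high)
share_below = sum(d < geometric_mid for d in draws) / len(draws)
print(f"geometric midpoint: {geometric_mid:.4f}, share below: {share_below:.3f}")
```

This is usually the reason to choose a log scale for parameters such as learning rates, where the difference between 0.01 and 0.02 matters more than the difference between 0.19 and 0.20.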

Configure GASearchCV

This search uses performance and quality controls: smart initialization, warm starts, adaptive schedules, diversity control, fitness sharing, local refinement, cache reuse, and automatic parallel backend selection.

[5]:

search = GASearchCV(
    estimator=make_pipeline(),
    param_grid=param_grid,
    scoring="neg_root_mean_squared_error",
    criteria="max",
    cv=cv,
    evolution_config=EvolutionConfig(
        population_size=12,
        generations=10,
        crossover_probability=ExponentialAdapter(initial_value=0.8, end_value=0.4, adaptive_rate=0.15),
        mutation_probability=InverseAdapter(initial_value=0.25, end_value=0.08, adaptive_rate=0.25),
        tournament_size=3,
        elitism=True,
        keep_top_k=3,
    ),
    population_config=PopulationConfig(
        initializer="smart",
        warm_start_configs=[
            {
                "regressor__n_estimators": 100,
                "regressor__learning_rate": 0.05,
                "regressor__max_depth": 2,
                "regressor__min_samples_leaf": 4,
                "regressor__subsample": 0.85,
                "regressor__loss": "squared_error",
            }
        ],
    ),
    runtime_config=RuntimeConfig(
        n_jobs=-1,
        parallel_backend="auto",
        use_cache=True,
        verbose=True,
        return_train_score=False,
    ),
    optimization_config=OptimizationConfig(
        local_search=True,
        local_search_top_k=2,
        local_search_steps=1,
        local_search_radius=0.20,
        diversity_control=True,
        diversity_threshold=0.30,
        diversity_stagnation_generations=3,
        diversity_mutation_boost=1.8,
        random_immigrants_fraction=0.10,
        fitness_sharing=True,
        sharing_radius=0.40,
    ),
)

callbacks = [
    DeltaThreshold(threshold=0.01, generations=5, metric="fitness_best"),
    ConsecutiveStopping(generations=7, metric="fitness_best"),
    TimerStopping(total_seconds=120),
]

search.fit(X_train, y_train, callbacks=callbacks)

 gen evals           avg          best     div  unique  stag     mut   sel             events
---- ----- ------------- ------------- ------- ------- ----- ------- ----- ------------------
   0    12     -61.14987     -59.23358   0.682   1.000     0       -     - -
   1    24     -60.65867     -59.23358   0.288   0.667     1   0.200     3 dup=3,share
   2    24     -60.71806     -58.92132   0.364   0.750     0   0.256     3 div,imm=3,dup=10,s
   3    24     -59.74359     -58.92132   0.303   0.667     1   0.193     3 dup=5,share
   4    24     -59.91321     -58.92132   0.333   0.667     2   0.177     3 dup=10,share
   5    24     -62.38514     -58.92132   0.409   0.667     3   0.165     3 dup=13,share
INFO: TimerStopping callback met its criteria
INFO: Stopping the algorithm

[5]:

GASearchCV(crossover_probability=<sklearn_genetic.schedules.schedulers.ExponentialAdapter object at 0x000002ACC9B397F0>,
           cv=KFold(n_splits=4, random_state=42, shuffle=True),
           diversity_control=True, diversity_mutation_boost=1.8,
           diversity_stagnation_generations=3, diversity_threshold=0.3,
           estimator=Pipeline(steps=[('scaler', StandardScaler()),
                                     ('regressor',
                                      Gradi...
                                                                   'regressor__min_samples_leaf': 4,
                                                                   'regressor__n_estimators': 100,
                                                                   'regressor__subsample': 0.85}]),
           population_size=12, return_train_score=True,
           runtime_config=RuntimeConfig(n_jobs=-1,
                                        pre_dispatch='2*n_jobs',
                                        error_score=nan,
                                        return_train_score=False,
                                        use_cache=True,
                                        parallel_backend='auto',
                                        verbose=True),
           scoring='neg_root_mean_squared_error', sharing_radius=0.4)
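
The configuration above passes adapters for crossover_probability and mutation_probability, each defined by initial_value, end_value, and adaptive_rate. As a rough sketch of how such schedules behave, assuming a standard exponential decay and an inverse decay toward end_value, the functions below are illustrative stand-ins. They are not the library's implementation, and its exact formulas and generation indexing may differ.

```python
import math


def exponential_decay(initial, end, rate, generation):
    """Assumed form: decays exponentially from `initial` toward `end`."""
    return (initial - end) * math.exp(-rate * generation) + end


def inverse_decay(initial, end, rate, generation):
    """Assumed form: decays as 1 / (1 + rate * t) from `initial` toward `end`."""
    return (initial - end) / (1 + rate * generation) + end


# Same hyperparameters as the adapters in the search configuration.
crossover = [exponential_decay(0.8, 0.4, 0.15, g) for g in range(6)]
mutation = [inverse_decay(0.25, 0.08, 0.25, g) for g in range(6)]

for g, (cx, mut) in enumerate(zip(crossover, mutation)):
    print(f"gen {g}: crossover={cx:.3f} mutation={mut:.3f}")
```

Under these assumptions both probabilities start at initial_value and shrink toward end_value, so early generations explore more and later generations change candidates less.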

Evaluate Predictions

GASearchCV refits the best pipeline, so you can call predict directly on the search object.

[6]:

print("Best CV negative RMSE:", round(search.best_score_, 4))
print("Best parameters:")
pprint(search.best_params_)

ga_metrics = regression_metrics(search, X_test, y_test)
pd.DataFrame([baseline_metrics, ga_metrics], index=["baseline", "ga_pipeline"])

Best CV negative RMSE: -58.8192
Best parameters:
{'regressor__learning_rate': 0.0875444183193989,
 'regressor__loss': 'squared_error',
 'regressor__max_depth': 1,
 'regressor__min_samples_leaf': 9,
 'regressor__n_estimators': 91,
 'regressor__subsample': 0.7276569516060816}

[6]:

	r2	rmse	mae
baseline	0.430319	55.455523	44.717961
ga_pipeline	0.498775	52.017011	41.412206
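
To read the table as relative changes, the values can be compared directly. This is a small arithmetic sketch using the numbers printed above; the variable names are illustrative.

```python
# Holdout metrics copied from the comparison table.
baseline = {"r2": 0.430319, "rmse": 55.455523, "mae": 44.717961}
ga_pipeline = {"r2": 0.498775, "rmse": 52.017011, "mae": 41.412206}

# Error metrics: relative reduction (positive means the tuned pipeline is better).
rmse_reduction = (baseline["rmse"] - ga_pipeline["rmse"]) / baseline["rmse"]
mae_reduction = (baseline["mae"] - ga_pipeline["mae"]) / baseline["mae"]

# R2 is already on a bounded scale, so an absolute difference is easier to read.
r2_gain = ga_pipeline["r2"] - baseline["r2"]

print(f"RMSE reduction: {rmse_reduction:.1%}")  # 6.2%
print(f"MAE reduction:  {mae_reduction:.1%}")   # 7.4%
print(f"R2 gain:        {r2_gain:+.3f}")        # +0.068
```

These figures come from a single train/test split, so they describe this run and not a general expectation.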

Inspect Search Cost and Telemetry

fit_stats_ summarizes how candidates were evaluated: cross-validation calls, cache hits, duplicates, and parallel batches. history stores generation-level telemetry, including diversity and stagnation fields.

[7]:

search.fit_stats_

[7]:

{'evaluated_candidates': 134,
 'unique_candidates': 133,
 'cross_validate_calls': 133,
 'cache_hits': 1,
 'duplicate_candidates': 0,
 'skipped_invalid_candidates': 0,
 'population_parallel_batches': 7,
 'population_serial_batches': 0,
 'random_immigrants': 3,
 'local_refinement_candidates': 2}
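
A quick way to turn these counters into a cost estimate is sketched below. It uses the values printed above and assumes that each cross_validate call fits one model per fold, which is how sklearn cross-validation normally behaves; the final refit is not counted.

```python
# Counters copied from search.fit_stats_ above.
fit_stats = {
    "evaluated_candidates": 134,
    "unique_candidates": 133,
    "cross_validate_calls": 133,
    "cache_hits": 1,
}
n_splits = 4  # KFold(n_splits=4) from the setup cell

# Share of evaluations answered from the cache instead of being recomputed.
cache_hit_rate = fit_stats["cache_hits"] / fit_stats["evaluated_candidates"]

# Assumption: each cross_validate call fits one model per fold.
cv_model_fits = fit_stats["cross_validate_calls"] * n_splits

print(f"cache hit rate: {cache_hit_rate:.2%}")
print(f"approximate model fits during search: {cv_model_fits}")
```

Here the evaluated candidates equal the cross-validation calls plus the cache hits (133 + 1 = 134), so only one evaluation was saved by the cache in this run.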

[8]:

history = pd.DataFrame(search.history)
telemetry_columns = [
    "gen",
    "fitness",
    "fitness_max",
    "fitness_std",
    "unique_individual_ratio",
    "genotype_diversity",
    "stagnation_generations",
    "best_generation",
]
history[[column for column in telemetry_columns if column in history.columns]].tail()

[8]:

	gen	fitness	fitness_max	fitness_std	unique_individual_ratio	genotype_diversity	stagnation_generations	best_generation
1	1	-60.658669	-59.955816	0.461462	0.666667	0.287879	1	0
2	2	-60.718057	-58.921317	0.785379	0.750000	0.363636	0	2
3	3	-59.743591	-59.106876	0.587545	0.666667	0.303030	1	2
4	4	-59.913208	-59.106876	0.596168	0.666667	0.333333	2	2
5	5	-61.326108	-58.819157	1.757950	0.750000	0.409091	0	5
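
The relationship between fitness_max and stagnation_generations can be reproduced by hand. The sketch below assumes that stagnation counts consecutive generations without a strict improvement in the best fitness seen so far; that rule is an assumption, though it is consistent with the table above. The generation 0 value is taken from the verbose log because tail() omits that row.

```python
from itertools import accumulate

# Per-generation best fitness. Generation 0 comes from the verbose log;
# generations 1-5 come from the fitness_max column shown above.
fitness_max = [-59.23358, -59.955816, -58.921317, -59.106876, -59.106876, -58.819157]

# Running maximum: the best fitness seen up to each generation.
best_so_far = list(accumulate(fitness_max, max))

# Count consecutive generations without a strict improvement.
stagnation = []
count = 0
for previous_best, current_best in zip([None] + best_so_far[:-1], best_so_far):
    if previous_best is None or current_best > previous_best:
        count = 0
    else:
        count += 1
    stagnation.append(count)

for gen, (best, stag) in enumerate(zip(best_so_far, stagnation)):
    print(gen, round(best, 4), stag)
```

For generations 1 to 5 this yields stagnation counts of 1, 0, 1, 2, 0, matching the stagnation_generations column.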

Visualize the Search

The plotting helpers work directly with fitted search objects. Use them for quick inspection, then rely on history and cv_results_ when you need custom reporting.

[9]:

plot_fitness_evolution(search)

[9]:

<Axes: title={'center': 'Best fitness so far'}, xlabel='generations', ylabel='fitness (score)'>

[Figure: best fitness so far, fitness (score) versus generations]

[10]:

plot_search_space(search, features=["regressor__learning_rate", "regressor__max_depth"])

[10]:

<seaborn.axisgrid.PairGrid at 0x2accadc3770>

[Figure: pair grid of regressor__learning_rate and regressor__max_depth]

Practical Notes

Use pipeline parameter names exactly as sklearn expects them.
For error metrics such as RMSE, where lower is better, use sklearn’s negated scorers such as neg_root_mean_squared_error so that maximizing the score minimizes the error.
Compare holdout metrics, not only CV fitness.
If the search revisits many candidates, inspect cache_hits and consider stronger diversity controls or a larger search space.
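
The note on negated scorers can be checked with a small standalone sketch. It uses a DummyRegressor on made-up data (both are assumptions for illustration) to show that the scorer returns the negative of the RMSE, which is why the search above uses criteria="max".

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import get_scorer

# Toy data: the dummy model always predicts the mean of y, which is 2.5.
X = np.zeros((4, 1))
y = np.array([1.0, 2.0, 3.0, 4.0])

model = DummyRegressor(strategy="mean").fit(X, y)
scorer = get_scorer("neg_root_mean_squared_error")

score = scorer(model, X, y)  # negative: higher (closer to zero) is better
rmse = -score                # flip the sign to report the error itself
print(f"score={score:.4f} rmse={rmse:.4f}")
# prints: score=-1.1180 rmse=1.1180
```

The same sign flip applies to best_score_: a best CV score of -58.8192 corresponds to a cross-validated RMSE of 58.8192.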
