SklearnClassifierPipeline

class SklearnClassifierPipeline(classifier, transformers)

Pipeline of transformers and a classifier.

The SklearnClassifierPipeline chains transformers and a single classifier.

Similar to ClassifierPipeline, but uses a tabular sklearn classifier.

The pipeline is constructed with a list of sktime transformers plus a classifier, i.e., the transformers follow the BaseTransformer interface and the classifier follows the scikit-learn classifier interface.

The transformer list can be unnamed (a simple list of transformers) or string named (a list of (str, estimator) pairs).

For a list of transformers trafo1, trafo2, …, trafoN and a classifier clf, the pipeline behaves as follows:

fit(X, y) - changes state by running trafo1.fit_transform on X, then trafo2.fit_transform on the output of trafo1.fit_transform, and so on sequentially, with trafo[i] receiving the output of trafo[i-1]. It then runs clf.fit with X being the output of trafo[N] converted to numpy, and y identical to the input of self.fit. X is converted to the numpyflat mtype if X is of Panel scitype, and to the numpy2D mtype if X is of Table scitype.

predict(X) - executes trafo1.transform, trafo2.transform, etc., with trafo[i].transform receiving the output of trafo[i-1].transform, then runs clf.predict on the numpy-converted output of trafoN.transform and returns the output of clf.predict. The output of trafoN.transform is converted to numpy, as in fit.

predict_proba(X) - executes trafo1.transform, trafo2.transform, etc., with trafo[i].transform receiving the output of trafo[i-1].transform, then runs clf.predict_proba on the output of trafoN.transform and returns the output of clf.predict_proba. The output of trafoN.transform is converted to numpy, as in fit.

get_params, set_params - use the sklearn-compatible nesting interface. If the list is unnamed, names are generated as the names of the classes; if names are non-unique, f"_{str(i)}" is appended to each name string.
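The chaining described above can be sketched with plain numpy. This is a simplified illustration, not sktime's implementation: the classes Demean, Summarize and NearestCentroid and the helpers to_numpyflat, pipeline_fit and pipeline_predict are hypothetical stand-ins, and the panel is assumed to be a 3D array of shape (instances, variables, time points).

```python
import numpy as np


class Demean:
    """Toy transformer (hypothetical): subtracts the mean learned during fit."""

    def fit_transform(self, X, y=None):
        self.mean_ = X.mean(axis=(0, 2), keepdims=True)
        return X - self.mean_

    def transform(self, X):
        return X - self.mean_


class Summarize:
    """Toy transformer (hypothetical): reduces each series to mean and std."""

    def fit_transform(self, X, y=None):
        return self.transform(X)

    def transform(self, X):
        return np.stack([X.mean(axis=2), X.std(axis=2)], axis=2)


class NearestCentroid:
    """Toy tabular classifier (hypothetical) with an sklearn-like interface."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        dist = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[dist.argmin(axis=1)]


def to_numpyflat(X):
    """Flatten a 3D panel to a 2D table with one row per instance."""
    return X.reshape(X.shape[0], -1)


def pipeline_fit(transformers, clf, X, y):
    """fit: each transformer is fitted on the output of the previous one."""
    Xt = X
    for trafo in transformers:
        Xt = trafo.fit_transform(Xt, y)
    clf.fit(to_numpyflat(Xt), y)


def pipeline_predict(transformers, clf, X):
    """predict: only transform is called, then the classifier predicts."""
    Xt = X
    for trafo in transformers:
        Xt = trafo.transform(Xt)
    return clf.predict(to_numpyflat(Xt))


rng = np.random.default_rng(0)
X = rng.normal(size=(10, 1, 20))  # 10 univariate series of length 20
X[5:] += 3.0                      # shift the second class
y = np.array([0] * 5 + [1] * 5)

trafos, clf = [Demean(), Summarize()], NearestCentroid()
pipeline_fit(trafos, clf, X, y)
y_pred = pipeline_predict(trafos, clf, X)
```

Note that the classifier only ever sees a 2D table, which is why a tabular sklearn classifier can be used at the end of a time series pipeline.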

SklearnClassifierPipeline can also be created by using the magic multiplication between sktime transformers and sklearn classifiers: if my_clf is an sklearn classifier and my_trafo1, my_trafo2 inherit from BaseTransformer, then, for instance, my_trafo1 * my_trafo2 * my_clf will result in the same object as obtained from the constructor SklearnClassifierPipeline(classifier=my_clf, transformers=[my_trafo1, my_trafo2]).

Magic multiplication can also be used with (str, transformer) pairs, as long as one element in the chain is a transformer.

Parameters

classifier : sklearn classifier, i.e., inheriting from sklearn ClassifierMixin. This is a “blueprint” classifier; its state does not change when fit is called.
transformers : list of sktime transformers, or list of tuples (str, transformer) of sktime transformers. These are “blueprint” transformers; their states do not change when fit is called.

Attributes

classifier_ : sklearn classifier, clone of the classifier passed in classifier. This clone is fitted in the pipeline when fit is called.
transformers_ : list of tuples (str, transformer) of sktime transformers. Clones of the transformers in transformers, which are fitted in the pipeline. Always in (str, transformer) format, even if transformers is just a list; strings not passed in transformers are unique generated strings. The i-th transformer in transformers_ is a clone of the i-th in transformers.

Examples

>>> from sklearn.neighbors import KNeighborsClassifier
>>> from sktime.transformations.series.exponent import ExponentTransformer
>>> from sktime.transformations.series.summarize import SummaryTransformer
>>> from sktime.datasets import load_unit_test
>>> from sktime.classification.compose import SklearnClassifierPipeline
>>> X_train, y_train = load_unit_test(split="train")
>>> X_test, y_test = load_unit_test(split="test")
>>> t1 = ExponentTransformer()
>>> t2 = SummaryTransformer()
>>> pipeline = SklearnClassifierPipeline(KNeighborsClassifier(), [t1, t2])
>>> pipeline = pipeline.fit(X_train, y_train)
>>> y_pred = pipeline.predict(X_test)

Alternative construction via dunder method:

>>> pipeline = t1 * t2 * KNeighborsClassifier()

Methods

`check_is_fitted`()	Check if the estimator has been fitted.
`clone`()	Obtain a clone of the object with same hyper-parameters.
`clone_tags`(estimator[, tag_names])	Clone/mirror tags from another estimator as dynamic override.
`create_test_instance`([parameter_set])	Construct Estimator instance if possible.
`create_test_instances_and_names`([parameter_set])	Create list of all test instances and a list of names for them.
`fit`(X, y)	Fit time series classifier to training data.
`fit_predict`(X, y[, cv, change_state])	Fit and predict labels for sequences in X.
`fit_predict_proba`(X, y[, cv, change_state])	Fit and predict label probabilities for sequences in X.
`get_class_tag`(tag_name[, tag_value_default])	Get tag value from estimator class (only class tags).
`get_class_tags`()	Get class tags from estimator class and all its parent classes.
`get_fitted_params`()	Get fitted parameters.
`get_param_defaults`()	Get parameter defaults for the object.
`get_param_names`()	Get parameter names for the object.
`get_params`([deep])	Get parameters of estimator in transformers.
`get_tag`(tag_name[, tag_value_default, …])	Get tag value from estimator class and dynamic tag overrides.
`get_tags`()	Get tags from estimator class and dynamic tag overrides.
`get_test_params`([parameter_set])	Return testing parameter settings for the estimator.
`is_composite`()	Check if the object is composite.
`load_from_path`(serial)	Load object from file location.
`load_from_serial`(serial)	Load object from serialized memory container.
`predict`(X)	Predicts labels for sequences in X.
`predict_proba`(X)	Predicts label probabilities for sequences in X.
`reset`()	Reset the object to a clean post-init state.
`save`([path])	Save serialized self to bytes-like object or to (.zip) file.
`score`(X, y)	Scores predicted labels against ground truth labels on X.
`set_params`(**kwargs)	Set the parameters of estimator in transformers.
`set_tags`(**tag_dict)	Set dynamic tags to given values.

clone()

Obtain a clone of the object with same hyper-parameters.

A clone is a different object without shared references, in post-init state. This function is equivalent to returning sklearn.clone of self. Equal in value to type(self)(**self.get_params(deep=False)).

Returns

instance of type(self), clone of self (see above)
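The equivalence stated above can be illustrated with a plain scikit-learn estimator, since the same contract applies. This is a minimal sketch; the names via_clone, via_init, same_params and is_unfitted are introduced here for illustration.

```python
from sklearn.base import clone
from sklearn.neighbors import KNeighborsClassifier

est = KNeighborsClassifier(n_neighbors=3)
est.fit([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])

# Two ways of obtaining a fresh, unfitted copy with the same hyper-parameters
via_clone = clone(est)
via_init = type(est)(**est.get_params(deep=False))

same_params = via_clone.get_params() == via_init.get_params() == est.get_params()
is_unfitted = not hasattr(via_clone, "classes_")  # fitted state is not copied
```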

get_params(deep=True)

Get parameters of estimator in transformers.

Parameters

deep : boolean, optional, default=True. If True, will return the parameters for this estimator and contained sub-objects that are estimators.

Returns

params : mapping of string to any. Parameter names mapped to their values.

check_is_fitted()

Check if the estimator has been fitted.

Raises

NotFittedError: If the estimator has not been fitted yet.

clone_tags(estimator, tag_names=None)

Clone/mirror tags from another estimator as dynamic override.

Parameters

estimator : estimator inheriting from BaseEstimator
tag_names : str or list of str, default=None. Names of tags to clone. If None, all tags in estimator are used as tag_names.

Returns

Self: Reference to self.

Notes

Changes object state by setting tag values in tag_names from estimator as dynamic tags in self.

classmethod create_test_instance(parameter_set='default')

Construct Estimator instance if possible.

Parameters

parameter_set : str, default="default". Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return the "default" set.

Returns

instance : instance of the class with default parameters

Notes

get_test_params can return dict or list of dict. This function takes first or single dict that get_test_params returns, and constructs the object with that.

classmethod create_test_instances_and_names(parameter_set='default')

Create list of all test instances and a list of names for them.

Parameters

parameter_set : str, default="default". Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return the "default" set.

Returns

objs : list of instances of cls. The i-th instance is cls(**cls.get_test_params()[i]).
names : list of str, same length as objs. The i-th element is the name of the i-th instance of obj in tests; the convention is {cls.__name__}-{i} if there is more than one instance, otherwise {cls.__name__}.

fit(X, y)

Fit time series classifier to training data.

Parameters

X : 3D np.array (any number of dimensions, equal length series)
or 2D np.array (univariate, equal length series) of shape [n_instances, series_length]
or pd.DataFrame with each column a dimension, each cell a pd.Series (any number of dimensions, equal or unequal length series)
or of any other supported Panel mtype. For the list of mtypes, see datatypes.SCITYPE_REGISTER; for specifications, see examples/AA_datatypes_and_datasets.ipynb

y : 1D np.array of int, of shape [n_instances] - class labels for fitting; indices correspond to instance indices in X

Returns

self : Reference to self.

Notes

Changes state by creating a fitted model that updates attributes ending in “_” and sets is_fitted flag to True.

fit_predict(X, y, cv=None, change_state=True) → numpy.ndarray

Fit and predict labels for sequences in X.

Convenience method to produce in-sample predictions and cross-validated out-of-sample predictions.

Writes to self, if change_state=True: sets self.is_fitted to True and sets fitted model attributes ending in “_”.

Does not update state if change_state=False.

Parameters

X : 3D np.array (any number of dimensions, equal length series)
or 2D np.array (univariate, equal length series) of shape [n_instances, series_length]
or pd.DataFrame with each column a dimension, each cell a pd.Series (any number of dimensions, equal or unequal length series)
or of any other supported Panel mtype. For the list of mtypes, see datatypes.SCITYPE_REGISTER; for specifications, see examples/AA_datatypes_and_datasets.ipynb

y : 1D np.array of int, of shape [n_instances] - class labels for fitting; indices correspond to instance indices in X

cv : None, int, or sklearn cross-validation object, optional, default=None

None : predictions are in-sample, equivalent to fit(X, y).predict(X)
cv : predictions are equivalent to fit(X_train, y_train).predict(X_test)
int : equivalent to cv=KFold(cv, shuffle=True, random_state=x), i.e., k-fold cross-validation predictions out-of-sample; random_state x is taken from self if it exists, otherwise x=None

change_state : bool, optional (default=True)

if False, will not change the state of the classifier, i.e., the fit/predict sequence is run with a copy and self does not change
if True, will fit self to the full X and y; the end state will be equivalent to running fit(X, y)

Returns

y : 1D np.array of int, of shape [n_instances] - predicted class labels. Indices correspond to instance indices in X; if cv is passed, -1 indicates entries not seen in the union of test sets.
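The cross-validated prediction semantics can be sketched with scikit-learn alone. This is an illustration of the idea under simplifying assumptions (tabular toy data, a plain KNeighborsClassifier in place of the pipeline), not the code sktime runs internally.

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
y = (X[:, 0] > 0).astype(int)

clf = KNeighborsClassifier(n_neighbors=3)
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# -1 marks instances that never appear in any test fold
y_pred = -np.ones(len(y), dtype=int)
for train_idx, test_idx in cv.split(X):
    fold_clf = clone(clf)  # fit a fresh copy so clf itself stays unfitted
    fold_clf.fit(X[train_idx], y[train_idx])
    y_pred[test_idx] = fold_clf.predict(X[test_idx])
```

With KFold every instance lands in exactly one test fold, so no -1 entries remain; splitters whose test sets do not cover all instances would leave some.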

fit_predict_proba(X, y, cv=None, change_state=True) → numpy.ndarray

Fit and predict label probabilities for sequences in X.

Convenience method to produce in-sample predictions and cross-validated out-of-sample predictions.

Parameters

X : 3D np.array (any number of dimensions, equal length series)
or 2D np.array (univariate, equal length series) of shape [n_instances, series_length]
or pd.DataFrame with each column a dimension, each cell a pd.Series (any number of dimensions, equal or unequal length series)
or of any other supported Panel mtype. For the list of mtypes, see datatypes.SCITYPE_REGISTER; for specifications, see examples/AA_datatypes_and_datasets.ipynb

y : 1D np.array of int, of shape [n_instances] - class labels for fitting; indices correspond to instance indices in X

cv : None, int, or sklearn cross-validation object, optional, default=None

None : predictions are in-sample, equivalent to fit(X, y).predict_proba(X)
cv : predictions are equivalent to fit(X_train, y_train).predict_proba(X_test)
int : equivalent to cv=KFold(int), i.e., k-fold cross-validation predictions

change_state : bool, optional (default=True)

if False, will not change the state of the classifier, i.e., the fit/predict sequence is run with a copy and self does not change
if True, will fit self to the full X and y; the end state will be equivalent to running fit(X, y)

Returns

y : 2D array of shape [n_instances, n_classes] - predicted class probabilities. 1st dimension indices correspond to instance indices in X; 2nd dimension indices correspond to possible labels (integers); the (i, j)-th entry is the predictive probability that the i-th instance is of class j.

classmethod get_class_tag(tag_name, tag_value_default=None)

Get tag value from estimator class (only class tags).

Parameters

tag_name : str. Name of tag value.
tag_value_default : any type. Default/fallback value if tag is not found.

Returns

tag_value: Value of the tag_name tag in self. If not found, returns tag_value_default.

classmethod get_class_tags()

Get class tags from estimator class and all its parent classes.

Returns

collected_tags : dict. Dictionary of tag name : tag value pairs. Collected from the _tags class attribute via nested inheritance. NOT overridden by dynamic tags set by set_tags or clone_tags.

get_fitted_params()

Get fitted parameters.

State required: requires state to be “fitted”.

Returns

fitted_params : dict of fitted parameters, keys are str names of parameters. Parameters of components are indexed as [componentname]__[paramname].

classmethod get_param_defaults()

Get parameter defaults for the object.

Returns

default_dict : dict with str keys. Keys are all parameters of cls that have a default defined in __init__; values are the defaults, as defined in __init__.

classmethod get_param_names()

Get parameter names for the object.

Returns

param_names: list of str, alphabetically sorted list of parameter names of cls

get_tag(tag_name, tag_value_default=None, raise_error=True)

Get tag value from estimator class and dynamic tag overrides.

Parameters

tag_name : str. Name of tag to be retrieved.
tag_value_default : any type, optional, default=None. Default/fallback value if tag is not found.
raise_error : bool. Whether a ValueError is raised when the tag is not found.

Returns

tag_value: Value of the tag_name tag in self. If not found, raises an error if raise_error is True, otherwise returns tag_value_default.

Raises

ValueError: if raise_error is True, i.e., if tag_name is not in self.get_tags().keys()

get_tags()

Get tags from estimator class and dynamic tag overrides.

Returns

collected_tags : dict. Dictionary of tag name : tag value pairs. Collected from the _tags class attribute via nested inheritance, then any overrides and new tags from the _tags_dynamic object attribute.

is_composite()

Check if the object is composite.

A composite object is an object which contains objects, as parameters. Called on an instance, since this may differ by instance.

Returns

composite: bool, whether self contains a parameter which is BaseObject

property is_fitted: Whether fit has been called.

classmethod load_from_path(serial)

Load object from file location.

Parameters

serial : result of ZipFile(path).open("object")

Returns

deserialized self resulting in output at path, of cls.save(path)

classmethod load_from_serial(serial)

Load object from serialized memory container.

Parameters

serial : 1st element of output of cls.save(None)

Returns

deserialized self resulting in output serial, of cls.save(None)

predict(X) → numpy.ndarray

Predicts labels for sequences in X.

Parameters

X : 3D np.array (any number of dimensions, equal length series)
or 2D np.array (univariate, equal length series) of shape [n_instances, series_length]
or pd.DataFrame with each column a dimension, each cell a pd.Series (any number of dimensions, equal or unequal length series)
or of any other supported Panel mtype. For the list of mtypes, see datatypes.SCITYPE_REGISTER; for specifications, see examples/AA_datatypes_and_datasets.ipynb

Returns

y : 1D np.array of int, of shape [n_instances] - predicted class labels; indices correspond to instance indices in X

predict_proba(X) → numpy.ndarray

Predicts label probabilities for sequences in X.

Parameters

X : 3D np.array (any number of dimensions, equal length series)
or 2D np.array (univariate, equal length series) of shape [n_instances, series_length]
or pd.DataFrame with each column a dimension, each cell a pd.Series (any number of dimensions, equal or unequal length series)
or of any other supported Panel mtype. For the list of mtypes, see datatypes.SCITYPE_REGISTER; for specifications, see examples/AA_datatypes_and_datasets.ipynb

Returns

y : 2D array of shape [n_instances, n_classes] - predicted class probabilities. 1st dimension indices correspond to instance indices in X; 2nd dimension indices correspond to possible labels (integers); the (i, j)-th entry is the predictive probability that the i-th instance is of class j.
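The layout of the returned probability array can be illustrated with a plain scikit-learn classifier standing in for the final step of the pipeline. This is a sketch on hypothetical toy data; in scikit-learn, column j corresponds to clf.classes_[j].

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
proba = clf.predict_proba(np.array([[0.05], [1.15]]))

# One row per instance, one column per class; each row sums to 1
row_sums = proba.sum(axis=1)
most_likely = clf.classes_[proba.argmax(axis=1)]
```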

reset()

Reset the object to a clean post-init state.

Equivalent to sklearn.clone but overwrites self. After a self.reset() call, self is equal in value to type(self)(**self.get_params(deep=False)).

Detailed behaviour: removes any object attributes except hyper-parameters (the arguments of __init__) and object attributes containing double-underscores, then runs __init__ with the current values of the hyper-parameters (result of get_params).

Not affected by the reset are: object attributes containing double-underscores, class and object methods, and class attributes.

save(path=None)

Save serialized self to bytes-like object or to (.zip) file.

Behaviour: if path is None, returns an in-memory serialized self; if path is a file location, stores self at that location as a zip file.

Saved files are zip files with the following contents: _metadata - contains the class of self, i.e., type(self); _obj - serialized self. This class uses the default serialization (pickle).

Parameters

path : None or file location (str or Path). If None, self is saved to an in-memory object; if a file location, self is saved to that file location. For example:

path="estimator" - a zip file estimator.zip will be made at cwd.
path="/home/stored/estimator" - a zip file estimator.zip will be stored in /home/stored/.

Returns

if path is None - in-memory serialized self
if path is file location - ZipFile with reference to the file

score(X, y) → float

Scores predicted labels against ground truth labels on X.

Parameters

X : 3D np.array (any number of dimensions, equal length series)
or 2D np.array (univariate, equal length series) of shape [n_instances, series_length]
or pd.DataFrame with each column a dimension, each cell a pd.Series (any number of dimensions, equal or unequal length series)
or of any other supported Panel mtype. For the list of mtypes, see datatypes.SCITYPE_REGISTER; for specifications, see examples/AA_datatypes_and_datasets.ipynb

y : 1D np.ndarray of int, of shape [n_instances] - class labels (ground truth); indices correspond to instance indices in X

Returns

float, accuracy score of predict(X) vs y
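Accuracy here is the fraction of instances whose predicted label equals the ground truth. A minimal sketch with hypothetical label arrays, cross-checked against scikit-learn's accuracy_score:

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical ground truth and predictions for five instances
y_true = np.array(["1", "2", "2", "1", "2"])
y_pred = np.array(["1", "2", "1", "1", "2"])

accuracy = float(np.mean(y_pred == y_true))        # 4 of 5 labels match
accuracy_sklearn = accuracy_score(y_true, y_pred)  # same quantity via sklearn
```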

set_params(**kwargs)

Set the parameters of estimator in transformers.

Valid parameter keys can be listed with get_params().

Returns

self : returns an instance of self.

set_tags(**tag_dict)

Set dynamic tags to given values.

Parameters

tag_dict : dict. Dictionary of tag name : tag value pairs.

Returns

Self: Reference to self.

Notes

Changes object state by setting tag values in tag_dict as dynamic tags in self.

classmethod get_test_params(parameter_set='default')

Return testing parameter settings for the estimator.

Parameters

parameter_set : str, default="default". Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return the "default" set. For classifiers, a "default" set of parameters should be provided for general testing, and a "results_comparison" set for comparing against previously recorded results if the general set does not produce suitable probabilities to compare against.

Returns

params : dict or list of dict, default={}. Parameters to create testing instances of the class. Each dict contains the parameters to construct an “interesting” test instance, i.e., MyClass(**params) or MyClass(**params[i]) creates a valid test instance. create_test_instance uses the first (or only) dictionary in params.