YfromX

YfromX(estimator, pooling='local')

Simple reduction predicting endogenous from concurrent exogenous variables.

Tabulates all seen X and y by time index and applies tabular supervised regression.

In fit, given endogenous time series y and exogenous X: fits estimator to feature-label pairs defined as follows.

features: \(X(t)\), labels: \(y(t)\), ranging over all \(t\) where both have been observed (are in the index)

In predict, at a time \(t\) in the forecasting horizon, uses estimator to predict \(y(t)\) from features \(X(t)\).
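The fit and predict steps described above can be sketched by hand with pandas and sklearn. This is a minimal illustration of the idea of regressing \(y(t)\) on concurrent \(X(t)\), not the sktime implementation; the toy data, the column name `x`, and the variable names are all hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: X is known for t = 0..5, y only for t = 0..3.
X = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=range(6))
y = pd.Series([3.0, 5.0, 7.0, 9.0], index=range(4), name="y")

# fit: tabulate the time points where both y(t) and X(t) are observed,
# then fit a tabular regressor on the resulting feature-label pairs.
common = y.index.intersection(X.index)
reg = LinearRegression().fit(X.loc[common], y.loc[common])

# predict: apply the fitted regressor to X(t) for each t in the horizon.
fh = [4, 5]
y_pred = pd.Series(reg.predict(X.loc[fh]), index=fh, name="y")
```

Note that the forecast at time \(t\) needs \(X(t)\) itself, so exogenous values must be available over the whole forecasting horizon.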

If estimator is an skpro probabilistic regressor and has predict_interval etc, the corresponding probabilistic methods of the forecaster use estimator to predict \(y(t)\) from features \(X(t)\), passing on the predict_interval etc arguments.

If no exogenous data is provided, will predict the mean of y seen in fit.
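As a rough sketch of this fallback, assuming it amounts to a constant forecast at the training mean (the data and names below are hypothetical, and this is not the library's internal code):

```python
import pandas as pd

# Hypothetical training series and forecasting horizon.
y = pd.Series([2.0, 4.0, 6.0], index=[0, 1, 2], name="y")
fh = [3, 4]

# Without exogenous data there are no features to regress on,
# so every horizon point receives the mean of y seen in fit.
y_pred = pd.Series(y.mean(), index=fh, name="y")
```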

To fit on only part of the historical data and refit periodically, combine this with UpdateRefitsEvery.

In order to deal with missing data, combine this with Imputer.

To construct a custom direct reducer, combine with YtoX, Lag, or ReducerTransform.

Parameters:

estimator : sklearn regressor or skpro probabilistic regressor
    Tabular regression algorithm used in the reduction; must be compatible with the sklearn or skpro interface.
    If an skpro regressor is passed, the resulting forecaster will have probabilistic capability.
pooling : str, one of ["local", "global", "panel"], optional, default="local"
    Level on which data are pooled to fit the supervised regression model.
    "local" = unit/instance level, one reduced model per lowest hierarchy level.
    "global" = top level, one reduced model overall, on pooled data ignoring levels.
    "panel" = second lowest level, one reduced model per panel level (-2).
    If there are 2 or fewer levels, "global" and "panel" give the same result.
    If there is only 1 level (single time series), all three settings agree.
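The difference between "local" and "global" pooling can be illustrated by hand on a two-level panel. The sketch below uses plain pandas and sklearn with hypothetical data and names; it mimics the pooling idea only and does not call sktime.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical panel: two instances ("a", "b"), three time points each.
index = pd.MultiIndex.from_product(
    [["a", "b"], [0, 1, 2]], names=["instance", "time"]
)
X = pd.DataFrame({"x": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]}, index=index)
y = pd.Series([2.0, 4.0, 6.0, 10.0, 20.0, 30.0], index=index, name="y")

# "local" pooling: one regression model per instance.
local_models = {
    name: LinearRegression().fit(X.loc[name], y.loc[name])
    for name in index.get_level_values("instance").unique()
}

# "global" pooling: a single model fitted on all rows, ignoring the levels.
global_model = LinearRegression().fit(X, y)

local_slopes = {name: m.coef_[0] for name, m in local_models.items()}
global_slope = global_model.coef_[0]
```

Here each local model recovers its own instance's slope, while the global model fits one compromise slope across both instances. With only two index levels, as in this sketch, "panel" pooling coincides with "global".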

Examples

>>> from sktime.datasets import load_longley
>>> from sktime.split import temporal_train_test_split
>>> from sktime.forecasting.compose import YfromX
>>> from sklearn.linear_model import LinearRegression
>>>
>>> y, X = load_longley()
>>> y_train, y_test, X_train, X_test = temporal_train_test_split(y, X)
>>> fh = y_test.index
>>>
>>> f = YfromX(LinearRegression())
>>> f.fit(y=y_train, X=X_train, fh=fh)
YfromX(...)
>>> y_pred = f.predict(X=X_test)

YfromX can also be used with skpro probabilistic regressors; in this case the resulting forecaster will be capable of probabilistic forecasts:

>>> from skpro.regression.residual import ResidualDouble  # doctest: +SKIP
>>> reg_proba = ResidualDouble(LinearRegression())  # doctest: +SKIP
>>> f = YfromX(reg_proba)  # doctest: +SKIP
>>> f.fit(y=y_train, X=X_train, fh=fh)  # doctest: +SKIP
YfromX(...)
>>> y_pred = f.predict_interval(X=X_test)  # doctest: +SKIP