YfromX
- YfromX(estimator, pooling='local')
Simple reduction predicting endogenous from concurrent exogenous variables.
Tabulates all seen X and y by time index and applies tabular supervised regression.
In fit, given endogenous time series y and exogenous X: fits estimator to feature-label pairs defined as follows: features = \(X(t)\), labels = \(y(t)\), ranging over all \(t\) where both have been observed (are in the index).
In predict, at a time \(t\) in the forecasting horizon, uses estimator to predict \(y(t)\) from features \(X(t)\).
If the regressor is an skpro probabilistic regressor and has predict_interval etc., uses estimator to predict \(y(t)\) from features \(X(t)\), passing on the predict_interval etc. arguments.
If no exogenous data is provided, will predict the mean of y seen in fit.
To use a fit that is not on the entire historical data and is updated periodically, combine this with UpdateRefitsEvery.
To deal with missing data, combine this with Imputer.
To construct a custom direct reducer, combine with YtoX, Lag, or ReducerTransform.
- Parameters:
- estimator : sklearn regressor or skpro probabilistic regressor, must be compatible with sklearn or skpro interface
tabular regression algorithm used in reduction algorithm. If skpro regressor, resulting forecaster will have probabilistic capability.
- pooling : str, one of ["local", "global", "panel"], optional, default="local"
level on which data are pooled to fit the supervised regression model
"local" = unit/instance level, one reduced model per lowest hierarchy level
"global" = top level, one reduced model overall, on pooled data ignoring levels
"panel" = second lowest level, one reduced model per panel level (-2)
if there are 2 or fewer levels, "global" and "panel" result in the same; if there is only 1 level (single time series), all three settings agree
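The difference between the pooling levels can be pictured with plain scikit-learn, outside of sktime. The following is a minimal sketch under assumptions, not how YfromX is implemented: the toy data, the two-level (instance, time) index, and the names local_models and global_model are all made up for illustration. With only two index levels, "panel" would coincide with "global", so only "local" and "global" are shown.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy panel: two instances "A" and "B" with different slopes,
# stored with an (instance, time) MultiIndex.
index = pd.MultiIndex.from_product(
    [["A", "B"], range(5)], names=["instance", "time"]
)
X = pd.DataFrame({"x": np.tile(np.arange(5.0), 2)}, index=index)
slope = np.repeat([1.0, 3.0], 5)  # y = 1*x for A, y = 3*x for B
y = pd.Series(slope * X["x"].to_numpy(), index=index)

# "local"-style pooling: one regressor per instance (lowest hierarchy level).
local_models = {
    name: LinearRegression().fit(X.loc[name].to_numpy(), y.loc[name].to_numpy())
    for name in index.get_level_values("instance").unique()
}

# "global"-style pooling: one regressor on all rows, ignoring the instance level.
global_model = LinearRegression().fit(X.to_numpy(), y.to_numpy())

local_slopes = {name: float(m.coef_[0]) for name, m in local_models.items()}
global_slope = float(global_model.coef_[0])
```

In this toy setup each local model recovers its own instance's slope, while the single global model fits one compromise slope across both instances.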
Examples
>>> from sktime.datasets import load_longley
>>> from sktime.split import temporal_train_test_split
>>> from sktime.forecasting.compose import YfromX
>>> from sklearn.linear_model import LinearRegression
>>>
>>> y, X = load_longley()
>>> y_train, y_test, X_train, X_test = temporal_train_test_split(y, X)
>>> fh = y_test.index
>>>
>>> f = YfromX(LinearRegression())
>>> f.fit(y=y_train, X=X_train, fh=fh)
YfromX(...)
>>> y_pred = f.predict(X=X_test)
YfromX can also be used with skpro probabilistic regressors; in this case the resulting forecaster will be capable of probabilistic forecasts:
>>> from skpro.regression.residual import ResidualDouble  # doctest: +SKIP
>>> reg_proba = ResidualDouble(LinearRegression())  # doctest: +SKIP
>>> f = YfromX(reg_proba)  # doctest: +SKIP
>>> f.fit(y=y_train, X=X_train, fh=fh)  # doctest: +SKIP
YfromX(...)
>>> y_pred = f.predict_interval(X=X_test)  # doctest: +SKIP
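To make the reduction itself concrete, the fit and predict steps described above can be sketched with pandas and scikit-learn alone. This is an illustrative approximation, not the sktime implementation: the helper names fit_reduction and predict_reduction, the state dictionary, and the toy data are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: y is an exact linear function of two exogenous columns.
idx = pd.RangeIndex(8)
X = pd.DataFrame({"a": np.arange(8.0), "b": np.arange(8.0) ** 2}, index=idx)
y = 2.0 * X["a"] - 0.5 * X["b"] + 1.0

X_train, X_test = X.iloc[:6], X.iloc[6:]
y_train = y.iloc[:6]


def fit_reduction(regressor, y, X=None):
    """Fit step: tabulate X and y by shared time index, then fit the regressor."""
    if X is None:
        # No exogenous data: remember only the mean of y seen in fit.
        return {"regressor": None, "y_mean": float(y.mean())}
    common = y.index.intersection(X.index)  # times where both are observed
    regressor.fit(X.loc[common].to_numpy(), y.loc[common].to_numpy())
    return {"regressor": regressor, "y_mean": float(y.mean())}


def predict_reduction(state, fh_index, X=None):
    """Predict step: y(t) from X(t) for each t in the forecasting horizon."""
    if state["regressor"] is None or X is None:
        # Fall back to the mean of y seen in fit.
        return pd.Series(state["y_mean"], index=fh_index)
    values = state["regressor"].predict(X.loc[fh_index].to_numpy())
    return pd.Series(values, index=fh_index)


# With exogenous data: features X(t), labels y(t).
state = fit_reduction(LinearRegression(), y_train, X_train)
y_pred = predict_reduction(state, X_test.index, X_test)

# Without exogenous data: constant forecast at the training mean.
mean_state = fit_reduction(LinearRegression(), y_train)
y_mean_pred = predict_reduction(mean_state, X_test.index)
```

Because the toy y is exactly linear in the two columns, the regression forecast reproduces the held-out values, while the forecast without X is flat at the training mean.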