[Models] Return and store models parameters during forecast and CV #639
Comments
Hey @vspinu, thanks for using statsforecast. The forecast method is designed to be more memory efficient by returning only the forecasted values. If you're interested in seeing the models' attributes you should use fit + predict. For CV it's the same case: it's designed to just return the forecasts in order to evaluate the models' performance. If you want the attributes you can also compute the splits manually and run fit + predict for each fold. I'll take a look at what we can do to allow you to save the fitting and forecasting times.
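For reference, here is a minimal sketch of what "compute the splits manually and run fit + predict for each fold" could look like. It is only an illustration, not an existing statsforecast helper: the horizon, number of windows, expanding-window scheme, and monthly frequency are all assumptions.

```python
import pandas as pd
from statsforecast import StatsForecast
from statsforecast.models import AutoETS
from statsforecast.utils import AirPassengersDF

h = 12          # forecast horizon per fold (assumed)
n_windows = 3   # number of CV folds (assumed)
df = AirPassengersDF.sort_values('ds')

fold_forecasts = []
fold_params = []
for i in range(n_windows, 0, -1):
    cutoff = len(df) - i * h
    train = df.iloc[:cutoff]
    sf = StatsForecast(models=[AutoETS(season_length=12)], freq='M')
    sf.fit(df=train)
    # fitted models stay available after fit, so their parameters can be inspected or stored
    fold_params.append(sf.fitted_[0, 0].model_)
    fold_forecasts.append(sf.predict(h=h))

cv_forecasts = pd.concat(fold_forecasts)
```

Each fold's forecasts can then be scored against the corresponding held-out observations, while the per-fold fitted parameters remain accessible.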
Thanks @jmoralez. Fit + predict is surely an option, but it would require fitting the models twice. Also, there are some implementation differences between fit + predict and forecast (e.g. progress bar, fallback model). I wonder if some consistent abstraction of parameters is warranted more generally. Is there currently a way to fit, say, AutoETS, retrieve and store the parameters without storing the AutoETS object itself, and finally recreate the AutoETS from the parameters?
Why would you need to fit the models twice? In your use case you said you wanted to inspect the parameters of the fitted models, which only requires fitting once. About restoring a model: the parameters vary a lot between the different models and we currently don't have a consistent way to save/retrieve them, but I think it's something we could have on the roadmap. Depending on how you're using the library, you currently have a couple of options:
```python
from statsforecast import StatsForecast
from statsforecast.models import AutoETS
from statsforecast.utils import AirPassengersDF

# first fit finds the best model type
sf = StatsForecast(models=[AutoETS(season_length=12)], freq='D')
sf.fit(df=AirPassengersDF)
# fitted_ is of shape n_series, n_models
learned_model = sf.fitted_[0, 0].model_['components']
single_ets = AutoETS(season_length=12, model=learned_model[:3], damped=learned_model[3] != 'N')
single_ets.fit(AirPassengersDF['y'].values)
forecasts = single_ets.predict(h=12, level=[80])
```
```python
from statsforecast import StatsForecast
from statsforecast.ets import ets_f, forecast_ets
from statsforecast.models import AutoETS
from statsforecast.utils import AirPassengersDF

# first fit finds the best model type
sf = StatsForecast(models=[AutoETS(season_length=12)], freq='D')
sf.fit(df=AirPassengersDF)
# fitted_ is of shape n_series, n_models
fitted_model = sf.fitted_[0, 0].model_
# use the learned params & state
learned_params = {k: v for k, v in fitted_model.items() if k in ('components', 'par', 'm', 'fit', 'n_params')}
single_ets = ets_f(AirPassengersDF['y'].values, m=12, model=learned_params)
forecasts = forecast_ets(single_ets, h=12, level=[80])
```

Please let us know if this helps.
Once for forecasting and once to get the parameters from the fitted models.
It does, but both approaches require dealing with internals to some extent and are not exactly "user-friendly" or generic. Given the huge number of time series in real-life scenarios, one would ideally be able to store the parameters in a database and re-create the models on the fly in prediction or monitoring applications. In any case, not a big deal. Feel free to close this one if it's not considered of great importance.
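As an illustration of the workflow described here (not an existing statsforecast API; the table layout and helper names are invented for the example), the fitted `model_` dict of each series could be serialized and stored externally, then loaded back on the fly by a prediction or monitoring service:

```python
import pickle
import sqlite3

def store_params(conn: sqlite3.Connection, series_id: str, model_dict: dict) -> None:
    # model_dict would be e.g. sf.fitted_[i, 0].model_, a plain dict of arrays and scalars
    conn.execute(
        'CREATE TABLE IF NOT EXISTS model_params (series_id TEXT PRIMARY KEY, params BLOB)'
    )
    conn.execute(
        'INSERT OR REPLACE INTO model_params VALUES (?, ?)',
        (series_id, pickle.dumps(model_dict)),
    )
    conn.commit()

def load_params(conn: sqlite3.Connection, series_id: str) -> dict:
    row = conn.execute(
        'SELECT params FROM model_params WHERE series_id = ?', (series_id,)
    ).fetchone()
    return pickle.loads(row[0])
```

The loaded dict could then be fed back through the lower-level route shown above (e.g. `ets_f` / `forecast_ets`) to produce forecasts without keeping the original model objects around.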
Sorry for the confusion, the overview is:
I agree with you on the second point. We're working towards making deployments easier and more efficient; as a first step we're trying to reduce the dependencies so that the size of the library is smaller (#509, #596, #631). We can address having a way to easily save/load models as a next step.
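In the meantime, a rough stopgap (assuming the fitted StatsForecast object is picklable, which is an assumption rather than a documented guarantee) is to persist the whole fitted object:

```python
import pickle

from statsforecast import StatsForecast
from statsforecast.models import AutoETS
from statsforecast.utils import AirPassengersDF

sf = StatsForecast(models=[AutoETS(season_length=12)], freq='M')
sf.fit(df=AirPassengersDF)

# persist the entire fitted object (heavier than storing only the parameters)
with open('sf.pkl', 'wb') as f:
    pickle.dump(sf, f)

# later, e.g. in a prediction service
with open('sf.pkl', 'rb') as f:
    sf_restored = pickle.load(f)

forecasts = sf_restored.predict(h=12, level=[80])
```

This trades storage size for simplicity compared with storing only the parameters.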
Description
Currently the models' fit during forecasting and cross-validation is lost. It would be nice to have a way to preserve the optimal parameters of the models.

One way to implement this is to make the forecast method return the fitted parameters along with other metadata. For example, it could be a `meta` slot of the `results` objects, kept separate from the vector outputs (`cols_m`, `fitted`, `mean`, etc.). The same `meta` slot could be used for internal metadata, for example the time taken for fitting/forecasting per model, a very useful comparison metric which, to the best of my knowledge, is not easy to retrieve in the current setup.
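To make the proposal concrete, here is a purely hypothetical sketch of what such a return value could look like (none of this is the current statsforecast API; the `meta`, `params`, `fit_time` and `forecast_time` names and all values are invented for illustration):

```python
import numpy as np

# hypothetical output of a forecast call that also returns a `meta` slot
result = {
    # vector outputs, as today
    'cols_m': ['AutoETS', 'AutoETS-lo-80', 'AutoETS-hi-80'],
    'mean': np.zeros((12, 3)),      # placeholder forecast values
    'fitted': np.zeros((144, 3)),   # placeholder in-sample fitted values
    # proposed non-vector metadata, one entry per model
    'meta': {
        'AutoETS': {
            'params': {'components': ['M', 'A', 'M', 'N']},  # fitted parameters per series
            'fit_time': 0.42,        # seconds spent fitting
            'forecast_time': 0.01,   # seconds spent forecasting
        },
    },
}
```

Such a slot could be persisted to a database and used both to recreate the models later and to compare fitting/forecasting costs across models.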
Use case