Not getting what I expect from a Cross Validation using monthly data and cutoffs #2553

Open
gvas7 opened this issue Feb 15, 2024 · 2 comments

gvas7 commented Feb 15, 2024

A lot of the data I use is only available on a monthly schedule, and my forecasts only matter by month (or quarter), so I have to work within those constraints. My data is monthly, starting at 2015-01 and ending at 2023-12.

I build a model in the following manner and get a forecast:

from prophet import Prophet

# df has columns 'ds' (month-start dates) and 'y'
model = Prophet(seasonality_mode='multiplicative')
model.fit(df)
future = model.make_future_dataframe(periods=12 * 2, freq='MS')  # 24 months ahead
forecast = model.predict(future)

I wanted to perform cross validation by month, but I am having trouble getting the result I'd like, since I have to use cutoffs. I created the cutoffs as month starts (my data uses month starts as the date), so I wrote the following:

import pandas as pd
cutoffs = pd.date_range(start='2019-01-01', end='2022-12-01', freq='MS')

This gives me what I would expect:

DatetimeIndex(['2019-01-01', '2019-02-01', '2019-03-01', '2019-04-01',
               '2019-05-01', ...
               '2022-09-01', '2022-10-01', '2022-11-01', '2022-12-01'],
              dtype='datetime64[ns]', freq='MS')

I set up my cross validation like this, since I can't use a monthly freq (months are not a constant length, per other comments I've read):

from prophet.diagnostics import cross_validation
df_cv = cross_validation(model=model, horizon='365 days', cutoffs=cutoffs)

My intention for the cross validation with these cutoffs is:

  • Use 2015-01 to 2018-12 to train
  • Try to predict ~1 year out, month by month:
    train on 2015-01 to 2018-12 and predict 2019-01 to 2019-12, then train on 2015-01 to 2019-01 and predict 2019-02 to 2020-01, etc., so that the last fold forecasts something like 2023-01 to 2023-12
  • Build a table of errors and compare.
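
To make the intended folds concrete, here is a minimal standard-library sketch of which monthly points each cutoff would cover. It assumes that a fold trains on every `ds <= cutoff` and evaluates on `cutoff < ds <= cutoff + horizon`, which is my understanding of how `cross_validation` slices the data; `month_starts` and `fold` are hypothetical helpers written only for this illustration.

```python
from datetime import date, timedelta

def month_starts(first, last):
    """Return the first day of every month from first to last, inclusive."""
    out, y, m = [], first.year, first.month
    while (y, m) <= (last.year, last.month):
        out.append(date(y, m, 1))
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)
    return out

# Same ranges as in the post: data 2015-01..2023-12, cutoffs 2019-01..2022-12.
data_dates = month_starts(date(2015, 1, 1), date(2023, 12, 1))
cutoffs = month_starts(date(2019, 1, 1), date(2022, 12, 1))
HORIZON = timedelta(days=365)

def fold(cutoff):
    # Assumed rule: train on ds <= cutoff, evaluate on cutoff < ds <= cutoff + horizon.
    train = [d for d in data_dates if d <= cutoff]
    test = [d for d in data_dates if cutoff < d <= cutoff + HORIZON]
    return train, test

for cutoff in cutoffs[:3]:
    train, test = fold(cutoff)
    print(cutoff, "train ends", train[-1], "| predicts", test[0], "to", test[-1],
          "|", len(test), "months")
```

Under that assumption, a cutoff of 2019-01-01 trains through 2019-01, so a first fold that trains through 2018-12 would need a first cutoff of 2018-12-01. A fixed 365-day window also does not always hold 12 month starts: the 2019-03-01 fold stops at 2020-02-29 and so covers only 11.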

But when I print the cross-validation frame, I get odd horizons that are sometimes a day apart:
[image: screenshot of the cross-validation output]

How do I get the CV I want if I am forced to use monthly data? Thanks!
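
One possible workaround, sketched here only as an illustration, is to ignore the day-based horizon and derive a whole-month horizon from the forecast date and the cutoff of each row. The helper `months_ahead` is hypothetical and not part of Prophet; the sample rows are made-up (cutoff, ds) pairs in the month-start format used above.

```python
from datetime import date

def months_ahead(ds, cutoff):
    """Whole calendar months between a cutoff and a forecast date."""
    return (ds.year - cutoff.year) * 12 + (ds.month - cutoff.month)

# Illustrative (cutoff, ds) pairs; a real run would take them from df_cv.
rows = [
    (date(2019, 1, 1), date(2019, 2, 1)),
    (date(2019, 2, 1), date(2019, 3, 1)),
    (date(2019, 3, 1), date(2020, 2, 1)),
]

for cutoff, ds in rows:
    # The day gap varies with month length; the month gap does not.
    print(cutoff, ds, (ds - cutoff).days, "days,", months_ahead(ds, cutoff), "months")
```

The first two rows are 31 and 28 days apart but both count as one month ahead, so errors could be grouped by this month count instead of by days. On a pandas frame with `ds` and `cutoff` columns, the same arithmetic should be expressible with the `.dt.year` and `.dt.month` accessors.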

priamai commented May 12, 2024

I have exactly the same problem!

@asadwecr

Hi, I face the same issue. Is there any resolution on this? Many thanks in advance!
