
Benchmarking Time Series Models

Note: This post is NOT financial advice! This is just a fun way to explore some of the capabilities R has for importing and manipulating data.

This is a quick post on the importance of benchmarking time-series forecasts. First, we need to reload the functions from my last few posts on time-series cross-validation. (I copied the relevant code at the bottom of this post so you don’t have to go find it.)

devtools::install_github('zachmayer/cv.ts')

Next, we need to load data for the S&P 500. To simplify things, and allow us to explore seasonality effects, I’m going to load monthly data, back to 1980.

#Setup
set.seed(1)
library(quantmod)
library(forecast)

#Load data
getSymbols('^GSPC', from='1980-01-01')
#> [1] "GSPC"

#Simplify to monthly level
GSPC <- to.monthly(GSPC)
Data <- ts(as.numeric(Cl(GSPC)), start=c(1980, 1), frequency=12)
Data <- window(Data, end=c(2023, 12)) #truncate at the end of 2023

The object “Data” has monthly closing prices for the S&P 500 back to 1980. Next, we cross-validate 3 time-series forecasting models: auto.arima, from the forecast package; a mean forecast, which returns the mean value over the last year; and a naive forecast, which assumes the next value of the series will be equal to the present value. These last 2 forecasts serve as benchmarks, to help determine whether auto.arima would be useful for forecasting the S&P 500. Also note that I’m using BIC as the criterion for selecting arima models, and I have trace turned on so you can see the results of the model selection process.
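For intuition, here's roughly what the two benchmarks do, sketched with the forecast package's own meanf and naive functions (the cv.ts wrappers meanForecast and naiveForecast behave analogously inside the cross-validation loop):

#Rough sketch of the two benchmarks (not the cv.ts internals):
#the mean forecast averages the last 12 observations,
#the naive forecast carries the last observed value forward.
meanf(window(Data, start=c(2023, 1)), h=12) #mean of the last year
naive(Data, h=12)                           #last value, repeated 12 times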

library(cv.ts)

#Setup model cross-validation
myControl <- tseriesControl(
  minObs=12,
  stepSize=1,
  maxHorizon=12,
  fixedWindow=TRUE,
  preProcess=FALSE,
  summaryFunc=tsSummary
)

#Cross validate 3 models (model 3 is SLOW!)
model1 <- cv.ts(Data, meanForecast, myControl)
model2 <- cv.ts(Data, naiveForecast, myControl)
model3 <- cv.ts(Data, auto.arimaForecast, myControl, ic='bic', trace=TRUE)

#Find the RMSE for each model and create a matrix with our results
models <- list(model1, model2, model3)
models <- lapply(models, function(x) x[1:12,'RMSE'])
results <- do.call(cbind,models)
colnames(results) <- c('mean','naive','ar')

#Order by average RMSE for the 1st 3 months
results <- t(results)
avgRMSE <- rowMeans(results[,1:3])
results <- results[order(avgRMSE),]
print(results)

After the 3 models finish cross-validating, it is useful to plot their forecast errors at different horizons. As you can see, auto.arima performs much better than the mean model, but is consistently worse than the naive model. This illustrates the importance of benchmarking forecasts. If you can’t consistently beat a naive forecast, there’s no reason to waste processing power on a useless model.
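Here’s a minimal sketch of one way to draw that plot, using base graphics on the results matrix built above (the exact chart you produce may look a bit different):

#Plot cross-validated RMSE by forecast horizon, one line per model
matplot(t(results), type='l', lty=1, col=1:3,
        xlab='Forecast horizon (months)', ylab='RMSE',
        main='Cross-validated forecast error, S&P 500')
legend('topleft', legend=rownames(results), col=1:3, lty=1)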

Finally, note that you can parallelize the cv.ts function by registering your favorite foreach backend.
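For example, with the doParallel backend (one of several foreach backends, assuming it is installed):

#Register a parallel backend, then re-run the slow model;
#cv.ts's internal foreach loop will now run across the registered workers
library(doParallel)
registerDoParallel(cores=2)
model3 <- cv.ts(Data, auto.arimaForecast, myControl, ic='bic')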
