Using X12-ARIMA with R

Statisticians and econometricians sometimes need a batch of time series forecasts. A common solution is X12-ARIMA, public domain software developed by the U.S. Census Bureau, mainly for detrending and deseasonalizing data. It fits your univariate time series to a filtered (seasonal) ARIMA model, where the best model is chosen according to certain model selection criteria. These criteria are similar to, but slightly different from, those found in the TRAMO procedure, notably in outlier detection and the preference for logarithms. A lot of information on TRAMO can be found here. When using X12, one can also apply likelihood-based decision criteria.

Whenever one works with automatic model selection, one has to be careful about one’s choices. A statistician should keep this in mind at all times, but such models can at least always be used for comparison purposes. Even though X12-ARIMA is available as a separate binary, being able to use it directly from R is certainly an advantage. Alexander Kowarik and Angelika Meraner wrote a package for R, called x12, that allows X12 to be run directly from R. It works pretty nicely and provides a single estimation example of running X12 in R, with some explanations on using the options. However, when I first encountered it, the package turned out to have a couple of particularities which took me some time to figure out, as very little documentation is available on the internet.

Hopefully, these may be of use to someone who just got their hands on this package:

  • when using x12, your data has to be a ts object; xts or zoo won’t do
  • the example available with the package contains integer-only data on passengers. Since I was working with log transformations, I couldn’t figure out why my data fitting wasn’t working. It turned out that the numbers needed to be rounded to at most 7 digits. Even specifying an integer as the option for decimals does not seem to help. You just need to round the data yourself.
  • the regvariables option actually stands for predetermined variables, as explained in the manual, page 29. Having never worked with X12 before, I found the name a little confusing. If you want to introduce some external control variables, you have to put those in a plain text file (regfile), with one variable per column, and give their names as a character vector in the reguser option.
  • forecast_years and backcast_years take integer values as well as .25, .5 and .75, even for monthly data.
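
The data preparation points above can be sketched in base R. This is only a sketch under stated assumptions, not the package's documented workflow: the built-in AirPassengers series stands in for real data, "7 digits" is taken to mean 7 decimal places, the two dummy regressors (december, shift) and the file name regressors.txt are hypothetical, and the whitespace-separated, header-free file layout is a guess at what regfile expects. The x12 call itself is left out because its arguments differ between package versions.

```r
# Base R only; nothing here calls the x12 package itself.

# 1. The series must be a ts object. AirPassengers already is one;
#    a zoo or xts series can usually be converted with as.ts().
y <- AirPassengers
stopifnot(is.ts(y))

# 2. Log-transform, then round by hand.
#    Assumption: "7 digits" means 7 decimal places.
y_log <- round(log(y), 7)

# 3. External regressors: one column per variable in a plain text file.
#    Both dummies are hypothetical and only illustrate the layout.
regs <- data.frame(
  december = as.integer(cycle(y) == 12),   # 1 in every December
  shift    = as.integer(time(y) >= 1955)   # 1 from January 1955 onwards
)

# Assumed layout: whitespace-separated columns, no header, no row names.
regfile <- file.path(tempdir(), "regressors.txt")
write.table(regs, file = regfile, sep = " ",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

# Names of the columns, in file order, as a character vector.
reguser <- names(regs)
```

Here y_log is the series to hand to x12, while regfile and reguser hold the values that the regfile and reguser options described above would receive.
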

As a general remark, the way X12 handles outliers can matter a lot. This is quite well explained in its reference manual, and one should investigate the differences that arise from the four options for outlier detection, i.e. outlier, critical, outlier_span and outlier_method.

EDIT: since this blog post was written, an updated version of the package has been released, and some of the points discussed are now either different or no longer relevant. For example, rounding doesn’t seem to be an issue anymore, and the options for external regressors are now regression.user and regression.file. Hats off to Alexander and Angelika for their great work!