[R] about ARMA(p,q) SCAN method: SAS vs. R
Steve Chen
steve at stat.tku.edu.tw
Tue Apr 6 11:42:27 CEST 2010
Hi all,
I am modifying a program I wrote earlier that performs the smallest canonical
correlation (SCAN) method for identifying ARMA(p,q) orders in time
series, but when I compared its output with SAS, I found some differences.
My SCAN R code can be downloaded from the following URL:
http://netstat.stat.tku.edu.tw/download/arma_scan_R.txt
I used Series R (the LA ozone data) from Box and Jenkins (4th edition) as
an example. A sample run can be done via
# ozone = scan("http://netstat.stat.tku.edu.tw/download/box_ozone.txt")
# source("http://netstat.stat.tku.edu.tw/download/arma_scan_R.txt")
# arma.scan(ozone)
First is the output of Squared Canonical Correlation Estimates:
SAS:
Squared Canonical Correlation Estimates
Lags    MA 0    MA 1    MA 2    MA 3    MA 4    MA 5
AR 0  0.5352  0.2423  0.0696  0.0035  0.0112  0.0183
AR 1  0.0074  0.0199  0.0304  0.0399  0.0185  0.0052
AR 2  0.0173  0.0005  0.0003  0.0167  0.0123  0.0198
AR 3  0.0190  0.0003  0.0002  0.0230  0.0026  0.0287
AR 4  0.0130  0.0262  0.0214  0.0054  0.0206  0.0302
AR 5  0.0143  0.0068  0.0229  0.0230  0.0171  0.0187
My R-code:
        MA-0    MA-1    MA-2    MA-3    MA-4    MA-5
AR-0  0.5264  0.2342  0.0668  0.0033  0.0105  0.0000
AR-1  0.0080  0.0197  0.0299  0.0399  0.0183  0.0052
AR-2  0.0158  0.0005  0.0003  0.0167  0.0122  0.0198
AR-3  0.0153  0.0003  0.0002  0.0229  0.0025  0.0283
AR-4  0.0099  0.0262  0.0214  0.0054  0.0204  0.0302
AR-5  0.0116  0.0066  0.0225  0.0229  0.0174  0.0190
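
The two tables of estimates above can be compared elementwise. The following is only a small sketch: the numbers are typed in from the two tables, and the object names `sas`, `rcode`, `abs_diff` and `worst` are made up for illustration (they are not part of arma_scan_R.txt).

```r
# Squared canonical correlation estimates copied from the two tables above
# (rows AR 0..5, columns MA 0..5).
sas <- matrix(c(
  0.5352, 0.2423, 0.0696, 0.0035, 0.0112, 0.0183,
  0.0074, 0.0199, 0.0304, 0.0399, 0.0185, 0.0052,
  0.0173, 0.0005, 0.0003, 0.0167, 0.0123, 0.0198,
  0.0190, 0.0003, 0.0002, 0.0230, 0.0026, 0.0287,
  0.0130, 0.0262, 0.0214, 0.0054, 0.0206, 0.0302,
  0.0143, 0.0068, 0.0229, 0.0230, 0.0171, 0.0187),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste("AR", 0:5), paste("MA", 0:5)))

rcode <- matrix(c(
  0.5264, 0.2342, 0.0668, 0.0033, 0.0105, 0.0000,
  0.0080, 0.0197, 0.0299, 0.0399, 0.0183, 0.0052,
  0.0158, 0.0005, 0.0003, 0.0167, 0.0122, 0.0198,
  0.0153, 0.0003, 0.0002, 0.0229, 0.0025, 0.0283,
  0.0099, 0.0262, 0.0214, 0.0054, 0.0204, 0.0302,
  0.0116, 0.0066, 0.0225, 0.0229, 0.0174, 0.0190),
  nrow = 6, byrow = TRUE, dimnames = dimnames(sas))

# Absolute difference between the SAS and R estimates, cell by cell
abs_diff <- abs(sas - rcode)
print(round(abs_diff, 4))

# Row/column position of the largest discrepancy
worst <- which(abs_diff == max(abs_diff), arr.ind = TRUE)
print(worst)
```

With the values as printed, the largest gap is in the AR 0, MA 5 cell, where the R output shows 0.0000 against 0.0183 from SAS; every other cell differs by less than 0.01.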
The results are similar. The main difference is in
the chi-square p-values:
SAS:
SCAN Chi-Square[1] Probability Values
Lags    MA 0    MA 1    MA 2    MA 3    MA 4    MA 5
AR 0  <.0001  <.0001  0.0148  0.6003  0.3472  0.2307
AR 1  0.2073  0.0407  0.0164  0.0183  0.2326  0.3313
AR 2  0.0532  0.7927  0.8537  0.1190  0.1934  0.2555
AR 3  0.0435  0.8326  0.8736  0.1273  0.5318  0.0537
AR 4  0.0960  0.0356  0.1365  0.4074  0.1100  0.0910
AR 5  0.0812  0.3110  0.0288  0.0997  0.1517  0.1440
My R-code:
Chi-Square(1) Test p-value
        MA-0    MA-1    MA-2    MA-3    MA-4    MA-5
AR-0  0.0000  0.0004  0.0749  0.6971  0.4903  0.0000
AR-1  0.1880  0.2355  0.1496  0.1129  0.3625  0.5475
AR-2  0.0648  0.8515  0.9024  0.3151  0.3900  0.3592
AR-3  0.0696  0.8813  0.9112  0.2237  0.6875  0.1978
AR-4  0.1458  0.1738  0.2666  0.5628  0.2827  0.2174
AR-5  0.1168  0.4962  0.2103  0.2507  0.3148  0.3021
I checked the original paper by Tsay and Tiao:
Tsay, R.S. and Tiao, G.C. (1985). Use of Canonical Analysis in Time
Series Model Identification. Biometrika, 72, 299-315.
and compared the formula with the SAS ETS manual, e.g.
http://support.sas.com/documentation/cdl/en/etsug/60372/HTML/default/etsug_arima_sect031.htm
I found that the formula for d(m,j) in the SAS manual is wrong. The correct
formula for d(m,j) should be something like
d(m,j) = 1 + 2*(r_1^2 + r_2^2 + ... + r_j^2)
but in the SAS ETS manual, it is
d(m,j) = 1 + 2*(r_1 + r_2 + ... + r_(j-1))
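
To see how much the choice of d(m,j) can move a p-value, the two versions above can be coded side by side. This is only a sketch under stated assumptions: the helper names `d_squared`, `d_manual` and `scan_pvalue` are hypothetical (not taken from arma_scan_R.txt or SAS); the test statistic is assumed to have the form c(m,j) = -(n - m - j) * log(1 - lambda/d(m,j)), referred to a chi-square distribution with 1 degree of freedom; the autocorrelations `r` are made-up numbers; and n = 216 is an assumed series length.

```r
# Hypothetical helpers. r is a vector of sample autocorrelations r_1, r_2, ...

# d(m,j) with squared autocorrelations summed up to lag j
d_squared <- function(r, j) 1 + 2 * sum(r[seq_len(j)]^2)

# d(m,j) as transcribed above from the SAS ETS manual:
# plain autocorrelations summed up to lag j - 1
d_manual <- function(r, j) 1 + 2 * sum(r[seq_len(max(j - 1, 0))])

# ASSUMPTION: c(m,j) = -(n - m - j) * log(1 - lambda / d(m,j)),
# compared with a chi-square distribution on 1 degree of freedom.
scan_pvalue <- function(lambda, d, n, m, j) {
  stat <- -(n - m - j) * log(1 - lambda / d)
  pchisq(stat, df = 1, lower.tail = FALSE)
}

# Made-up autocorrelations, for illustration only
r <- c(0.30, -0.20, 0.10)

# AR 1, MA 1 cell: lambda taken from the SAS table; n = 216 is an assumption
lambda <- 0.0199
n <- 216
m <- 1
j <- 1

p_squared <- scan_pvalue(lambda, d_squared(r, j), n, m, j)
p_manual  <- scan_pvalue(lambda, d_manual(r, j), n, m, j)
print(c(squared = p_squared, manual = p_manual))
```

Both versions give d(m,0) = 1 because the sums are empty when j = 0, so under the assumed statistic the MA 0 column depends only on the estimate and n. That column may therefore be a convenient place to check the two programs against each other before looking at d(m,j) for j >= 1.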
I plan to wrap my SCAN code and some other R code for time series into
a package, but given the p-value differences from the SAS output, I am not
sure whether my R code for SCAN is reliable enough for real applications.
Any suggestions? Thank you in advance.
Steve Chen
Associate Professor, Department of Statistics
Tamkang University, Taiwan