[R] Effect of an optimized BLAS library on lm() function
Stéphane Tufféry
tuffery.stephane at gmail.com
Wed Aug 3 21:37:13 CEST 2016
Dear All,
My question is simple (but the answer perhaps less so): how does R's lm()
function benefit from an optimized BLAS library (such as ATLAS, OpenBLAS, or
the MKL library in R Open)? I had supposed that these BLAS optimizations
concentrate on the level-3 BLAS, while lm() relies on the level-1 BLAS
routines of LINPACK.
Nevertheless, the following test on my laptop seems to indicate
slightly faster calculations with lm() when using the MKL library.
# packages used below: microbenchmark, biglm, speedglm, bigmemory,
# biganalytics; fastLm() comes from RcppArmadillo or RcppEigen
> set.seed(123)
> m <- 100000
> n <- 100
> matest <- matrix(rnorm(m*n,0,2), m, n)
> y <- rnorm(m)
> x <- cbind(1, matest)
> test <- data.frame(matest, y = y)
> n <- names(test)
> f <- as.formula(paste("y ~", paste(n[!n %in% "y"], collapse = " + ")))
> bigtest <- as.big.matrix(cbind(matest, y))
> options(bigmemory.allow.dimnames=TRUE)
> colnames(bigtest) <- names(test)
> compare <- microbenchmark(lm(y ~ ., data=test), lm.fit(x,y),
+   biglm(f, data=test), bigglm(f, data=test, family=gaussian()),
+   speedglm(y ~ ., data=test, family=gaussian()),
+   fastLm(y ~ ., data=test, family=gaussian()),
+   biglm.big.matrix(f, data=bigtest, chunksize=100000),
+   solve(crossprod(x), crossprod(x,y)),
+   solve(t(x)%*%x, t(x)%*%y),
+   qr.coef(qr(x, LAPACK = T), y),
+   qr.coef(qr(x, LAPACK = F), y), times=30, unit="ms")
# on R base
> compare
Unit: milliseconds
                                                  expr       min        lq      mean    median        uq       max neval
                                lm(y ~ ., data = test) 2423.4618 2497.0986 2587.6717 2528.4108 2605.8664 3245.0846    30
                                          lm.fit(x, y) 1785.2536 1816.0466 1864.9683 1841.6295 1868.1580 2166.6797    30
                                 biglm(f, data = test) 1947.5808 2001.2834 2044.3119 2021.8473 2084.5147 2304.7108    30
           bigglm(f, data = test, family = gaussian()) 4461.9307 4544.2631 4726.1232 4629.1442 4705.2304 6240.9467    30
     speedglm(y ~ ., data = test, family = gaussian()) 1651.4826 1670.5076 1727.1039 1696.9502 1771.4395 1946.8591    30
       fastLm(y ~ ., data = test, family = gaussian()) 3939.3955 4051.2525 4214.4684 4149.4943 4301.5698 5037.1337    30
biglm.big.matrix(f, data = bigtest, chunksize = 1e+05) 2176.9082 2283.2670 2336.6303 2345.2608 2383.1442 2478.3213    30
                  solve(crossprod(x), crossprod(x, y))  839.3229  844.4637  867.8926  856.9103  882.7036  949.5098    30
                         solve(t(x) %*% x, t(x) %*% y) 1436.7063 1464.4559 1531.5375 1494.3093 1532.6056 1916.9786    30
                         qr.coef(qr(x, LAPACK = T), y) 1989.8013 2031.9686 2091.0366 2063.0931 2128.2705 2317.7407    30
                         qr.coef(qr(x, LAPACK = F), y) 1752.9953 1809.3616 1856.6803 1826.0688 1887.2954 2111.4479    30
# on R Open (2 threads)
> getMKLthreads()
[1] 2
> compare
Unit: milliseconds
                                                  expr        min         lq       mean     median        uq       max neval
                                lm(y ~ ., data = test) 1788.61291 1875.63929 1976.74912 1928.96796 2082.2125 2371.3407    30
                                          lm.fit(x, y) 1097.40423 1194.82479 1256.65020 1263.27286 1299.7869 1451.5524    30
                                 biglm(f, data = test) 1936.05859 1983.08455 2041.44009 2019.54653 2103.5658 2211.2873    30
           bigglm(f, data = test, family = gaussian()) 4707.12674 4795.69165 5005.35978 5023.22328 5193.1482 5340.9577    30
     speedglm(y ~ ., data = test, family = gaussian())  881.47945  934.77104  996.59551  975.93888 1038.1470 1212.8197    30
       fastLm(y ~ ., data = test, family = gaussian()) 1671.48208 1810.42245 1883.39515 1880.31249 1936.2377 2324.2279    30
biglm.big.matrix(f, data = bigtest, chunksize = 1e+05) 2196.43003 2266.86705 2339.93722 2355.17538 2401.8879 2585.1069    30
                  solve(crossprod(x), crossprod(x, y))   63.50803   70.25601   92.00605   74.73306  101.0346  363.4992    30
                         solve(t(x) %*% x, t(x) %*% y)  269.83961  298.29031  340.11531  320.85419  350.4334  541.5970    30
                         qr.coef(qr(x, LAPACK = T), y)  864.66990  941.90109 1000.73358  985.70631 1030.5733 1341.8595    30
                         qr.coef(qr(x, LAPACK = F), y) 1155.82456 1253.15555 1309.30060 1299.67349 1354.4523 1556.7010    30
# on R Open (1 thread)
> getMKLthreads()
[1] 1
> compare
Unit: milliseconds
                                                  expr       min        lq      mean    median        uq       max neval
                                lm(y ~ ., data = test) 1991.4313 2034.4098 2111.0498 2104.9173 2182.6686 2379.2731    30
                                          lm.fit(x, y) 1329.0017 1347.4155 1437.7724 1409.5693 1519.5183 1690.3965    30
                                 biglm(f, data = test) 1922.7935 1979.4146 2030.7355 2008.3577 2066.5151 2265.3957    30
           bigglm(f, data = test, family = gaussian()) 4474.7666 4534.6946 4589.0903 4594.6761 4637.1702 4744.8987    30
     speedglm(y ~ ., data = test, family = gaussian())  918.9388  959.5755 1015.1099 1000.7063 1066.6677 1178.5648    30
       fastLm(y ~ ., data = test, family = gaussian()) 1737.4001 1785.1347 1908.2960 1905.7776 1992.8247 2188.0665    30
biglm.big.matrix(f, data = bigtest, chunksize = 1e+05) 2172.3621 2242.7618 2309.5848 2317.5166 2362.6432 2434.8259    30
                  solve(crossprod(x), crossprod(x, y))  122.5138  125.0064  132.2125  128.8345  137.1886  157.8529    30
                         solve(t(x) %*% x, t(x) %*% y)  330.3578  336.0520  390.2391  379.9801  413.4324  530.6566    30
                         qr.coef(qr(x, LAPACK = T), y)  917.1596  943.9504  987.5824  971.1059 1001.2241 1231.8204    30
                         qr.coef(qr(x, LAPACK = F), y) 1313.7414 1348.3933 1439.4807 1480.5955 1518.4831 1569.0909    30
We notice that solve(crossprod(x), crossprod(x, y)) is faster than
solve(t(x) %*% x, t(x) %*% y), because crossprod(x) exploits the fact that
the matrix t(x) %*% x is symmetric and computes only one triangle of it.
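The equivalence is easy to check; a minimal sketch in base R (as far as I
know, crossprod() goes through the symmetric rank-k BLAS routine dsyrk,
while the explicit product goes through the general dgemm):

```r
set.seed(1)
x <- matrix(rnorm(2000 * 50), 2000, 50)
a <- crossprod(x)   # symmetric product, computed from one triangle
b <- t(x) %*% x     # explicit transpose + general matrix product
all.equal(a, b)     # same matrix, but crossprod() does about half the work
```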
We notice that solve(t(x) %*% x, t(x) %*% y) is faster than qr.coef(qr(x),
y): solving the small n x n normal equations (by a Cholesky factorization,
or by the LU factorization that solve() actually uses by default) requires
roughly half the floating-point operations of a QR decomposition of the
m x n matrix x.
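For comparison, a minimal sketch of the normal-equations route written out
with an explicit Cholesky factorization (chol() exploits the symmetry that
a plain solve() does not); the variable names are mine, not from the
benchmark above:

```r
set.seed(1)
m <- 1000; n <- 10
x <- cbind(1, matrix(rnorm(m * (n - 1)), m, n - 1))
y <- rnorm(m)
## X'X = R'R with R upper triangular, so X'X b = X'y
## reduces to two cheap triangular solves
R <- chol(crossprod(x))
beta <- backsolve(R, forwardsolve(t(R), crossprod(x, y)))
all.equal(drop(beta), unname(qr.coef(qr(x), y)))  # matches the QR solution
```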
We notice that qr.coef(qr(x, LAPACK = T), y) is faster than qr.coef(qr(x,
LAPACK = F), y), at least with an optimized BLAS library: the LAPACK path
spends its time in BLAS routines and so gains from the optimized library,
whereas the default path uses LINPACK code that does not.
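The two QR paths are interchangeable for coefficients; a minimal sketch
(note that qr(x, LAPACK = TRUE) uses LAPACK's column-pivoted, blocked QR,
and qr.coef() undoes the pivoting, so both give the same answer):

```r
set.seed(1)
x <- cbind(1, matrix(rnorm(5000 * 50), 5000, 50))
y <- rnorm(5000)
b_lapack  <- qr.coef(qr(x, LAPACK = TRUE), y)   # level-3 BLAS, pivoted
b_linpack <- qr.coef(qr(x, LAPACK = FALSE), y)  # level-1 LINPACK-style
all.equal(unname(b_lapack), unname(b_linpack))
```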
We notice that the calculation time of qr.coef(qr(x, LAPACK = F), y) is of
the same order as that of lm.fit(), which also uses a LINPACK-based QR
decomposition.
Of course, lm() takes longer than lm.fit(), because lm() has to process the
formula, build the model matrix, and perform additional computations.
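A minimal sketch of this overhead: lm() is essentially model.matrix()
followed by lm.fit(), so calling lm.fit() on a prebuilt model matrix skips
the formula machinery (names here are illustrative, not from the benchmark):

```r
set.seed(1)
d <- data.frame(matrix(rnorm(1000 * 5), 1000, 5), y = rnorm(1000))
mm <- model.matrix(y ~ ., data = d)   # the extra step lm() performs
fit_lm  <- lm(y ~ ., data = d)        # formula interface, more bookkeeping
fit_raw <- lm.fit(mm, d$y)            # bare QR fit on the numeric matrix
all.equal(unname(coef(fit_lm)), unname(fit_raw$coefficients))
```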
We also notice that the biglm() and bigglm() functions of the biglm package
do not benefit from the optimized MKL BLAS library, but that all the other
functions do, including lm() and lm.fit(), even with only one thread.
Sincerely,
Stéphane