[R] Bug in gmodels CrossTable()?
Marc Schwartz
marc_schwartz at me.com
Sun May 31 15:41:05 CEST 2009
On May 31, 2009, at 7:51 AM, Jakson Alves de Aquino wrote:
> Is the code below showing a bug in CrossTable()? My expectation was that
> the values produced by xtabs were rounded instead of truncated:
>
> library(gmodels)
> abc <- c("a", "a", "b", "b", "c", "c")
> def <- c("d", "e", "f", "f", "d", "e")
> wgt <- c(0.8, 0.6, 0.4, 0.5, 1.4, 1.3)
>
> xtabs(wgt ~ abc + def)
>
> CrossTable(xtabs(wgt ~ abc + def), prop.r = F, prop.c = F,
> prop.t = F, prop.chisq = F)
CrossTable() is designed to take one or two vectors, which are then
[cross-]tabulated to yield integer counts, OR a matrix of integer
counts, not fractional values. In the latter case, it is presumed that
the matrix is the result of an 'a priori' cross-tabulation operation
such as the use of table().
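As a quick sketch of that distinction, using only base R and the vectors from the original post (the names 'counts' and 'weighted' are mine, purely for illustration): table() yields integer counts, while xtabs() with a weight on the left-hand side yields sums of the weights, stored as doubles.

```r
# Vectors from the original post
abc <- c("a", "a", "b", "b", "c", "c")
def <- c("d", "e", "f", "f", "d", "e")
wgt <- c(0.8, 0.6, 0.4, 0.5, 1.4, 1.3)

counts   <- table(abc, def)         # number of observations per cell
weighted <- xtabs(wgt ~ abc + def)  # sum of wgt per cell

is.integer(counts)    # TRUE: integer counts
is.integer(weighted)  # FALSE: fractional sums stored as doubles
```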
The output of xtabs() above is:
> xtabs(wgt ~ abc + def)
   def
abc   d   e   f
  a 0.8 0.6 0.0
  b 0.0 0.0 0.9
  c 1.4 1.3 0.0
The relevant output of CrossTable() in your example above shows:
             | def
         abc |         d |         e |         f | Row Total |
-------------|-----------|-----------|-----------|-----------|
           a |         0 |         0 |         0 |         1 |
-------------|-----------|-----------|-----------|-----------|
           b |         0 |         0 |         0 |         0 |
-------------|-----------|-----------|-----------|-----------|
           c |         1 |         1 |         0 |         2 |
-------------|-----------|-----------|-----------|-----------|
Column Total |         2 |         1 |         0 |         5 |
-------------|-----------|-----------|-----------|-----------|
The internal table object that would be generated here is effectively:
> addmargins(xtabs(wgt ~ abc + def))
     def
abc     d   e   f Sum
  a   0.8 0.6 0.0 1.4
  b   0.0 0.0 0.9 0.9
  c   1.4 1.3 0.0 2.7
  Sum 2.2 1.9 0.9 5.0
The textual output of CrossTable() is internally formatted using
formatC(..., format = "d"), which is an integer-based format:
> formatC(addmargins(xtabs(wgt ~ abc + def)), format = "d")
     def
abc   d e f Sum
  a   0 0 0   1
  b   0 0 0   0
  c   1 1 0   2
  Sum 2 1 0   5
In other words, you are getting the integer-coerced values of the
individual cells, and likewise for the row, column and table totals:
> matrix(as.integer(addmargins(xtabs(wgt ~ abc + def))), 4, 4)
     [,1] [,2] [,3] [,4]
[1,]    0    0    0    1
[2,]    0    0    0    0
[3,]    1    1    0    2
[4,]    2    1    0    5
If you review ?as.integer, you will note the following in the 'Value'
section:
Non-integral numeric values are truncated towards zero (i.e.,
as.integer(x) equals trunc(x) there)
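A small base R illustration of that difference, using literal values that match the cells and margins in the example (the vector 'x' is just for demonstration):

```r
# Cell values and margins from the example, entered as literals
x <- c(0.8, 0.6, 0.9, 1.4, 1.3, 2.7)

as.integer(x)     # 0 0 0 1 1 2  (truncated towards zero)
trunc(x)          # 0 0 0 1 1 2  (same values, stored as doubles)
round(x)          # 1 1 1 1 1 3  (what the original poster expected)

as.integer(-1.7)  # -1: truncation is towards zero, not downwards
```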
The output is correct, if confusing, but you are really using the
function in a fashion that is not intended.
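If rounded values are really what is wanted, one possible workaround (a sketch only, not something CrossTable() was designed around, and assuming the gmodels package is installed) is to round the weighted table yourself before passing it in. The name 'rounded' is mine:

```r
# Vectors from the original post
abc <- c("a", "a", "b", "b", "c", "c")
def <- c("d", "e", "f", "f", "d", "e")
wgt <- c(0.8, 0.6, 0.4, 0.5, 1.4, 1.3)

# Round each cell to a whole number; dimnames and class are kept
rounded <- round(xtabs(wgt ~ abc + def))
rounded

# Assumes gmodels is installed; the call is skipped otherwise
if (requireNamespace("gmodels", quietly = TRUE)) {
  gmodels::CrossTable(rounded, prop.r = FALSE, prop.c = FALSE,
                      prop.t = FALSE, prop.chisq = FALSE)
}
```

Note that the row and column totals are then computed from the rounded cells, so they need not equal the rounded totals of the weights. Row 'a', for example, becomes 1 + 1 + 0 = 2, whereas the weighted total of 1.4 would round to 1.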
HTH,
Marc Schwartz