[R] Unexpected behavior of "apply" when FUN=sample

Enrico Schumann es at enricoschumann.net
Tue May 14 11:51:41 CEST 2013


On Tue, 14 May 2013, Luca Nanetti <luca.nanetti at gmail.com> writes:

> Dear experts,
>
> I wanted to signal a peculiar, unexpected behaviour of 'apply'. It is not a
> bug, it is per spec, but it is so counterintuitive that I thought it could
> be interesting.
>
> I have an array, let's say "test", dim=c(7,5).
>
>> test <- array(1:35, dim=c(7, 5))
>> test
>
>      [,1] [,2] [,3] [,4] [,5]
> [1,]    1    8   15   22   29
> [2,]    2    9   16   23   30
> [3,]    3   10   17   24   31
> [4,]    4   11   18   25   32
> [5,]    5   12   19   26   33
> [6,]    6   13   20   27   34
> [7,]    7   14   21   28   35
>
> I want a new array where the content of the rows (columns) are permuted,
> differently per row (per column)
>
> Let's start with the columns, i.e. the second MARGIN of the array:
>> test.m2 <- apply(test, 2, sample)
>> test.m2
>
>      [,1] [,2] [,3] [,4] [,5]
> [1,]    1   10   18   23   32
> [2,]    7    9   16   25   30
> [3,]    6   14   17   22   33
> [4,]    4   11   15   24   34
> [5,]    2   12   21   28   31
> [6,]    5    8   20   26   29
> [7,]    3   13   19   27   35
>
> perfect. That was exactly what I wanted: the content of each column is
> shuffled, and differently for each column.
> However, if I use the same with the rows (MARGIN = 1), the output is
> transposed!
>
>> test.m1 <- apply(test, 1, sample)
>> test.m1
>
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
> [1,]    1    2    3    4    5   13   21
> [2,]   22   30   17   18   19   20   35
> [3,]   15   23   24   32   26   27   14
> [4,]   29   16   31   25   33   34   28
> [5,]    8    9   10   11   12    6    7
>
> In other words, I wanted to permute the content of the rows of "test", and
> I expected to see in the output, well, the shuffled rows as rows, not as
> columns!
>
> I would respectfully suggest making this behavior more explicit in the
> documentation.

As you said yourself, this behaviour is documented:

  "If each call to ‘FUN’ returns a vector of length ‘n’, then ‘apply’
  returns an array of dimension ‘c(n, dim(X)[MARGIN])’ [...]"
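
To make the quoted rule concrete, here is a minimal sketch with the
'test' array from the original post (the names 'by.row' and 'by.col' are
only illustrative). With MARGIN = 1, each call to 'sample' gets one row
and returns a vector of length n = 5; there are dim(test)[1] = 7 such
calls, so the result has dimension c(5, 7).

```r
test <- array(1:35, dim = c(7, 5))

## MARGIN = 1: FUN sees one row at a time (length n = 5),
## and dim(test)[1] is 7
by.row <- apply(test, 1, sample)
dim(by.row)   # c(n, dim(X)[MARGIN]), i.e. 5 7

## MARGIN = 2: FUN sees one column at a time (length n = 7),
## and dim(test)[2] is 5
by.col <- apply(test, 2, sample)
dim(by.col)   # 7 5, which happens to be the shape of the input
```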

And it has nothing to do with 'sample'. Try:

  apply(test, 1, function(x) x)
  apply(test, 2, function(x) x)
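
A short check of these two calls, and one possible way to get the
shuffled rows back as rows: transpose the result with 't'. This is only
a sketch; the name 'test.m1' follows the original post.

```r
test <- array(1:35, dim = c(7, 5))

## with the identity function, MARGIN = 1 gives the transpose of 'test'
all(apply(test, 1, function(x) x) == t(test))
## while MARGIN = 2 gives back 'test' itself
all(apply(test, 2, function(x) x) == test)

## shuffle within each row, then transpose so that rows stay rows
test.m1 <- t(apply(test, 1, sample))
dim(test.m1)

## each row still holds the same values as before, only reordered
all(apply(test.m1, 1, sort) == apply(test, 1, sort))
```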

The result is only counterintuitive (or inconvenient, perhaps) in the
special case in which apply is supposed to return an array that has the
same dimension as its input.  More generally, you will do something like 

  apply(test, 1, median)
  apply(test, 1, function(x) list(sum = sum(x), values = x))

and in such cases, apply does not return an array: the first call gives
a plain vector, the second a list.
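
What these two calls return can be inspected as follows (a sketch; the
names 'med' and 'res' are only illustrative):

```r
test <- array(1:35, dim = c(7, 5))

## one number per row: a vector of length 7 without a 'dim' attribute
med <- apply(test, 1, median)
is.null(dim(med))
length(med)

## one list per row: 'apply' returns a list with 7 elements
res <- apply(test, 1, function(x) list(sum = sum(x), values = x))
is.list(res)
length(res)
res[[1]]$sum   # sum of the first row: 1 + 8 + 15 + 22 + 29 = 75
```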

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net


