[R] [ifelse] how to maintain a value from original matrix without probs?

Fri Oct 31 18:09:12 CET 2008

on 10/31/2008 09:59 AM Diogo André Alagador wrote:
> Dear all,
>  
> I have a matrix with positive and negative values.
> From this I would like to produce 2 matrices:
> 1st - retaining positives and putting NA in other positions
> 2nd - retaining negatives and putting NA in other positions
>  
> and then apply rowMeans for both.
>  
> I am trying to use the function ifelse in the exemplified form:
> ifelse(A>0,A,NA)
> but by putting A as a 2nd parameter it changes dimensions of the original
> object.
>  
> I wonder if I can do this, as it seems not too difficult.
>  
> thanks in advance

A couple of approaches, depending upon the size of the matrix.

The first, if the matrix is "small-ish":

set.seed(1)
mat <- matrix(sample(-10:10, 100, replace = TRUE), ncol = 10)

> mat
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]   -5   -6    9    0    7    0    9   -3   -1    -5
 [2,]   -3   -7   -6    2    3    8   -4    7    4    -9
 [3,]    2    4    3    0    6   -1   -1   -3   -2     3
 [4,]    9   -2   -8   -7    1   -5   -4   -3   -4     8
 [5,]   -6    6   -5    7    1   -9    3    0    5     6
 [6,]    8    0   -2    4    6   -8   -5    8   -6     6
 [7,]    9    5  -10    6  -10   -4    0    8    4    -1
 [8,]    3   10   -2   -8    0    0    6   -2   -8    -2
 [9,]    3   -3    8    5    5    3   -9    6   -5     7
[10,]   -9    6   -3   -2    4   -2    8   10   -7     2

> apply(mat, 1, function(x) mean(x[x > 0], na.rm = TRUE))
 [1] 8.333333 4.800000 3.600000 6.000000 4.666667 6.400000 6.400000
 [8] 6.333333 5.285714 6.000000

> apply(mat, 1, function(x) mean(x[x < 0], na.rm = TRUE))
 [1] -4.000000 -5.800000 -1.750000 -4.714286 -6.666667 -5.250000
 [7] -6.250000 -4.400000 -5.666667 -4.600000

This way, you avoid splitting the matrix. You did not specify how you
might want 0's to be handled, so adjust the logic above accordingly.
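
As a rough sketch of what adjusting the logic could look like, the example
below uses a small made-up matrix (m is hypothetical, not the matrix above)
and compares the strict test (x > 0) with one that counts zeros (x >= 0).
It also shows that a row with no qualifying values yields NaN, since mean()
of an empty vector is NaN:

```r
# Hypothetical 3 x 4 matrix, chosen so that row 2 has no positive values
m <- matrix(c( 1, -2,  0,  4,
              -1, -3,  0, -5,
               2,  0,  6,  0),
            nrow = 3, byrow = TRUE)

# Strictly positive values only, as in the apply() call above
row_mean_pos <- apply(m, 1, function(x) mean(x[x > 0]))

# Variant that treats zeros as part of the "positive" group
row_mean_nonneg <- apply(m, 1, function(x) mean(x[x >= 0]))

# Row 2 of row_mean_pos is NaN because nothing in that row is > 0
print(row_mean_pos)
print(row_mean_nonneg)
```

Whether NaN is acceptable for such rows, or should be replaced with NA or
some other value, depends on what is done with the means afterwards.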

If the matrix is large, such that splitting it and using rowMeans()
would be faster:

mat.pos <- mat
is.na(mat.pos) <- mat <= 0

> mat.pos
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]   NA   NA    9   NA    7   NA    9   NA   NA    NA
 [2,]   NA   NA   NA    2    3    8   NA    7    4    NA
 [3,]    2    4    3   NA    6   NA   NA   NA   NA     3
 [4,]    9   NA   NA   NA    1   NA   NA   NA   NA     8
 [5,]   NA    6   NA    7    1   NA    3   NA    5     6
 [6,]    8   NA   NA    4    6   NA   NA    8   NA     6
 [7,]    9    5   NA    6   NA   NA   NA    8    4    NA
 [8,]    3   10   NA   NA   NA   NA    6   NA   NA    NA
 [9,]    3   NA    8    5    5    3   NA    6   NA     7
[10,]   NA    6   NA   NA    4   NA    8   10   NA     2

> rowMeans(mat.pos, na.rm = TRUE)
 [1] 8.333333 4.800000 3.600000 6.000000 4.666667 6.400000 6.400000
 [8] 6.333333 5.285714 6.000000

mat.neg <- mat
is.na(mat.neg) <- mat >= 0

> mat.neg
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]   -5   -6   NA   NA   NA   NA   NA   -3   -1    -5
 [2,]   -3   -7   -6   NA   NA   NA   -4   NA   NA    -9
 [3,]   NA   NA   NA   NA   NA   -1   -1   -3   -2    NA
 [4,]   NA   -2   -8   -7   NA   -5   -4   -3   -4    NA
 [5,]   -6   NA   -5   NA   NA   -9   NA   NA   NA    NA
 [6,]   NA   NA   -2   NA   NA   -8   -5   NA   -6    NA
 [7,]   NA   NA  -10   NA  -10   -4   NA   NA   NA    -1
 [8,]   NA   NA   -2   -8   NA   NA   NA   -2   -8    -2
 [9,]   NA   -3   NA   NA   NA   NA   -9   NA   -5    NA
[10,]   -9   NA   -3   -2   NA   -2   NA   NA   -7    NA

> rowMeans(mat.neg, na.rm = TRUE)
 [1] -4.000000 -5.800000 -1.750000 -4.714286 -6.666667 -5.250000
 [7] -6.250000 -4.400000 -5.666667 -4.600000

See ?is.na and note the assignment variant.
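
For comparison, here is a minimal sketch on a small made-up matrix (m is
hypothetical) showing that the assignment form of is.na() gives the same
result as plain logical indexing, and that ifelse() applied to a true
matrix keeps its dimensions. If the original object is a data frame rather
than a matrix, ifelse() may behave differently, in which case converting
with as.matrix() first is one option to try:

```r
# Hypothetical 2 x 3 matrix for illustration
m <- matrix(c(-1, 2, 0, -4, 5, 3), nrow = 2)

# 1. Assignment form of is.na(), as used above
a <- m
is.na(a) <- m <= 0

# 2. Plain logical indexing
b <- m
b[m <= 0] <- NA

# 3. ifelse() on the matrix; the result takes its shape from the test
cc <- ifelse(m > 0, m, NA)

print(a)
print(identical(a, b))   # do forms 1 and 2 agree?
print(dim(cc))           # dimensions of the ifelse() result
```

All three can be passed to rowMeans(..., na.rm = TRUE) in the same way.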

HTH,

Marc Schwartz