[R] how to do something like " subset(mat, ("col1">4 & "col2">4)) "
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Fri Sep 9 16:23:22 CEST 2005
Florence Combes <fcombes at gmail.com> writes:
> Dear all,
>
> I have a problem with the "subset()" function. I spent all day yesterday
> with a colleague trying to solve it and we did not find a satisfying solution
> (even in the archived mails), so I ask for your help.
> Let's say (for a simple example) a matrix mat:
>
> R> mat
>      cola colb colc
> [1,]    1    4    7
> [2,]    2    5    8
> [3,]    3    6    9
>
> My goal is to select the lines of the matrix on the basis of the values of
> more than one column (let's say colb and colc).
> For example I want to select all the lines of the matrix for which values in
> colb and colc are more than 4.
>
> I tried several ways that did not work:
>
> R> mat2 <- subset(mat, ("colb">4 & "colc">4))
> R> mat2
> [1] 1 2 3 4 5 6 7 8 9
>
> it is a vector, not a matrix.
>
> > mat2 <- subset(mat, mat[,2:3]>4)
> > mat2
> [1] 2 3 4 5 6 8 9
>
> the same: it is a vector; so I tried:
>
> > mat2 <- as.matrix(subset(mat, mat[,("colb">4 & "colc">4)]))
> > mat2
>      [,1]
> [1,]    1
> [2,]    2
> [3,]    3
> [4,]    4
> [5,]    5
> [6,]    6
> [7,]    7
> [8,]    8
> [9,]    9
>
> not good :(
>
> Does anyone have an idea of how to select only lines 2 and 3 of mat
> by a selection on "colb" and "colc" >4 ?
Well, subset has methods for vectors and data frames, so what happens
for matrices is basically that they get converted to vectors. I don't
know what gave you the idea of quoting the names, but
"colb">4
is TRUE because the 4 is coerced to the string "4", and digits sort before letters!
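To make the point about quoted names concrete, here is a small sketch of what R evaluates in that case. The variable names (cmp, cmp_explicit, cond) are only illustrative and are not part of the original exchange; the result assumes a usual locale in which digits collate before letters.

```r
# Comparing a string with a number coerces the number to a string,
# so this is a string comparison and never looks at any column.
cmp <- "colb" > 4
cmp_explicit <- "colb" > "4"   # the comparison R effectively performs

# The full condition from the question is therefore a single TRUE,
# which subset() recycles over every element of the (vectorised) matrix.
cond <- "colb" > 4 & "colc" > 4
print(cond)
```

Because the condition is one TRUE rather than one value per row, every element is kept, which is consistent with the 1 to 9 vector shown in the question.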
Try something like
as.matrix(subset(as.data.frame(mat),colb>4 & colc>4))
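As a self-contained sketch, the suggestion can be tried on the example matrix from the question. The matrix is rebuilt here with matrix() and dimnames, which is an assumption about how it was created; the second approach (logical row indexing with drop = FALSE) is an alternative added for comparison and is not part of the reply above.

```r
# Rebuild the 3 x 3 example matrix (values are filled column by column)
mat <- matrix(1:9, nrow = 3,
              dimnames = list(NULL, c("cola", "colb", "colc")))

# Route via a data frame: subset() can then evaluate the unquoted
# column names colb and colc row by row
res_df <- as.matrix(subset(as.data.frame(mat), colb > 4 & colc > 4))

# Alternative: build one logical value per row and index the matrix;
# drop = FALSE keeps the result a matrix even if only one row matches
keep <- mat[, "colb"] > 4 & mat[, "colc"] > 4
res_ix <- mat[keep, , drop = FALSE]

print(res_df)
print(res_ix)
```

Both results contain rows 2 and 3 of mat; they may differ only in their row names, since the data frame route can carry the original row numbers along.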
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907