[R] Searching for specific values in a matrix

Steve Lianoglou mailinglist.honeypot at gmail.com
Mon Jul 27 21:15:46 CEST 2009


On Jul 27, 2009, at 2:54 PM, Mehdi Khan wrote:

> i am able to return the first column, but anything else returns this:
> <0 rows> (or 0-length row.names)
>
> any idea?

I'm not sure what you're doing.

The result you're getting happens when no rows "pass" the logical test  
that you are using to index the rows of your data.frame.

Can you show the code that you are using (based on the example data  
you gave) that is giving you the <0 rows> result?
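
In the meantime, here is one common way to end up with zero rows, as a
minimal sketch on made-up data (the data frame `d`, its values, and the
tolerance `tol` are my assumptions, not your data). A number is printed
rounded, so an exact `==` test against the printed value matches no row;
comparing within a tolerance avoids that. I can't tell yet whether this is
what is happening in your case.

```r
# Hypothetical data: the stored values carry more digits than R prints
# by default (7 significant digits).
d <- data.frame(id  = 1:3,
                LAT = c(38.276451234, 38.363041234, 37.959781234))

# Exact comparison against the rounded, printed value: every test is FALSE,
# so the result is a data.frame with zero rows.
exact <- d[d$LAT == 37.95978, ]
nrow(exact)

# Comparing within a small tolerance (an assumed value) finds the row.
tol <- 1e-5
close <- d[abs(d$LAT - 37.95978) < tol, ]
nrow(close)
```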

-steve

>
> On Tue, Jul 21, 2009 at 12:49 PM, Steve Lianoglou <mailinglist.honeypot at gmail.com> wrote:
>
> On Jul 21, 2009, at 3:27 PM, Mehdi Khan wrote:
>
> I understand your explanation about the test for even numbers.   
> However I am still a bit confused as to how to go about finding a  
> particular value.  Here is an example data set
>
> col #     attr1     attr2 attr3       LON      LAT
> 17209         D        NA    NA -122.9409 38.27645
> 17210        BC        NA    NA -122.9581 38.36304
> 17211         B        NA    NA -123.6851 41.67121
> 17212        BC        NA    NA -123.0724 38.93073
> 17213         C        NA    NA -123.7240 41.84403
> 17214      <NA>       464    NA -122.9430 38.30988
> 17215         C        NA    NA -123.4442 40.65369
> 17216        BC        NA    NA -122.9389 38.31551
> 17217         C        NA    NA -123.0747 38.97998
> 17218         C        NA    NA -123.6580 41.59610
> 17219         C        NA    NA -123.4513 40.70992
> 17220         C        NA    NA -123.0901 39.06473
> 17221        BC        NA    NA -123.0653 38.94845
> 17222        BC        NA    NA -122.9464 38.36808
> 17223      <NA>       464    NA -123.0143 38.70205
> 17224      <NA>        NA     5 -122.8609 37.94137
> 17225      <NA>        NA     5 -122.8628 37.95057
> 17226      <NA>        NA     7 -122.8646 37.95978
>
> For future reference, perhaps paste this in a way that's easy for us  
> to paste into a running R session so we can use it, like so:
>
> df <- data.frame(
>   coln=c(17209, 17210, 17211, 17212, 17213, 17214, 17215, 17216, 17217,
>          17218, 17219, 17220, 17221, 17222, 17223, 17224, 17225, 17226),
>   attr1=c("D","BC","B","BC","C",NA,"C","BC","C","C","C","C","BC","BC",
>           NA,NA,NA,NA),
>   attr2=c(NA,NA,NA,NA,NA,464,NA,NA,NA,NA,NA,NA,NA,NA,464,NA,NA,NA),
>   attr3=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,5,5,7),
>   LON=c(-122.9409,-122.9581,-123.6851,-123.0724,-123.7240,-122.9430,
>         -123.4442,-122.9389,-123.0747,-123.6580,-123.4513,-123.0901,
>         -123.0653,-122.9464,-123.0143,-122.8609,-122.8628,-122.8646),
>   LAT=c(38.27645,38.36304,41.67121,38.93073,41.84403,38.30988,
>         40.65369,38.31551,38.97998,41.59610,40.70992,39.06473,
>         38.94845,38.36808,38.70205,37.94137,37.95057,37.95978))
>
>
> If I wanted to find the row with Lat = 37.95978
>
> Using an "indexing vector":
>
> R> lats <- df$LAT == 37.95978
> # or with the %~% from before:
> # lats <- df$LAT %~% 37.95978
> R> df[lats,]
>    coln attr1 attr2 attr3       LON      LAT
> 18 17226  <NA>    NA     7 -122.8646 37.95978
>
> Using the "subset" function:
>
> R> subset(df, LAT == 37.95978)
>    coln attr1 attr2 attr3       LON      LAT
> 18 17226  <NA>    NA     7 -122.8646 37.95978
>
>
> , how would i do that?  How would  I find the rows with BC?
>
> R> subset(df, attr1 == 'BC')
>    coln attr1 attr2 attr3       LON      LAT
> 2  17210    BC    NA    NA -122.9581 38.36304
> 4  17212    BC    NA    NA -123.0724 38.93073
> 8  17216    BC    NA    NA -122.9389 38.31551
> 13 17221    BC    NA    NA -123.0653 38.94845
> 14 17222    BC    NA    NA -122.9464 38.36808
>
>
> If you try with an "indexing vector" the NA's will trip you up:
>
> R> df[df$attr1 == 'BC',]
>      coln attr1 attr2 attr3       LON      LAT
> 2    17210    BC    NA    NA -122.9581 38.36304
> 4    17212    BC    NA    NA -123.0724 38.93073
> NA      NA  <NA>    NA    NA        NA       NA
> 8    17216    BC    NA    NA -122.9389 38.31551
> 13   17221    BC    NA    NA -123.0653 38.94845
> 14   17222    BC    NA    NA -122.9464 38.36808
> NA.1    NA  <NA>    NA    NA        NA       NA
> NA.2    NA  <NA>    NA    NA        NA       NA
> NA.3    NA  <NA>    NA    NA        NA       NA
> NA.4    NA  <NA>    NA    NA        NA       NA
>
> So you could do something like:
>
> R> df[df$attr1 == 'BC' & !is.na(df$attr1),]
>    coln attr1 attr2 attr3       LON      LAT
> 2  17210    BC    NA    NA -122.9581 38.36304
> 4  17212    BC    NA    NA -123.0724 38.93073
> 8  17216    BC    NA    NA -122.9389 38.31551
> 13 17221    BC    NA    NA -123.0653 38.94845
> 14 17222    BC    NA    NA -122.9464 38.36808
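
Two other base R idioms should give the same NA-free result as the
`!is.na()` guard above. This is a minimal sketch on a four-row excerpt of
the example data (the names `small`, `with_na`, `by_which` and `by_in` are
just for illustration): `which()` keeps only the positions that are TRUE,
and `%in%` returns FALSE rather than NA for missing values.

```r
# Four rows taken from the example data, enough to include one NA in attr1.
small <- data.frame(coln  = c(17209, 17210, 17214, 17216),
                    attr1 = c("D", "BC", NA, "BC"),
                    stringsAsFactors = FALSE)

# Plain logical indexing: the NA comparison yields NA, which adds an all-NA row.
with_na <- small[small$attr1 == "BC", ]

# which() drops the NA positions and returns only the indices that are TRUE.
by_which <- small[which(small$attr1 == "BC"), ]

# %in% never returns NA, so the logical index contains only TRUE and FALSE.
by_in <- small[small$attr1 %in% "BC", ]
```
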
>
>
> HTH,
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Physiology, Biophysics and Systems Biology
> Weill Medical College of Cornell University
>
> Contact Info: http://cbio.mskcc.org/~lianos/contact

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
   |  Memorial Sloan-Kettering Cancer Center
   |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



