[R] How to subset my dataframe? (a bit tricky)
Don MacQueen
macq at llnl.gov
Wed Jun 17 06:33:31 CEST 2009
I would probably try a different approach than the other suggestions.
Paste all the columns other than POND_ID together, row by row, so that
you have a character vector with one element per row. Then keep the
rows in which the element contains "dnv" but does not contain "0 dnv"
[use grep()]. This assumes there are no extraneous space characters in
any of the values. With this approach you'd also want to watch for
columns that have no "dnv" in them, as they might be stored as numeric
instead of character. And make sure all your zeros are formatted
exactly as "0".
The caveat is that I haven't tested this.
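
A minimal sketch of the paste-and-grep idea, using the four example rows from the question below. The names df, value_cols, pasted and result are made up for illustration, and the columns are assumed to have been read in as character. It uses grepl(), the logical-valued relative of grep(), and the pattern "(^| )0 dnv" rather than the bare string "0 dnv" so that a value such as "10" followed by "dnv" is not mistaken for a zero; that tightening is an assumption on top of the description above.

```r
# Example rows from the question, with every value column stored as character.
df <- data.frame(
  POND_ID      = c(101, 102, 103, 104),
  `2009-05-07` = c("0.15", "0",   "0.15", "dnv"),
  `2009-05-15` = c("0",    "dnv", "dnv",  "0.25"),
  `2009-05-21` = c("dnv",  "dnv", "1",    "1"),
  `2009-05-28` = c("dnv",  "dnv", "1",    "1"),
  `2009-06-04` = c("dnv",  "dnv", "1",    "0.75"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# All columns except the identifier.
value_cols <- setdiff(names(df), "POND_ID")

# Coerce each column to character (guards against all-numeric columns),
# then paste row by row with a single space between values.
cols   <- unname(lapply(df[value_cols], as.character))
pasted <- do.call(paste, c(cols, sep = " "))

# Keep rows that contain "dnv" but have no "0" immediately before a "dnv".
has_dnv     <- grepl("dnv", pasted, fixed = TRUE)
zero_before <- grepl("(^| )0 dnv", pasted)
result      <- df[has_dnv & !zero_before, ]

print(result$POND_ID)
```

On these four rows the filter keeps POND_ID 103 and 104. Note that a row containing a "0" then "dnv" sequence anywhere, such as the two "extra twist" rows in the question, would be dropped by this filter as written.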
-Don
At 12:26 PM -0600 6/16/09, Mark Na wrote:
>Hi R-helpers,
>
>I would like to subset my dataframe, keeping only those rows which
>satisfy the following conditions:
>
>1) the string "dnv" is found in at least one column;
>2) the value in the column immediately before the one containing "dnv" is not "0".
>
>Here's what my data look like:
>
>    POND_ID 2009-05-07 2009-05-15 2009-05-21 2009-05-28 2009-06-04
>4       101       0.15          0        dnv        dnv        dnv
>7       102          0        dnv        dnv        dnv        dnv
>87      103       0.15        dnv          1          1          1
>99      104        dnv       0.25          1          1       0.75
>
>So, for the above example, the new dataframe would not contain POND_ID 101
>or 102 (because there is a 0 before the dnv) but it WOULD contain
>POND_ID 103 (because there is a 0.15 before the dnv) and 104 (because
>dnv occurs in the first column, so cannot be preceded by a 0).
>
>One extra twist: I would like to retain rows in the new dataframe
>which satisfy the above conditions even if they also have a "0" then
>"dnv" sequence preceding or following the "problem", e.g., the
>following rows would be retained in the new dataframe:
>
>    POND_ID 2009-05-07 2009-05-15 2009-05-21 2009-05-28 2009-06-04
>100     105       0.15        dnv          1          0        dnv
>101     106          0        dnv          1       0.15        dnv
>
>Thanks in advance for any help you might provide.
>
>(I hope I've provided enough of an example; I could also provide a
>.csv file if that would help.)
>
>Mark Na
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
--
---------------------------------
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
macq at llnl.gov