[R] Remove
Jeff Newmiller
jdnewmil at dcn.davis.ca.us
Sat Dec 9 07:27:04 CET 2017
In this case I cannot see an advantage to using dplyr over subset, unless
dplyr is your hammer and every problem looks like a nail, or unless this is
one step in a larger workflow where dplyr is more useful.
Nor do I think this is a good use for mapply (or dplyr::group_by), because
each group is filtered by different limits. It is better to introduce a
data-driven columnar approach than to run three separate algorithms and
bind the data frames together again.
Here are three ways I came up with. I sometimes use a variation of method
3 when the logical tests are rather more complicated than this and I want
to characterize those tests in the final report.
####### reprex
DM <- read.table( text =
"GR x y
A 25 125
A 23 135
A 14 145
A 35 230
B 45 321
B 47 512
B 53 123
B 55 451
C 61 521
C 68 235
C 85 258
C 80 654", header = TRUE, stringsAsFactors = FALSE )
# 1 Hardcoded logic
DM1 <- subset( DM
             ,   "A" == GR & 15 <= x & x <= 30
               | "B" == GR & 40 <= x & x <= 50
               | "C" == GR & 60 <= x & x <= 75
             )
DM1
#>    GR  x   y
#> 1   A 25 125
#> 2   A 23 135
#> 5   B 45 321
#> 6   B 47 512
#> 9   C 61 521
#> 10  C 68 235
# 2 relational approach
cond <- read.table( text =
"GR minx maxx
A 15 30
B 40 50
C 60 75
", header = TRUE )
DM2 <- merge( DM, cond, by = "GR" )
DM2 <- subset( DM2, minx <= x & x <= maxx, select = -c( minx, maxx ) )
DM2
#>    GR  x   y
#> 1   A 25 125
#> 2   A 23 135
#> 5   B 45 321
#> 6   B 47 512
#> 9   C 61 521
#> 10  C 68 235
# 3 Construct selection vector
sel <- rep( FALSE, nrow( DM ) )
# seq_len() is safe even if cond has zero rows
for ( i in seq_len( nrow( cond ) ) ) {
  sel <- sel | (   cond$GR[ i ] == DM$GR
                 & cond$minx[ i ] <= DM$x
                 & DM$x <= cond$maxx[ i ]
               )
}
DM3 <- DM[ sel, ]
DM3
#>    GR  x   y
#> 1   A 25 125
#> 2   A 23 135
#> 5   B 45 321
#> 6   B 47 512
#> 9   C 61 521
#> 10  C 68 235
#######
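
As a rough illustration of the "characterize those tests" idea mentioned above, here is a minimal sketch of one possible variation of method 3. It keeps one logical column per rule, so the report can say how many rows each rule retained. The names tests, hits and DM3b are made up for this example, and the column labels are just one way to tag the rules.

```r
# Sketch only: tests, hits and DM3b are illustrative names,
# not part of the methods above.
DM <- read.table( text =
"GR x y
A 25 125
A 23 135
A 14 145
A 35 230
B 45 321
B 47 512
B 53 123
B 55 451
C 61 521
C 68 235
C 85 258
C 80 654", header = TRUE, stringsAsFactors = FALSE )
cond <- read.table( text =
"GR minx maxx
A 15 30
B 40 50
C 60 75
", header = TRUE, stringsAsFactors = FALSE )

# One logical column per rule: does each row of DM satisfy rule i?
tests <- sapply( seq_len( nrow( cond ) ), function( i ) {
  cond$GR[ i ] == DM$GR & cond$minx[ i ] <= DM$x & DM$x <= cond$maxx[ i ]
} )
colnames( tests ) <- paste0( cond$GR, ":", cond$minx, "-", cond$maxx )

# A row is kept if any rule accepts it (same selection as method 3)
sel <- rowSums( tests ) > 0
DM3b <- DM[ sel, ]

# Number of rows retained by each rule, e.g. for a summary table
hits <- colSums( tests )
hits
```

Because tests is an ordinary logical matrix, it can also be bound onto DM with cbind() if the per-row detail is wanted rather than just the counts.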
On Fri, 8 Dec 2017, Michael Hannon wrote:
> library(dplyr)
>
> DM <- read.table( text='GR x y
> A 25 125
> A 23 135
> .
> .
> .
> )
>
> DM %>% filter((GR == "A" & (x >= 15) & (x <= 30)) |
> (GR == "B" & (x >= 40) & (x <= 50)) |
> (GR == "C" & (x >= 60) & (x <= 75)))
>
>
> On Fri, Dec 8, 2017 at 4:48 PM, Ashta <sewashm at gmail.com> wrote:
>> Hi David, Ista and all,
>>
>> I have one related question. Within each group I want to keep records
>> conditionally.
>> For example, within
>> group A I want to keep rows whose "x" values range between 15 and 30,
>> group B I want to keep rows whose "x" values range between 40 and 50,
>> group C I want to keep rows whose "x" values range between 60 and 75.
>>
>>
>> DM <- read.table( text='GR x y
>> A 25 125
>> A 23 135
>> A 14 145
>> A 35 230
>> B 45 321
>> B 47 512
>> B 53 123
>> B 55 451
>> C 61 521
>> C 68 235
>> C 85 258
>> C 80 654',header = TRUE, stringsAsFactors = FALSE)
>>
>>
>> The end result will be
>> A 25 125
>> A 23 135
>> B 45 321
>> B 47 512
>> C 61 521
>> C 68 235
>>
>> Thank you
>>
>> On Wed, Dec 6, 2017 at 10:34 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>>>
>>>> On Dec 6, 2017, at 4:27 PM, Ashta <sewashm at gmail.com> wrote:
>>>>
>>>> Thank you Ista! Worked fine.
>>>
>>> Here's another (possibly more direct in its logic?):
>>>
>>> DM[ !ave(DM$x, DM$GR, FUN= function(x) {!length(unique(x))==1}), ]
>>> GR x y
>>> 5 B 25 321
>>> 6 B 25 512
>>> 7 B 25 123
>>> 8 B 25 451
>>>
>>> --
>>> David
>>>
>>>> On Wed, Dec 6, 2017 at 5:59 PM, Ista Zahn <istazahn at gmail.com> wrote:
>>>>> Hi Ashta,
>>>>>
>>>>> There are many ways to do it. Here is one:
>>>>>
>>>>> vars <- sapply(split(DM$x, DM$GR), var)
>>>>> DM[DM$GR %in% names(vars[vars > 0]), ]
>>>>>
>>>>> Best
>>>>> Ista
>>>>>
>>>>> On Wed, Dec 6, 2017 at 6:58 PM, Ashta <sewashm at gmail.com> wrote:
>>>>>> Thank you Jeff,
>>>>>>
>>>>>> subset( DM, "B" != x ), this works if I know the group only.
>>>>>> But if I don't know that group in this case "B", how do I identify
>>>>>> group(s) that all elements of x have the same value?
>>>>>>
>>>>>> On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>>>>>>> subset( DM, "B" != x )
>>>>>>>
>>>>>>> This is covered in the Introduction to R document that comes with R.
>>>>>>> --
>>>>>>> Sent from my phone. Please excuse my brevity.
>>>>>>>
>>>>>>> On December 6, 2017 3:21:12 PM PST, David Winsemius <dwinsemius at comcast.net> wrote:
>>>>>>>>
>>>>>>>>> On Dec 6, 2017, at 3:15 PM, Ashta <sewashm at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>> In a data set I have group(GR) and two variables x and y. I want to
>>>>>>>>> remove a group that have the same record for the x variable in each
>>>>>>>>> row.
>>>>>>>>>
>>>>>>>>> DM <- read.table( text='GR x y
>>>>>>>>> A 25 125
>>>>>>>>> A 23 135
>>>>>>>>> A 14 145
>>>>>>>>> A 12 230
>>>>>>>>> B 25 321
>>>>>>>>> B 25 512
>>>>>>>>> B 25 123
>>>>>>>>> B 25 451
>>>>>>>>> C 11 521
>>>>>>>>> C 14 235
>>>>>>>>> C 15 258
>>>>>>>>> C 10 654',header = TRUE, stringsAsFactors = FALSE)
>>>>>>>>>
>>>>>>>>> In this example the output should contain group A and C as group B
>>>>>>>>> has the same record for the variable x .
>>>>>>>>>
>>>>>>>>> The result will be
>>>>>>>>> A 25 125
>>>>>>>>> A 23 135
>>>>>>>>> A 14 145
>>>>>>>>> A 12 230
>>>>>>>>> C 11 521
>>>>>>>>> C 14 235
>>>>>>>>> C 15 258
>>>>>>>>> C 10 654
>>>>>>>>
>>>>>>>> Try:
>>>>>>>>
>>>>>>>> DM[ !duplicated(DM$x) , ]
>>>>>>>>>
>>>>>>>>> How do I do it R?
>>>>>>>>> Thank you.
>>>>>>>>>
>>>>>>>>> ______________________________________________
>>>>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>>> PLEASE do read the posting guide
>>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>>>
>>>>>>>> David Winsemius
>>>>>>>> Alameda, CA, USA
>>>>>>>>
>>>>>>>> 'Any technology distinguishable from magic is insufficiently advanced.'
>>>>>>>> -Gehm's Corollary to Clarke's Third Law
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k