[R] Improvement: function cut
Leonard Mada
|eo@m@d@ @end|ng |rom @yon|c@eu
Sat Sep 18 00:44:12 CEST 2021
The warning should be raised inside cut(), i.e. in .bincode(), which
cut() calls.
It should be generated whenever a finite value (i.e. not NA, NaN or
+/-Inf) does not fall into any of the bins.
If the user writes a script and does not want any warnings, they can
set warn = FALSE. Otherwise it would be very helpful to catch the error
immediately (rather than after a number of steps, or missing it
altogether).
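
Until something like this exists, the check can be approximated in user
code. A minimal sketch, assuming a hypothetical wrapper named cut_warn()
(neither that name nor the warn argument is part of base R; it relies
only on cut() returning NA for values that fall outside all bins):

```r
# Hypothetical wrapper around cut(); not part of base R.
cut_warn <- function(x, breaks, ..., warn = TRUE) {
  res <- cut(x, breaks, ...)
  if (warn) {
    # Finite inputs that were nevertheless not assigned to any bin.
    # NA, NaN and +/-Inf are deliberately ignored here.
    missed <- is.finite(x) & is.na(res)
    if (any(missed)) {
      warning(sum(missed), " finite value(s) not included in any bin",
              call. = FALSE)
    }
  }
  res
}

x <- c(0, 1, 5, 10, 11, NA, Inf)
# With the defaults the bins are (0,5] and (5,10], so 0 and 11 are missed
# and a warning is issued.
res <- cut_warn(x, breaks = c(0, 5, 10))
# Passing warn = FALSE silences the check, e.g. inside a script.
res_quiet <- cut_warn(x, breaks = c(0, 5, 10), warn = FALSE)
```

With include.lowest = TRUE passed through to cut(), the value 0 would
fall into the first bin and only 11 would be reported.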
Leonard
On 9/18/2021 1:28 AM, Jeff Newmiller wrote:
> Your objection that "the user has to suspect that some values were not included" applies equally to your proposed warn option. There are many ways to introduce NAs; in real projects, every analyst should suspect this problem.
>
> On September 17, 2021 3:01:35 PM PDT, Leonard Mada via R-help <r-help using r-project.org> wrote:
>> Thank you Andrew.
>>
>>
>> Is there any reason not to make include.lowest = TRUE the default?
>>
>>
>> Regarding the NA:
>>
>> The user still has to suspect that some values were not included and run
>> that test.
>>
>>
>> Leonard
>>
>>
>> On 9/18/2021 12:53 AM, Andrew Simmons wrote:
>>> Regarding your first point, argument 'include.lowest' already handles
>>> this specific case, see ?.bincode
>>>
>>> Your second point, maybe it could be helpful, but since both
>>> 'cut.default' and '.bincode' return NA if a value isn't within a bin,
>>> you could make something like this on your own.
>>> Might be worth pitching to R-bugs on the wishlist.
>>>
>>>
>>>
>>> On Fri, Sep 17, 2021, 17:45 Leonard Mada via R-help
>>> <r-help using r-project.org <mailto:r-help using r-project.org>> wrote:
>>>
>>> Hello List members,
>>>
>>>
>>> the following improvements would be useful for function cut (and
>>> .bincode):
>>>
>>>
>>> 1.) Argument: Include extremes
>>>     extremes = TRUE
>>>     if(right == FALSE) {
>>>         # include also right for last interval;
>>>     } else {
>>>         # include also left for first interval;
>>>     }
>>>
>>>
>>> 2.) Argument: warn = TRUE
>>>
>>> Warn if any values are not included in the intervals.
>>>
>>>
>>> Motivation:
>>> - reduce risk of errors when using function cut();
>>>
>>>
>>> Sincerely,
>>>
>>>
>>> Leonard
>>>
>>> ______________________________________________
>>> R-help using r-project.org <mailto:R-help using r-project.org> mailing list --
>>> To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> <https://stat.ethz.ch/mailman/listinfo/r-help>
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> <http://www.R-project.org/posting-guide.html>
>>> and provide commented, minimal, self-contained, reproducible code.
>>>