[Rd] Bug in the "reformulate" function in stats package
Martin Maechler
m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Thu Apr 18 22:03:46 CEST 2019
>>>>> Ben Bolker
>>>>> on Thu, 18 Apr 2019 11:51:33 -0400 writes:
> Your file didn't make it through the mailing list (which is quite
> restrictive about which types/extensions it will take).
> I appreciate your enthusiasm and persistence for this issue, but I
> suspect you may have trouble convincing R-core to adopt your changes --
> they are "better", "easier", "more intuitive" for you ... but how sure
> are you they are completely backward compatible, have no performance
> issues, will not break in unusual cases ... ?
> Hopefully someone here will set up a bugzilla account so you can post
> your patch/it can be further discussed there, if you want to pursue this ...
This case was closed quite a while ago, thank you.
The changes will be in R 3.6.0, which is due in 8 days, not
least thanks to Ben's patch (earlier in this thread).
Martin Maechler
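
For readers who land on this post without the rest of the thread, the
backtick workaround quoted further down can be sketched roughly as
follows. The vector `z` is taken from the quoted example; the names `f`,
`g` and `vars`, and the all.vars() check, are only illustrative
additions and not part of the patch discussed here.

```r
## Column names that are not syntactic R names (they contain spaces)
z <- c("a variable", "another variable")

## Protect each label with backticks before passing it to reformulate()
f <- reformulate(sprintf("`%s`", z))
print(f)
## ~`a variable` + `another variable`

## all.vars() gives back the plain, un-ticked names
vars <- all.vars(f)

## Term labels only need to be parseable, not syntactic names
g <- reformulate("x<2")
```

Note that inside an *.Rd file the `%` has to be escaped as `\%`, as
described in the quoted message below.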
> cheers
> Ben Bolker
> On 2019-04-18 7:30 a.m., Saren Tasciyan wrote:
>> Hi,
>>
>> Sorry for writing this late, I was very busy. I started this discussion
>> here. I wish I could write to bugs.r-project.org, but I don't have an
>> account and I will write here instead.
>>
>> Meanwhile, I solved my problem with a simpler fix (please see attached
>> file).
>>
>> This requires that term labels are not "ticked". I think this is better,
>> since it is easier to have column names unticked.
>>
>> The new development function is, IMO, unnecessarily complicated. It requires
>> strings to be ticked or wrapped in as.name(). It is more intuitive to have a
>> vector of column names.
>>
>> Best,
>>
>> Saren
>>
>>
>> On 05.04.19 09:38, Martin Maechler wrote:
>>>>>>>> Ben Bolker
>>>>>>>> on Thu, 4 Apr 2019 12:46:37 -0400 writes:
>>> > Proposed patch
>>>
>>> Thank you Ben!
>>>
>>>
>>> [the rest is technical nit-picking .. but hopefully interesting
>>> to the smart R-devel reader base:]
>>>
>>> There was a very subtle thinko in your patch which is not easily
>>> diagnosed from R's parse_Rd():
>>>
>>> Error in
>>> parse_Rd("/u/maechler/R/D/r-devel/R/src/library/stats/man/delete.response.Rd",
>>> :
>>> Unexpected end of input (in " quoted string opened at
>>> delete.response.Rd:78:63)
>>> In addition: Warning message:
>>> In
>>> parse_Rd("/u/maechler/R/D/r-devel/R/src/library/stats/man/delete.response.Rd",
>>> :
>>> newline within quoted string at delete.response.Rd:74
>>>
>>> and even I needed more than a minute to find out that the
>>> culprit was that
>>>
>>> reformulate(sprintf("`%s`", x))
>>>
>>> is not ok in *.Rd and must be
>>>
>>> reformulate(sprintf("`\%s`", x))
>>>
>>> ---------
>>>
>>> > (I think .txt files work OK as attachments to the list?)
>>>
>>> yes, typically -- what really counts is if your e-mail program
>>> marks them with MIME-type 'text/plain'
>>> and most E-mail programs are very "silly" / "safe" nowadays and
>>> don't expect to have smart users and hence mark (and sometimes
>>> encode) everything unknown as non-text.
>>>
>>> Using very old flexible e-mail interfaces such as Emacs VM allows
>>> you to specify the MIME-type in addition to the file *and* it
>>> also proposes smart defaults, I think by using something like
>>> unix 'file' to determine that your 'foo.diff' file is plain text.
>>> {{ .. and we all know that Windows is sillily using file extensions
>>> to determine file type and only knows Windows-extensions plus
>>> those added explicitly by software installed; so nowadays *.rda
>>> is marked as an Rstudio file ... [argh].
>>> }}
>>>
>>> Martin
>>>
>>> > On 2019-04-04 2:21 a.m., Martin Maechler wrote:
>>> >>>>>>> Ben Bolker
>>> >>>>>>> on Fri, 29 Mar 2019 12:34:50 -0400 writes:
>>> >>
>>> >> > I suspect that the issue is addressed (obliquely) in the examples,
>>> >> > which show that variables with spaces in them (or otherwise
>>> >> > 'non-syntactic', i.e. not satisfying the constraints of legal R symbols)
>>> >> > can be handled by protecting them with backticks (``)
>>> >>
>>> >> > ## using non-syntactic names:
>>> >> > reformulate(c("`P/E`", "`% Growth`"), response = as.name("+-"))
>>> >>
>>> >> > It seems to me there could be room for a *documentation* patch (stating
>>> >> > explicitly that if termlabels has length > 1 its elements are
>>> >> > concatenated with "+", and explicitly stating that non-syntactic names
>>> >> > must be protected with back-ticks). (There is a little bit of obscurity
>>> >> > in the fact that the elements of termlabels don't have to be
>>> >> > syntactically valid names: many will be included in formulas if they can
>>> >> > be interpreted as *parseable* expressions, e.g. reformulate("x<2"))
>>> >>
>>> >> > I would be happy to give it a shot if the consensus is that it would
>>> >> > be worthwhile.
>>> >>
>>> >> I think it would be worthwhile to add to the docs a bit.
>>> >>
>>> >> [With currently just your and my vote, we have a 100% consensus
>>> >> ;-)]
>>> >>
>>> >> Martin
>>> >>
>>> >> > One workaround to the OP's problem is below (may be worth including
>>> >> > as an example in docs)
>>> >>
>>> >> >> z <- c("a variable","another variable")
>>> >> >> reformulate(z)
>>> >> > Error in parse(text = termtext, keep.source = FALSE) :
>>> >> > <text>:1:6: unexpected symbol
>>> >> > 1: ~ a variable
>>> >> > ^
>>> >> >> reformulate(sprintf("`%s`",z))
>>> >> > ~`a variable` + `another variable`
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> > On 2019-03-29 11:54 a.m., J C Nash wrote:
>>> >> >> The main thing is to post the "small reproducible example".
>>> >> >>
>>> >> >> My (rather long-term) experience can be written
>>> >> >>
>>> >> >> if (exists("reproducible example") ) {
>>> >> >> DeveloperFixHappens()
>>> >> >> } else {
>>> >> >> NULL
>>> >> >> }
>>> >> >>
>>> >> >> JN
>>> >> >>
>>> >> >> On 2019-03-29 11:38 a.m., Saren Tasciyan wrote:
>>> >> >>> Well, first I can't sign in bugzilla myself, that is why I wrote here
>>> >> >>> first. Also, I don't know if I have the time at the moment to provide
>>> >> >>> tests, multiple examples or more. If that is not ok or welcomed, that
>>> >> >>> is fine, I can come back, whenever I have more time to properly report
>>> >> >>> the bug.
>>> >> >>>
>>> >> >>> I didn't find the existing bug report, sorry for that.
>>> >> >>>
>>> >> >>> Yes, it is related. My problem was that I have column names with
>>> >> >>> spaces and the current solution doesn't solve it. I have a
>>> >> >>> solution, which works for me and maybe also for others.
>>> >> >>>
>>> >> >>> Either, someone can register me to bugzilla or I can post it here,
>>> >> >>> which could give some direction to developers. I
>>> >> >>> don't mind whichever is preferred here.
>>> >> >>>
>>> >> >>> Best,
>>> >> >>>
>>> >> >>> Saren
>>> >> >>>
>>> >> >>>
>>> >> >>> On 29.03.19 09:29, Martin Maechler wrote:
>>> >> >>>>>>>>> Saren Tasciyan
>>> >> >>>>>>>>> on Thu, 28 Mar 2019 17:02:10 +0100 writes:
>>> >> >>>> > Hi,
>>> >> >>>> > I have found a bug in reformulate function and have a solution
>>> >> >>>> > for it. I was wondering, where I can submit it?
>>> >> >>>>
>>> >> >>>> > Best,
>>> >> >>>> > Saren
>>> >> >>>>
>>> >> >>>>
>>> >> >>>> Well, you could have given a small reproducible example
>>> >> >>>> depicting the bug, notably when posting here:
>>> >> >>>> Just a prose text with no R code or other technical content is
>>> >> >>>> almost always not really appropriate for the R-devel mailing list.
>>> >> >>>>
>>> >> >>>> Further, in such a case you should google a bit and hopefully
>>> >> >>>> have found
>>> >> >>>> https://www.r-project.org/bugs.html
>>> >> >>>>
>>> >> >>>> which also mentions reproducibility (and many more useful things).
>>> >> >>>>
>>> >> >>>> Then it also tells you about R's bug repository, also called
>>> >> >>>> "R's bugzilla" at https://bugs.r-project.org/
>>> >> >>>>
>>> >> >>>> and if you are diligent (but here, I'd say bugzilla is
>>> >> >>>> (configured?) far from ideal), you'd also find bug PR#17359
>>> >> >>>>
>>> >> >>>>
>>> >> >>>> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17359
>>> >> >>>>
>>> >> >>>> which was reported already in Nov 2017 .. and only fixed
>>> >> >>>> yesterday (in the "cleanup old bugs" process that happens
>>> >> >>>> often before the big new spring release of R).
>>> >> >>>>
>>> >> >>>> So is your bug the same as that one?
>>> >> >>>>
>>> >> >>>> Martin
>>> >> >>>>
>>> >> >>>> > --
>>> >> >>>> > Saren Tasciyan
>>> >> >>>> > /PhD Student / Sixt Group/
>>> >> >>>> > Institute of Science and Technology Austria
>>> >> >>>> > Am Campus 1
>>> >> >>>> > 3400 Klosterneuburg, Austria
>>> >> >>>>
>>> >> >>>> > ______________________________________________
>>> >> >>>> > R-devel using r-project.org mailing list
>>> >> >>>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>> >> >>>>
>>> >>
>>> > x[DELETED ATTACHMENT external: reformulate.diff, plain text]