[Rd] [External] Time to drop globalenv() from searches in package code?
Henrik Bengtsson
henr|k@bengt@@on @end|ng |rom gm@||@com
Wed Oct 5 21:55:55 CEST 2022
Excluding the global environment, and all its parent environments from
the search path that a package sees would be great news, because it
would makes things more robust and probably detect a few more bugs out
there. In addition to the use case that Duncan mentions, it would
also remove the ambiguity that comes from searching attached packages
for global/free variables.
Speaking of the latter, isn't it time to escalate the following to a WARNING?
* checking R code for possible problems ... NOTE
my_fcn: no visible global function definition for ‘var’
Undefined global functions or variables:
var
Consider adding
importFrom("stats", "var")
to your NAMESPACE file.
There are several package on Bioconductor with such mistakes, and I
can imagine there are still some old CRAN package too that haven't
gone from "recent" CRAN incoming checks. The problem here is that the
intended `var` can be overridden in the global environment, in a
third-party attached package, or in any other environment on the
search() path.
Regarding:
> if (require("foo"))
> bar(...)
>
> with bar exported from foo. I don't know if that is already warned
> about. Moving away from this is arguably good in principle but also
> probably fairly disruptive.
This should be coved by 'R CMD check', also without --as-cran; if we use:
hello <- function() {
msg <- "Hello world!"
if (require("tools")) msg <- toTitleCase(msg)
message(msg)
}
we get:
* checking dependencies in R code ... NOTE
'library' or 'require' call to ‘tools’ in package code.
Please use :: or requireNamespace() instead.
See section 'Suggested packages' in the 'Writing R Extensions' manual.…
...
* checking R code for possible problems ... NOTE
hello: no visible global function definition for ‘toTitleCase’
Undefined global functions or variables:
toTitleCase
This should be enough for a package developer to figure out that the
function should be implemented as:
hello <- function() {
msg <- "Hello world!"
if (requireNamespace("tools")) msg <- tools::toTitleCase(msg)
message(msg)
}
which passes 'R CMD check' with all OK.
BTW, is there a reason why one cannot import from the 'base' package?
If we add, say, importFrom(base, mean) to a package's NAMESPACE file,
the package builds fine, but fails to install;
$ R --vanilla CMD INSTALL teeny_0.1.0.tar.gz
* installing to library ‘/home/hb/R/x86_64-pc-linux-gnu-library/4.2-CBI-gcc9’
* installing *source* package ‘teeny’ ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
Error in asNamespace(ns, base.OK = FALSE) :
operation not allowed on base namespace
Calls: <Anonymous> ... namespaceImportFrom -> importIntoEnv ->
getNamespaceInfo -> asNamespace
Execution halted
ERROR: lazy loading failed for package ‘teeny’
Is this by design (e.g. more flexibility in maintaining the base-R
packages), or due to a technical limitation (e.g. the 'base' package
is not fully loaded at this point)?
/Henrik
On Sat, Sep 17, 2022 at 1:24 PM <luke-tierney using uiowa.edu> wrote:
>
> On Sat, 17 Sep 2022, Kurt Hornik wrote:
>
> >>>>>> luke-tierney writes:
> >
> >> On Thu, 15 Sep 2022, Duncan Murdoch wrote:
> >>> The author of this Stackoverflow question
> >>> https://stackoverflow.com/q/73722496/2554330 got confused because a typo in
> >>> his code didn't trigger an error in normal circumstances, but it did when he
> >>> ran his code in pkgdown.
> >>>
> >>> The typo was to use "x" in a test, when the local variable was named ".x".
> >>> There was no "x" defined locally or in the package or its imports, so the
> >>> search got all the way to the global environment and found one. (The very
> >>> confusing part for this user was that it found the right variable.)
> >>>
> >>> This author had suppressed the "R CMD check" check for use of global
> >>> variables. Obviously he shouldn't have done that, but he's working with
> >>> tidyverse NSE, and that causes so many false positives that it is somewhat
> >>> understandable he would suppress one too many.
> >>>
> >>> The pkgdown simulation of code in examples doesn't do perfect mimicry of
> >>> running it at top level; the fake global environment never makes it onto the
> >>> search list. Some might call this a bug, but I'd call it the right search
> >>> strategy.
> >>>
> >>> My suggestion is that the search for variables in package code should never
> >>> get to globalenv(). The chain of environments should stop after handling the
> >>> imports. (Probably base package functions should also be implicitly
> >>> imported, but nothing else.)
> >>>
> >
> >> This was considered and discussed when I added namespaces. Basically
> >> it would mean making the parent of the base namespace environment be
> >> the empty environment instead of the global environment. As a design
> >> this is cleaner, and it would be a one-line change in eval.c. But
> >> there were technical reasons this was not a viable option at the time,
> >> also a few political reasons. The technical reasons mostly had to do
> >> with S3 dispatch.
> >
> >> Changes over the years, mostly from work Kurt has done, to S3 dispatch
> >> for methods defined and registered in packages might make this more
> >> viable in principle, but there would still be a lot of existing code
> >> that would stop working. For example, 'make check' with the one-line
> >> change fails in a base example that defines an S3 method. It might be
> >> possible to fiddle with the dispatch to keep most of that code
> >> working, but I suspect that would be a lot of work. Seeing what it
> >> would take to get 'make check' to succeed would be a first step if
> >> anyone wants to take a crack at it.
> >
> > Luke,
> >
> > Can you please share the one-line change so that I can take a closer
> > look?
>
> Index: src/main/envir.c
> ===================================================================
> --- src/main/envir.c (revision 82861)
> +++ src/main/envir.c (working copy)
> @@ -683,7 +683,7 @@
> R_GlobalCachePreserve = CONS(R_GlobalCache, R_NilValue);
> R_PreserveObject(R_GlobalCachePreserve);
> #endif
> - R_BaseNamespace = NewEnvironment(R_NilValue, R_NilValue, R_GlobalEnv);
> + R_BaseNamespace = NewEnvironment(R_NilValue, R_NilValue, R_EmptyEnv);
> R_PreserveObject(R_BaseNamespace);
> SET_SYMVALUE(install(".BaseNamespaceEnv"), R_BaseNamespace);
> R_BaseNamespaceName = ScalarString(mkChar("base"));
>
> -----
>
> For S3 the dispatch will have to be changed to explicitly search
> .GlobalEnv and parents after the namespace if we don't want to break
> too much.
>
> Another idiom that will be broken is
>
> if (require("foo"))
> bar(...)
>
> with bar exported from foo. I don't know if that is already warned
> about. Moving away from this is arguably good in principle but also
> probably fairly disruptive. We might need to add some cleaner
> use-if-available mechanism, or maybe just adjust some checking code.
>
> Best,
>
> luke
>
> >
> > Best
> > -k
> >
> >>> I suspect this change would reveal errors in lots of packages, but the number
> >>> of legitimate uses of the current search strategy has got to be pretty small
> >>> nowadays, since we've been getting warnings for years about implicit imports
> >>> from other standard packages.
> >
> >> Your definition of 'legitimate' is probably quite similar to mine, but
> >> there is likely to be a small but vocal minority with very different
> >> views :-).
> >
> >> Best,
> >
> >> luke
> >
> >>> Duncan Murdoch
> >>>
> >>> ______________________________________________
> >>> R-devel using r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>>
> >
> >> --
> >> Luke Tierney
> >> Ralph E. Wareham Professor of Mathematical Sciences
> >> University of Iowa Phone: 319-335-3386
> >> Department of Statistics and Fax: 319-335-3017
> >> Actuarial Science
> >> 241 Schaeffer Hall email: luke-tierney using uiowa.edu
> >> Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu
> >
> >> ______________________________________________
> >> R-devel using r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa Phone: 319-335-3386
> Department of Statistics and Fax: 319-335-3017
> Actuarial Science
> 241 Schaeffer Hall email: luke-tierney using uiowa.edu
> Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
More information about the R-devel
mailing list