[R] R 'function' as "subroutine"

Wed Oct 3 19:22:51 CEST 2007

On 10/3/2007 7:56 AM, (Ted Harding) wrote:
> Hi Folks,
> 
> The question I'm asking, regarding the use of function
> definitions in the context described below, is whether
> there are subtle traps or obscure limitations I should
> watch out for. It is probably a rather naive question...
> 
> Quite often, when executing a lot of R commands interactively,
> one needs from time to time to repeat exactly a sequence of
> commands entered earlier. These commands would only refer
> to variables which have been created at the "top level" of
> the program and which exist at the time the sequence of
> commands is entered.
> 
> So it would be convenient to refer to such a sequence of
> commands as a "named block" -- just give its name, and
> they are executed.
> 
> In my experiments, wrapping the first occurrence of such
> a sequence in a function definition seems to work, e.g.
> the first time they are needed:
> 
> block1 <- function(){
>   # sequence of commands that you would have entered
>   # for execution at this point
> }
> block1()
> 
> This first call to block1() seems to work OK, in my tests,
> PROVIDED, of course,
> a) The variables it uses and assigns to exist already;
> b) all internal "<-" assignments are written "<<-".
> Then, of course, the next time that block is needed,
> you can call block1() again.
> 
> But can this usage of function definition give rise
> to problems? R scoping can be a bit tricky! And I
> think I am perhaps being naive ...

Doing this is a bad idea in the long run.  Once you have several of 
these little snippets defined, you'll forget the details of what each of 
them does, and then the "side effects" of writing global variables will 
come back and bite you really badly.

For example, you might have a block that does some calculations and then 
draws a plot.  Then you write another one for different calculations, 
and because time has passed, you have forgotten that they both modify 
the variable "a" (even though "a" in one block has nothing whatsoever to 
do with "a" in the other block).

At a third point in time, you decide to use block 2 to calculate, and 
block 1 to plot:  but you will be left with garbage in "a".  If you're 
lucky, you'll recognize this and redo block 2, but if the values are 
plausible, you might come to the wrong conclusion.
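
A minimal sketch of how that clash can arise. The bodies of block1 and block2 and the data in "x" are made up for illustration; only the pattern of writing to a shared global with "<<-" comes from the question above.

```r
# Two hypothetical "blocks" written at different times.
# Both assign to the top-level variable "a", for unrelated purposes.
a <- 0
x <- c(1, 2, 3, 4)

block1 <- function() {
  a <<- mean(x)      # in block1, "a" is the centre of the data
  invisible(a)
}

block2 <- function() {
  a <<- length(x)    # in block2, "a" is a count, unrelated to block1's "a"
  invisible(a)
}

block1()
block2()
a   # holds whatever the most recently run block stored, here block2's count
```

Anything that later reads "a" expecting block1's value silently gets block2's instead, and nothing in the call block2() hints that "a" was touched.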

So a very good rule is to avoid side effects when you can.

You'd be better off with block1 returning a list of the important 
results, and then it will only mess up other things if you assign that 
list in the wrong place.
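
As a rough sketch of that alternative (the argument "x" and the list element names "centre" and "spread" are assumptions chosen for illustration), the block takes its inputs as arguments and hands back its results in a named list:

```r
# Hypothetical rewrite of block1: inputs come in as arguments,
# results go out as a named list, and no global variable is modified.
block1 <- function(x) {
  a <- mean(x)                        # local "a", gone when the function returns
  list(centre = a, spread = sd(x))
}

a <- "something unrelated"            # a top-level "a" with its own meaning
res1 <- block1(c(1, 2, 3, 4))

res1$centre   # the mean computed inside block1
a             # unchanged by the call
```

The only thing the call changes at the top level is the variable you chose to assign the result to, here res1.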

It's definitely a good idea to group common operations into functions, 
but try to keep your functions self-contained.

Duncan Murdoch

> 
> (It is not intended that such blocks of code would include
> function definitions).
> 
> OR: Is there a more "kosher" way to do this kind of thing ... ?
> 
> With thanks,
> Ted.
> 
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 03-Oct-07                                       Time: 12:56:42
> ------------------------------ XFMail ------------------------------