[R] Pipe operator

avi.e.gross using gmail.com
Wed Jan 4 01:44:46 CET 2023


Boris,

There are MANY variations possible, and yours does not seem that common or
useful, albeit perfectly valid.

I am not talking about making it a one-liner, albeit I find the multi-line
version more useful.

The pipeline concept seems sort of atomic in the following sense. R allows
several in-line variants of assignment besides something like:

    assign("string", value)

And, variations on the above that are more useful when making multiple
assignments in a loop or using other environments.

What is more common is:

    Name <- Expression

And of course occasionally:

    Expression -> Name
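
As a quick sketch of the three forms above (the names first, second, third and v1..v3 are made up for illustration), all of these bind a value in the current environment, and assign() shows its worth when names are built in a loop:

```r
# Three ways to bind the value 42 to a name (names are illustrative only).
assign("first", 42)   # function form, the name is given as a string
second <- 42          # the usual leftward arrow
42 -> third           # the rightward arrow

# All three names now hold identical values.
identical(first, second) && identical(second, third)

# assign() is handy when the names themselves are computed, as in a loop.
for (i in 1:3) assign(paste0("v", i), i * 10)
v2
```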

So back to pipelines, you have two perfectly valid ways to do a pipeline and
assign the result. I showed a version like:

    Name <-
        Variable |>
        Pipeline.item(...) |>
        ... |>
        Pipeline.item(...)


But you can equally well assign it at the end:

    Variable |>
    Pipeline.item(...) |>
    ... |>
    Pipeline.item(...) -> Name
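
A minimal sketch of the two placements, assuming R 4.1 or later for |> and using a made-up input vector; res_pre and res_post are illustrative names. Assignment binds less tightly than the pipe, so the trailing -> captures the whole pipeline:

```r
x <- c(0, pi / 3, pi)   # illustrative input, not from the thread

# Assignment written first.
res_pre <-
    x |>
    cos() |>
    max(pi / 4) |>
    round(3)

# Assignment written last; -> applies to the result of the entire pipeline.
x |>
    cos() |>
    max(pi / 4) |>
    round(3) -> res_post

identical(res_pre, res_post)
```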


I think a more valid use of assign is in mid-pipeline as one way to save an
intermediate result in a variable or perhaps in another environment, such as
may be useful when debugging:

    Name <-
        Variable |>
        Pipeline.item(...) |>
        assign("temp1", value = _) |>
        ... |>
        Pipeline.item(...)

This works because assign(), like print(), also returns its value argument
(invisibly), so the value can be passed along the pipeline while being
captured as a side effect. When done debugging, removing those lines leaves
the rest of the pipeline working seamlessly.
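
Here is a small sketch of that debugging trick, assuming R 4.2 or later (needed for the _ placeholder, which has to be passed as a named argument) and a made-up input vector:

```r
x <- c(0, pi / 3, pi)   # illustrative input

result <-
    x |>
    cos() |>
    assign("temp1", value = _) |>   # side effect: keep the cosines around
    max(pi / 4) |>
    round(3)

temp1    # the intermediate vector of cosines
result   # the final value of the pipeline
```

Deleting the assign() line leaves result unchanged; only temp1 is no longer created.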

BTW, your example does something I am not sure you intended:

  x |> cos() |> max(pi/4) |> round(3) |> assign("x", value = _)

I prefer showing it like this:

    x |>
        cos() |>
        max(pi/4) |>
        round(3) |>
        assign("x", value = _)

Did you notice you changed "x" by assigning a new value to the one you
started with? That is perfectly legal but may not have been intended.
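
To make that concrete, here is a sketch with a made-up three-element x, assuming R 4.2 or later for the _ placeholder. After the pipeline runs, the original vector is gone:

```r
x <- c(0, pi / 3, pi)   # illustrative input
length(x)               # three values going in

x |>
    cos() |>
    max(pi / 4) |>
    round(3) |>
    assign("x", value = _)

length(x)               # a single value now; the original x was overwritten
x
```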

And, yes, for completeness, there are two more assignment operators I
generally have no use for, <<- and ->>, which assign in an enclosing
environment (ending up in the global one if the name is not found on the way).

And for even more completeness you can also use the operators above like
this:

> z = `<-`("x", 7)
> z
[1] 7
> x
[1] 7

For even more completeness, the example we are using can use the above
notation with a silly twist. Placing the results in z instead, I find the
new pipe INSISTS _ can only be used with a named argument. Duh, `<-` does
not have named arguments, just positional. So I see any valid name is just
ignored and the following works!

x |> cos() |> max(pi/4) |> round(3) |> `<-`("z", any.identifier = _)

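And, frankly, many functions that need the pipe to feed a second or later
position can easily be changed to use the first argument. If you feel the
need to use "assign" make this function before using the pipeline. Note that
it must pass envir = parent.frame(), otherwise the variable is created inside
the function's own frame and vanishes when the function returns:

assignyx <- function(x, y) assign(y, x, envir = parent.frame())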

Then your code can save a variable without an underscore and keyword:

x |> cos() |> max(pi/4) |> round(3) |> assignyx("x")
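
A sketch of such a wrapper in use, assuming R 4.1 or later. The helper name assign_to and the target name "saved" are hypothetical. One detail matters: a wrapper around assign() has to point it at the caller's environment, because by default the variable would be created in the wrapper's own frame and disappear on return:

```r
# Hypothetical helper: value first (so the pipe can feed it), name second.
# envir = parent.frame() makes the assignment land in the caller's
# environment rather than in the helper's own short-lived frame.
assign_to <- function(value, name, envir = parent.frame()) {
    assign(name, value, envir = envir)
}

x <- c(0, pi / 3, pi)   # illustrative input
x |> cos() |> max(pi / 4) |> round(3) |> assign_to("saved")
saved
```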

Or use the new lambda function, somewhat designed for this use case, which I
find a bit ugly, but it is a matter of taste.
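
For what it is worth, a sketch of the lambda form, assuming R 4.1 or later for the \(v) shorthand and a made-up input. The anonymous function is wrapped in parentheses and called in place, so the piped value can land in any argument position (here the second argument of max):

```r
x <- c(0, pi / 3, pi)   # illustrative input

# The lambda receives the piped value as v and places it second in max().
# The trailing () is required: the right-hand side of |> must be a call.
x |>
    cos() |>
    (\(v) max(pi / 4, v))() |>
    round(3) -> z

z
```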

But to end this, there is no reason to make things complex in situations
like this. Just use a simple assignment pre or post as meets your needs.





-----Original Message-----
From: Boris Steipe <boris.steipe using utoronto.ca> 
Sent: Tuesday, January 3, 2023 2:01 PM
To: R-help Mailing List <r-help using r-project.org>
Cc: avi.e.gross using gmail.com
Subject: Re: [R] Pipe operator

Working off Avi's example - would:

  x |> cos() |> max(pi/4) |> round(3) |> assign("x", value = _)

...be even more intuitive to read? Or are there hidden problems with that?



Cheers,
Boris


> On 2023-01-03, at 12:40, avi.e.gross using gmail.com wrote:
> 
> John,
> 
> The topic has indeed been discussed here endlessly but new people 
> still stumble upon it.
> 
> Until recently, the formal R language did not have a built-in pipe 
> functionality. It was widely used through an assortment of packages 
> and there are quite a few variations on the theme including different 
> implementations.
> 
> Most existing code does use the operator %>% but there is now a
> built-in |> operator that is generally faster but is not as easy to use
> in a few cases.
> 
> Please forget the use of the word FILE here. Pipes are a form of 
> syntactic sugar that generally is about the FIRST argument to a 
> function. They are NOT meant to be used just for the trivial case you 
> mention where indeed there is an easy way to do things. Yes, they work 
> in such situations. But consider a deeply nested expression like this:
> 
> Result <- round(max(cos(x), 3.14159/4), 3)
> 
> There are MANY deeper nested expressions like this commonly used. The 
> above can be written linearly as in
> 
> Temp1 <- cos(x)
> Temp2 <- max(Temp1, 3.14159/4)
> Result <- round(Temp2, 3)
> 
> Translation, for some variable x, calculate the cosine and take the 
> maximum value of it as compared to pi/4 and round the result to three 
> decimal places. Not an uncommon kind of thing to do and sometimes you 
> can nest such things many layers deep and get hopelessly confused if 
> not done somewhat linearly.
> 
> What pipes allow is to write this closer to the second way while not 
> seeing or keeping any temporary variables around. The goal is to 
> replace the FIRST argument to a function with whatever resulted as the 
> value of the previous expression. That is often a vector or data.frame
> or list or any kind of object but can also be fairly complex, as in a
> list of lists of matrices.
> 
> So you can still start with cos(x) OR you can write this where the x 
> is removed from within and leaves cos() empty:
> 
> x %>% cos
> or
> x |> cos()
> 
> In the previous version of pipes the parentheses after cos() are
> optional if there are no additional arguments but the new pipe requires
> them.
> 
> So continuing the above, using multiple lines, the pipe looks like:
> 
> Result <-
>  x %>%
>  cos() %>%
>  max(3.14159/4) %>%
>  round(3)
> 
> This gives the same result but is arguably easier for some to read and
> follow. Nobody forces you to use it and for simple cases, most people
> don't.
> 
> There is a grouping of packages called the tidyverse that makes heavy,
> routine use of pipes, since most or all of its functions take the piped
> object as their first argument. It can be very handy to write code that
> says: read your data into a variable (often a data.frame or tibble) and
> PIPE IT to a function that renames some columns, PIPE the resulting
> modified object to a function that retains only selected rows, pipe
> that to a function that drops some of the columns, pipe that to a
> function that groups or sorts the items, and pipe that to a function
> that does a join with another object, generates a report, or any of so
> many other things.
> 
> So the real answer is that piping is another WAY of doing things from
> a programmer's perspective. Underneath it all, it is mostly syntactic
> sugar and the interpreter rearranges your code and performs the steps
> in what seems like a different order at times. Generally, you do not
> need to care.
> 
> 
> 
> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of Sorkin, John
> Sent: Tuesday, January 3, 2023 11:49 AM
> To: 'R-help Mailing List' <r-help using r-project.org>
> Subject: [R] Pipe operator
> 
> I am trying to understand the reason for existence of the pipe 
> operator, %>%, and when one should use it. It is my understanding that 
> the operator sends the file to the left of the operator to the 
> function immediately to the right of the operator:
> 
> c(1:10) %>% mean results in a value of 5.5 which is exactly the same
> as the result one obtains using the mean function directly, viz.
> mean(c(1:10)).
> What is the reason for having two syntactically different but
> semantically identical ways to call a function? Is one more efficient
> than the other?
> Does one use less memory than the other?
> 
> P.S. Please forgive what might seem to be a question with an obvious
> answer.
> I am a programmer dinosaur. I have been programming for more than 50
> years.
> When I started programming in the 1960s the only pipe one spoke about
> was a bong.
> 
> John
> 
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


--
Boris Steipe MD, PhD

Professor em.
Department of Biochemistry
Temerty Faculty of Medicine
University of Toronto


