[R] Is it odd or not? about I() function
Hiroto Miyoshi
h|roto-m|yo@h| @end|ng |rom e-m@||@jp
Mon Apr 21 11:55:40 CEST 2025
Dear R experts,
Thank you all for your help.
Now I know that ggplot is causing the issue and that
setting y = as.numeric(rslt_with_I) fixes the problem.
That should suffice for now, since I am not an R expert.
In addition, I learned this time that there is a class
called "AsIs", and that ggplot might not expect data of
class "AsIs". This is a bit of learning for me.
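
For readers following along, here is a minimal base-R sketch of what the "AsIs" class is and why as.numeric() works around it. The vector values and the names y_asis and y_plain are made up for illustration; they are not the data from the thread.

```r
# Illustrative values only; not the data from the thread.
x <- c(0.2, 0.5, 0.9)

y_asis  <- I(x)                 # I() adds the class "AsIs"; the values are unchanged
y_plain <- as.numeric(y_asis)   # as.numeric() drops attributes, including the class

class(y_asis)             # the class that I() attaches
class(y_plain)            # back to a plain numeric vector
inherits(y_asis, "AsIs")  # one way to test for the class
all(y_asis == y_plain)    # the numeric values themselves are the same
```

Only the class attribute differs between the two vectors, which is consistent with as.numeric() being enough to make the plot behave.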
Thank you, again. Your responses helped me a lot.
Sincerely,
Hiroto
On 2025/04/21 6:24, peter dalgaard wrote:
> What Ben says. Also a bit of pragmatic advice:
>
> If you want people to help you and run your scripts, don't precede each line with 9 characters that they'll have to remove to make it runnable. (Emacs users can do "C-x r k", but there aren't that many Emacs users around these days.)
>
> Also, the script is overly complicated and expects the user to (install and) load a bunch of stuff, where the effect that you are talking about is just as clearly visible in simpler code like this.
>
> library(ggplot2)
> x <- 1:10; y <- rnorm(10) ; dta <- data.frame(x,y)
> ggplot(dta, aes(x = x, y = y )) + geom_line()
> ggplot(dta, aes(x = x, y = I(y) )) + geom_line()
> ggplot(dta, aes(x = x, y = as.numeric(y)))+geom_line()
>
> However, like Ben, I'm not quite up to drilling into the ggplot code to see where things go wrong.
>
> Apparently, you can leave out the geom_line() bit and still get the odd y scale, so the issue is inside ggplot() itself. Perhaps you could do something like debug(ggplot2:::ggplot_build.ggplot), single-step, and see if you can spot where and how the y scale is being set up.
>
> -pd
>
>
>> On 19 Apr 2025, at 23.15, Ben Bolker <bbolker using gmail.com> wrote:
>>
>> This is obviously not a complete answer, but if you look at the data closely:
>>
>> str(dta)
>> 'data.frame': 40 obs. of 6 variables:
>> $ x : num 0.915 0.937 0.286 0.83 0.642 ...
>> $ y : num 0.3796 0.4358 0.0374 0.9735 0.4318 ...
>> $ z : int 1 1 1 1 1 1 1 1 1 1 ...
>> $ x_axis : int 1 2 3 4 5 6 7 8 9 10 ...
>> $ rslt_without_I: num 0.7657 0.8474 0.0516 1.1533 0.483 ...
>> $ rslt_with_I : 'AsIs' num 0.765658.... 0.847406.... 0.051648.... 1.153295.... 0.482993.... ...
>>
>> you'll see that the two variables have different *classes*. Your '==' test checks to see if the *numeric values* of the elements are the same.
>>
>> Both of these, which check the characteristics of the vector itself as well as the values of the elements, indicate that these vectors are indeed different.
>>
>> identical(dta$rslt_with_I, dta$rslt_without_I)
>> all.equal(dta$rslt_with_I, dta$rslt_without_I)
>>
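
The difference between the element-wise '==' test and the two whole-object checks can be reproduced in base R without any of the packages in the original script. This is a small sketch with made-up values; the names with_I and without_I are illustrative stand-ins for the two columns.

```r
# Made-up values standing in for the two columns discussed above.
x <- c(0.25, 0.5, 0.75)
with_I    <- I(x)   # same numbers, class "AsIs"
without_I <- x      # same numbers, plain numeric

# '==' compares values element by element and ignores the class.
elementwise_equal <- all(with_I == without_I)

# identical() and all.equal() also compare attributes such as the class.
same_object  <- identical(with_I, without_I)
nearly_equal <- isTRUE(all.equal(with_I, without_I))

elementwise_equal
same_object
nearly_equal
```

Wrapping all.equal() in isTRUE() is the usual idiom, because all.equal() returns a character description of the differences rather than FALSE when the objects differ.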
>> In order to figure out *why* having class "AsIs" rather than class "numeric" makes the axis/breaks computation fail, you'd have to dig into the machinery (or, ask on the ggplot issues list -- the questions there involving "AsIs" mostly refer to a separate use case for "AsIs" ... https://github.com/tidyverse/ggplot2/issues?q=is%3Aissue%20AsIs )
>>
>>
>> On 2025-04-18 9:46 p.m., Hiroto Miyoshi wrote:
>>> Dear R experts,
>>> I encountered a bewildering situation, about which
>>> I am seeking help. I wrote the following toy script,
>>> which recreates the situation.
>>> --- the script begins here ---
>>> 1 │ library(tidyverse)
>>> 2 │ library(rlist)
>>> 3 │ library(patchwork)
>>> 4 │ set.seed(42)
>>> 5 │ f <- function(x, y, z, x_axis) {
>>> 6 │ rslt_with_I <- I(x^2 * 0.5) + I(x * y)
>>> 7 │ rslt_without_I <- (x^2 * 0.5) + (x * y)
>>> 8 │ out <- data.frame(rslt_without_I, rslt_with_I)
>>> 9 │ return(out)
>>> 10 │ }
>>> 11 │
>>> 12 │ df <- data.frame(
>>> 13 │ x = runif(40, 0, 1),
>>> 14 │ y = runif(40, 0, 1),
>>> 15 │ z = rep(1:4, rep(10, 4)),
>>> 16 │ x_axis = rep(1:10, 4)
>>> 17 │ )
>>> 18 │
>>> 19 │ dta <- pmap(df, f) %>%
>>> 20 │ list.stack(.) %>%
>>> 21 │ cbind(df, .)
>>> 22 │
>>> 23 │ g1 <- ggplot(dta, aes(x = x_axis, y = rslt_with_I, color = factor(z))) +
>>> 24 │ geom_point() +
>>> 25 │ geom_line()
>>> 26 │ g2 <- ggplot(dta, aes(x = x_axis, y = rslt_without_I, color = factor(z))) +
>>> 27 │ geom_point() +
>>> 28 │ geom_line()
>>> 29 │
>>> 30 │ g <- g1 | g2
>>> 31 │ plot(g)
>>> 32 │
>>> 33 │ dta$rslt_with_I == dta$rslt_without_I
>>> 34 │ # the end of the script
>>> The two graphs, g1 and g2, are different, and the data obviously do not
>>> fit in the plot area for g1. Yet the command "dta$rslt_with_I ==
>>> dta$rslt_without_I" shows that the plotted data are identical.
>>> I want to know why this happens.
>>> Sincerely
>>> Hiroto
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> --
>> Dr. Benjamin Bolker
>> Professor, Mathematics & Statistics and Biology, McMaster University
>> Director, School of Computational Science and Engineering
>>> E-mail is sent at my convenience; I don't expect replies outside of working hours.