[R] Output of tapply function as data frame: Problem Fixed
Jeff Newmiller
jdnewm|| @end|ng |rom dcn@d@v|@@c@@u@
Fri Mar 29 04:13:15 CET 2024
I would guess your version of R is earlier than 4.1.0, which is when the built-in pipe (|>) was introduced to the language.
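
If upgrading is not an option, here is a minimal sketch of a workaround. It assumes only base R and reuses Rui's test data; the name has_native_pipe is mine, for illustration only. It checks the running version and then builds the same data frame with a nested call instead of the pipe, which should parse on any version of R:

```r
# The native pipe |> exists only in R >= 4.1.0; check the running version.
# (has_native_pipe is an illustrative name, not part of the thread.)
has_native_pipe <- getRversion() >= "4.1.0"
print(R.version.string)
print(has_native_pipe)

# Test data, same construction as in Rui's example.
set.seed(2024)
data <- data.frame(
  Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"), 100L, TRUE),
  count = sample(10L, 100L, TRUE)
)

# Pipe-free form of
#   with(data, tapply(count, Date, mean)) |> as.data.frame()
# written as an ordinary nested call.
res <- as.data.frame(with(data, tapply(count, Date, mean)))

# Same follow-up steps as in Rui's code.
res$Date <- row.names(res)       # dates come from the row names
names(res)[2:1] <- names(data)   # column 1 becomes "count", column 2 "Date"
res <- res[2:1]                  # reorder to Date, count
res
```

Because the dates are derived from Sys.Date(), the printed values depend on the day the code is run.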
On March 28, 2024 6:43:05 PM PDT, Ogbos Okike <giftedlife2014 using gmail.com> wrote:
>Dear Rui,
>Thanks again for resolving this. I have already started using the version
>that works for me.
>
>But to clarify the second part, please let me paste what I did and the
>error message:
>
>> set.seed(2024)
>> data <- data.frame(
>+ Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"), 100L,
>+ TRUE),
>+ count = sample(10L, 100L, TRUE)
>+ )
>>
>> # coerce tapply's result to class "data.frame"
>> res <- with(data, tapply(count, Date, mean)) |> as.data.frame()
>Error: unexpected '>' in "res <- with(data, tapply(count, Date, mean)) |>"
>> # assign a dates column from the row names
>> res$Date <- row.names(res)
>Error in row.names(res) : object 'res' not found
>> # cosmetics
>> names(res)[2:1] <- names(data)
>Error in names(res)[2:1] <- names(data) : object 'res' not found
>> # note that the row names are still tapply's names vector
>> # and that the columns order is not Date/count. Both are fixed
>> # after the calculations.
>> res
>
>You can see that the error message is on the pipe. Please, let me know
>where I am missing it.
>Thanks.
>
>On Wed, Mar 27, 2024 at 10:45 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>
>> At 08:58 on 27/03/2024, Ogbos Okike wrote:
>> > Dear Rui,
>> > Nice to hear from you!
>> >
>> > I am sorry for the omission and I have taken note.
>> >
>> > Many thanks for responding. The second solution looks elegant as it
>> > quickly resolved the problem.
>> >
>> > Please, take a second look at the first solution. It refused to run. It
>> > looks as if the pipe is not properly positioned. Efforts to correct it and
>> > get it to run failed. If you can look further, it would be great. If time
>> > does not permit, I am fine too.
>> >
>> > But having the two solutions will certainly make the subject more
>> > interesting.
>> > Thank you so much.
>> > With warmest regards from
>> > Ogbos
>> >
>> > On Wed, Mar 27, 2024 at 8:44 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>> >
>> >> At 04:30 on 27/03/2024, Ogbos Okike wrote:
>> >>> Warm greetings to you all.
>> >>>
>> >>> Using the tapply function below:
>> >>> data<-read.table("FD1month",col.names = c("Dates","count"))
>> >>> x=data$count
>> >>> f<-factor(data$Dates)
>> >>> AB<- tapply(x,f,mean)
>> >>>
>> >>>
>> >>> I made a simple calculation. The result, stored in AB, is of the form
>> >>> below. But an effort to write AB to a file as a data frame fails. When I
>> >>> use write.table, it only produces the count column and strips off the
>> >>> first column (date).
>> >>>
>> >>> 2005-11-01 2005-12-01 2006-01-01 2006-02-01 2006-03-01 2006-04-01
>> >>> 2006-05-01
>> >>> -4.106887 -4.259154 -5.836090 -4.756757 -4.118011 -4.487942
>> >>> -4.430705
>> >>> 2006-06-01 2006-07-01 2006-08-01 2006-09-01 2006-10-01 2006-11-01
>> >>> 2006-12-01
>> >>> -3.856727 -6.067103 -6.418767 -4.383031 -3.985805 -4.768196
>> >>> -10.072579
>> >>> 2007-01-01 2007-02-01 2007-03-01 2007-04-01 2007-05-01 2007-06-01
>> >>> 2007-07-01
>> >>> -5.342338 -4.653128 -4.325094 -4.525373 -4.574783 -3.915600
>> >>> -4.127980
>> >>> 2007-08-01 2007-09-01 2007-10-01 2007-11-01 2007-12-01 2008-01-01
>> >>> 2008-02-01
>> >>> -3.952150 -4.033518 -4.532878 -4.522941 -4.485693 -3.922155
>> >>> -4.183578
>> >>> 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01
>> >>> 2008-09-01
>> >>> -4.336969 -3.813306 -4.296579 -4.575095 -4.036036 -4.727994
>> >>> -4.347428
>> >>> 2008-10-01 2008-11-01 2008-12-01
>> >>> -4.029918 -4.260326 -4.454224
>> >>>
>> >>> But the normal format I wish to display only appears on the terminal,
>> >>> leading me to copy it and paste into a text file. That is, when I enter AB
>> >>> on the terminal, it returns a format in the form:
>> >>>
>> >>> 2008-02-01 -4.183578
>> >>> 2008-03-01 -4.336969
>> >>> 2008-04-01 -3.813306
>> >>> 2008-05-01 -4.296579
>> >>> 2008-06-01 -4.575095
>> >>> 2008-07-01 -4.036036
>> >>> 2008-08-01 -4.727994
>> >>> 2008-09-01 -4.347428
>> >>> 2008-10-01 -4.029918
>> >>> 2008-11-01 -4.260326
>> >>> 2008-12-01 -4.454224
>> >>>
>> >>> Now, my question: How do I write out two columns displayed by AB on the
>> >>> terminal to a file?
>> >>>
>> >>> I have tried using AB<-data.frame(AB) but it doesn't work either.
>> >>>
>> >>> Many thanks for your time.
>> >>> Ogbos
>> >>>
>> >>> [[alternative HTML version deleted]]
>> >>>
>> >>> ______________________________________________
>> >>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide
>> >> http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>> >> Hello,
>> >>
>> >> The main trick is to pipe to as.data.frame. But the result will have one
>> >> column only, so you must assign the dates from the data frame's row names.
>> >> I also include an aggregate solution.
>> >>
>> >>
>> >>
>> >> # create a test data set
>> >> set.seed(2024)
>> >> data <- data.frame(
>> >> Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"), 100L,
>> >> TRUE),
>> >> count = sample(10L, 100L, TRUE)
>> >> )
>> >>
>> >> # coerce tapply's result to class "data.frame"
>> >> res <- with(data, tapply(count, Date, mean)) |> as.data.frame()
>> >> # assign a dates column from the row names
>> >> res$Date <- row.names(res)
>> >> # cosmetics
>> >> names(res)[2:1] <- names(data)
>> >> # note that the row names are still tapply's names vector
>> >> # and that the columns order is not Date/count. Both are fixed
>> >> # after the calculations.
>> >> res
>> >> #> count Date
>> >> #> 2024-03-22 5.416667 2024-03-22
>> >> #> 2024-03-23 5.500000 2024-03-23
>> >> #> 2024-03-24 6.000000 2024-03-24
>> >> #> 2024-03-25 4.476190 2024-03-25
>> >> #> 2024-03-26 6.538462 2024-03-26
>> >> #> 2024-03-27 5.200000 2024-03-27
>> >>
>> >> # fix the columns' order
>> >> res <- res[2:1]
>> >>
>> >>
>> >>
>> >> # better all in one instruction
>> >> aggregate(count ~ Date, data, mean)
>> >> #> Date count
>> >> #> 1 2024-03-22 5.416667
>> >> #> 2 2024-03-23 5.500000
>> >> #> 3 2024-03-24 6.000000
>> >> #> 4 2024-03-25 4.476190
>> >> #> 5 2024-03-26 6.538462
>> >> #> 6 2024-03-27 5.200000
>> >>
>> >>
>> >>
>> >> Also,
>> >> I'm glad to help as always but Ogbos, you have been an R-Help
>> >> contributor for quite a while, please post data in dput format. Given
>> >> the problem the output of the following is more than enough.
>> >>
>> >>
>> >> dput(head(data, 20L))
>> >>
>> >>
>> >> Hope this helps,
>> >>
>> >> Rui Barradas
>> >>
>> >>
>> >> --
>> >> This e-mail has been checked for viruses by AVG antivirus software.
>> >> www.avg.com
>> >>
>> >
>> Hello,
>>
>> This pipe?
>>
>>
>> with(data, tapply(count, Date, mean)) |> as.data.frame()
>>
>>
>> I am not seeing anything wrong with it. I have tried it again just now
>> and it runs with no problems, like it had before.
>> A solution is not to pipe, separate the instructions.
>>
>>
>> res <- with(data, tapply(count, Date, mean))
>> res <- as.data.frame(res)
>>
>>
>> But this should be equivalent to the pipe. I cannot think of a way to
>> have these separate instructions run but not the pipe.
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>>
>>
>
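
On the original question (writing both columns to a file), the final step is not shown anywhere in the thread, so here is a hedged sketch. It assumes the aggregate() result from Rui's example and writes to a temporary file; out_file and res2 are placeholder names of mine, and out_file should be replaced with a real path:

```r
# Test data, same construction as in Rui's example.
set.seed(2024)
data <- data.frame(
  Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"), 100L, TRUE),
  count = sample(10L, 100L, TRUE)
)

# One row per date, with the dates as a proper column.
res <- aggregate(count ~ Date, data, mean)

# out_file is a placeholder; a temporary file keeps the sketch self-contained.
out_file <- tempfile(fileext = ".txt")

# row.names = FALSE drops the 1, 2, 3, ... row labels;
# quote = FALSE writes the dates without surrounding quotes.
write.table(res, file = out_file, row.names = FALSE, quote = FALSE)

# Show what was written.
writeLines(readLines(out_file))

# Starting from a tapply() result instead: its names hold the dates,
# so both columns can be built explicitly before writing.
AB <- with(data, tapply(count, Date, mean))
res2 <- data.frame(Date = names(AB), count = as.vector(AB))
```

Either res or res2 can be passed to write.table in the same way; the only difference is that res2 stores the dates as character strings rather than as class "Date".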
--
Sent from my phone. Please excuse my brevity.