[R] Extract

Val <valkremk using gmail.com>
Sun Jul 21 18:36:03 CEST 2024


Thank you, Bert!
However, the last line of the script,

dat |> names() |> _[4:8] <- paste0("s", 1:5)

gives me the error shown below:
Error: pipe placeholder can only be used as a named argument
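
For context, and as an assumption that nothing in this thread confirms: this
message looks like what R 4.2.x reports when `_` is used anywhere other than
as a named argument. Using the placeholder as the head of an extraction such
as `_[4:8]` appears to have arrived only in R 4.3.0, so `R.version.string` is
worth checking. A minimal sketch that avoids the placeholder entirely and
should also run on older R versions follows; `parts` and `padded` are
illustrative names, and the stray space in the sample header is tidied:

```r
# Sample data from the original post.
dat <- read.csv(text = "Year,Sex,string
2002,F,15 xc Ab
2003,F,14
2004,M,18 xb 25 35 21
2005,M,13 25
2006,M,14 ac 256 AV 35
2007,F,11", header = TRUE)

# Split each string on spaces, then pad every piece to length 5 with NA.
# as.character() guards against the column having been read as a factor.
parts <- strsplit(as.character(dat$string), " ")
padded <- t(sapply(parts, function(x) c(x, rep(NA, 5 - length(x)))))

# Bind the five new columns on, then rename them with ordinary
# replacement syntax instead of the pipe placeholder.
dat <- cbind(dat, padded)
names(dat)[4:8] <- paste0("s", 1:5)
```

The last line does the same job as the piped assignment, just without `|>`.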

Thank you!

On Sat, Jul 20, 2024 at 7:41 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>
> Val:
> I wanted to add here a base R solution to your problem that I realize
> you can happily ignore. However, in the course of puzzling over how to
> do it using the R native pipe syntax ("|>"), I learned some new stuff
> that I thought others might find useful, and it seemed sensible to
> keep the code with this thread for comparison.
>
> I want to acknowledge that in the course of my labor, I posted a
> query to R-Help to which Iris Simmons posted a very clever answer that
> I would never have figured out myself and that is used below at the
> end to change a subset of the names of the modified data frame via a
> pipe.
>
> Here's the whole solution starting from your (excellent!) example dat:
>
>    dat <- dat$string |>
>       strsplit(" ") |>                      ## split on spaces
>       sapply(FUN = \(x) c(x, rep(NA, 5 - length(x)))) |>  ## pad to 5
>       t() |>                                ## one row per record
>       cbind(dat, ..2 = _)  ## "_" must be passed as a named argument
>
>    ## And Iris's trick for changing a subset of attributes, i.e. the
>    ## "names", in a pipe
>    dat |> names() |> _[4:8] <- paste0("s", 1:5)
>
> ## and here's the result:
> > dat
>   Year Sex          string s1   s2   s3   s4   s5
> 1 2002   F        15 xc Ab 15   xc   Ab <NA> <NA>
> 2 2003   F              14 14 <NA> <NA> <NA> <NA>
> 3 2004   M  18 xb 25 35 21 18   xb   25   35   21
> 4 2005   M           13 25 13   25 <NA> <NA> <NA>
> 5 2006   M 14 ac 256 AV 35 14   ac  256   AV   35
> 6 2007   F              11 11 <NA> <NA> <NA> <NA>
>
> As I noted previously, all columns beyond Sex are character.
>
> Cheers,
> Bert
>
>
> On Fri, Jul 19, 2024 at 12:26 PM Val <valkremk using gmail.com> wrote:
> >
> > Thank you Jeff and Bert for your help!
> > The components of the string could be mixed (i.e., numeric, character
> > or date). Once the string is split, it will be easy for me to format
> > them accordingly.
> >
> > On Fri, Jul 19, 2024 at 2:10 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
> > >
> > > I did not look closely at the solutions that you were offered, but
> > > note that you did not specify in your post whether the numbers in your
> > > string were to be character or numeric variables after they are broken
> > > out into their own columns. I believe that they are character in the
> > > solutions, but you should check this. If you want them as numeric,
> > > e.g., for further processing, you will need to convert them. Or
> > > vice-versa.
> > >
> > > Bert
> > >
> > >
> > > On Fri, Jul 19, 2024 at 9:52 AM Val <valkremk using gmail.com> wrote:
> > > >
> > > > Hi All,
> > > >
> > > > I want to extract new variables from a string and add them to the
> > > > data frame. The sample data is a CSV file.
> > > >
> > > > dat<-read.csv(text="Year, Sex,string
> > > > 2002,F,15 xc Ab
> > > > 2003,F,14
> > > > 2004,M,18 xb 25 35 21
> > > > 2005,M,13 25
> > > > 2006,M,14 ac 256 AV 35
> > > > 2007,F,11",header=TRUE)
> > > >
> > > > The string column has a maximum of five variables. Some rows have all
> > > > five and others have fewer. Where a variable is missing, fill it
> > > > with NA.
> > > > The desired result is shown below:
> > > >
> > > >
> > > > Year,Sex,string,S1,S2,S3,S4,S5
> > > > 2002,F,15 xc Ab,15,xc,Ab,NA,NA
> > > > 2003,F,14,14,NA,NA,NA,NA
> > > > 2004,M,18 xb 25 35 21,18,xb,25,35,21
> > > > 2005,M,13 25,13,25,NA,NA,NA
> > > > 2006,M,14 ac 256 AV 35,14,ac,256,AV,35
> > > > 2007,F,11,11,NA,NA,NA,NA
> > > >
> > > > Any help?
> > > > Thank you in advance.
> > > >
> > > > ______________________________________________
> > > > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
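
On the point raised above that the split-out columns come back as character
and would need converting: a minimal sketch of one way to do that afterwards,
assuming it is acceptable to let `type.convert()` guess a type for each
column. The names `parts`, `padded` and `new_cols` are illustrative only, and
dates would still need explicit handling (for example with `as.Date()`).

```r
# Sample data from the original post (header spacing tidied).
dat <- read.csv(text = "Year,Sex,string
2002,F,15 xc Ab
2003,F,14
2004,M,18 xb 25 35 21
2005,M,13 25
2006,M,14 ac 256 AV 35
2007,F,11", header = TRUE)

# Split on spaces and pad each piece to length 5 with NA.
parts <- strsplit(as.character(dat$string), " ")
padded <- t(sapply(parts, function(x) c(x, rep(NA, 5 - length(x)))))
colnames(padded) <- paste0("s", 1:5)

# Turn the character matrix into a data frame of character columns,
# then let type.convert() pick a type per column. as.is = TRUE keeps
# columns that are not purely numeric as character instead of factor.
new_cols <- as.data.frame(padded, stringsAsFactors = FALSE)
new_cols <- type.convert(new_cols, as.is = TRUE)

dat <- cbind(dat, new_cols)
str(dat)
```

With this sample only s1 and s5 contain nothing but numbers, so only those
two become integer; s2, s3 and s4 mix letters and digits and stay character.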


