[R] Split String in regex while Keeping Delimiter
Eric Berger
er|cjberger @end|ng |rom gm@||@com
Wed Apr 12 19:32:53 CEST 2023
This seems to do the job but there are probably more elegant solutions:
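# Mark each split point with "@" (this assumes "@" never occurs in the data),
# split there, then trim the leading space left on each piece.
f <- function(s) { sub("^ ","",unlist(strsplit(gsub("\\+ ","+@ ",s),"@"))) }
g <- function(s) { sub("^ ","",unlist(strsplit(gsub("- ","-@ ",s),"@"))) }
# Apply both splits: first on "+", then on "-".
h <- function(s) { g(f(s)) }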
To try it out:
s <- "leucocyten + gramnegatieve staven +++ grampositieve staven ++"
t <- "leucocyten - grampositieve coccen +"
h(s)
h(t)
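
One possibly tidier alternative, offered only as a sketch: a single strsplit() call with a Perl-style lookbehind, splitting at whitespace that directly follows a "+" or "-". The helper name split_keep is made up for illustration and is not from the thread; the sketch assumes a delimiter is always followed by whitespace (or ends the string) and that no term contains a "+" or "-" followed by a space.

```r
# Hypothetical helper (not part of the original reply): split at whitespace
# that directly follows "+" or "-", so the delimiter stays attached to the
# term before it. The lookbehind (?<=[+-]) is zero-width, so only the
# whitespace itself is consumed by the split.
split_keep <- function(x) {
  strsplit(x, "(?<=[+-])\\s+", perl = TRUE)
}

s <- "leucocyten + gramnegatieve staven +++ grampositieve staven ++"
t <- "leucocyten - grampositieve coccen +"

split_keep(s)[[1]]
split_keep(t)[[1]]
```

Like strsplit() itself, split_keep() returns a list with one element per input string, so it can be applied to a whole character vector at once.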
HTH,
Eric
On Wed, Apr 12, 2023 at 7:56 PM Emily Bakker <emilybakker using outlook.com>
wrote:
> Hello List,
>
> I have a dataset consisting of strings that I want to split while saving
> the delimiter.
>
> Some example data:
> "leucocyten + gramnegatieve staven +++ grampositieve staven ++"
> "leucocyten - grampositieve coccen +"
>
> I want to split the strings such that I get the following result:
> c("leucocyten +", "gramnegatieve staven +++", "grampositieve staven ++")
> c("leucocyten -", "grampositieve coccen +")
>
> I have tried strsplit with a regular expression with a positive lookahead,
> but I am not able to achieve the results that I want.
>
> I have tried:
> as.list(strsplit(x, split = "(?=[\\+-]{1,3}\\s)+", perl=TRUE))
>
> Which results in:
> c("leucocyten ", "+", "gramnegatieve staven ", "+", "+", "+",
> "grampositieve staven ++")
> c("leucocyten ", "-", "grampositieve coccen +")
>
>
> Is there a function or regular expression that will make this possible?
>
> Kind regards,
> Emily