[R] regular expression question
Dirk Eddelbuettel
edd at debian.org
Sun Jun 11 23:58:36 CEST 2006
On 11 June 2006 at 14:35, Jeff Newmiller wrote:
| > > gsub("(\\d*)$","",c("AAL123", "XELB245", "A247", "FOO123BAR"), perl=TRUE)
| >
| > [1] "AAL" "XELB" "A" "FOO123BAR"
| >
| >
| > gsub finds what is described by the first argument, a regexp [ here
| > (\\d*)$ --- any sequence of digits before the end-of-line ], and replaces
| > it with the second argument [ here an empty string as we simply delete ]
| > in each element of the third argument.
| >
| > Note
| > - how the $ anchor prevents it from eating the non-final digits
| > in the counter example FOO123BAR
| > - how the \d for digits needs escaped backslashes \\d
| > - how the * char denotes '1 or more of the preceding thingie'
|
| * normally means "zero or more of the preceding thingie"
| + is the "1 or more of the preceding thingie"
|
| The difference would be apparent if the string being inserted was not
| empty.
|
| > gsub("(\\d*)$","new",c("AAL123", "XELB245", "A247", "FOO123BAR"), perl=TRUE)
| [1] "AALnew" "XELBnew" "Anew" "FOO123BARnew"
|
| > gsub("(\\d+)$","new",c("AAL123", "XELB245", "A247", "FOO123BAR"), perl=TRUE)
| [1] "AALnew" "XELBnew" "Anew" "FOO123BAR"
Thanks for catching, and correcting, that.
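
For the archive, here is a minimal sketch of the * versus + difference using the same test strings. The variable names (x, star_hits, plus_hits, star_sub, plus_sub) are only illustrative and do not appear earlier in the thread, and sub() is used instead of gsub() so that only the first match is replaced.

```r
# Test strings taken from the thread above.
x <- c("AAL123", "XELB245", "A247", "FOO123BAR")

# '*' allows zero digits, so "\\d*$" also matches the empty string at the
# very end of "FOO123BAR"; '+' requires at least one trailing digit.
star_hits <- grepl("\\d*$", x, perl = TRUE)
plus_hits <- grepl("\\d+$", x, perl = TRUE)

# sub() replaces only the first match, which keeps the comparison simple.
star_sub <- sub("\\d*$", "new", x, perl = TRUE)
plus_sub <- sub("\\d+$", "new", x, perl = TRUE)

# Side-by-side view of what each pattern matched and produced.
print(data.frame(x, star_hits, plus_hits, star_sub, plus_sub))
```

With "\\d*$" every string counts as a match, because the empty string at the end always qualifies; with "\\d+$" the counter example FOO123BAR is left alone.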
Dirk
--
Hell, there are no rules here - we're trying to accomplish something.
-- Thomas A. Edison