[R] Problems understanding use of regular expression (in gsub) for manipulating currency
David Winsemius
dwinsemius at comcast.net
Thu Nov 22 02:36:13 CET 2012
On Nov 21, 2012, at 1:41 PM, Mauricio Cornejo wrote:
> Hello,
>
> After reading help file, various threads on this board, and other online tutorials, I've attempted to use gsub (using Perl-like syntax) to change a currency string into something that can be converted to numeric type using only one regular expression. Can anybody point out my error? Note that
>
>
>> x <- "\"$ 1,200,300,400.50\""
>
> Tried the following in an attempt to arrive at "1200300400.50"
>
>> gsub("(^[\\D]*)(([\\d]*)[,])*([\\d]*[.]*[\\d]*)([\\D]*)", "\\3\\4", x, perl=TRUE)
> [1] "300400.50"
>
> Note that "\d" matches a digit character and "\D" matches a non-digit character.
> Results group "\2" was intentionally omitted from the replacement pattern as it would have included commas.
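A likely reason the attempt above returns "300400.50": in Perl-style regular expressions, a capture group that sits under a repetition operator keeps only the text from its last repetition. Group 3 is inside the repeated group 2, so it ends up holding only "300". The following is a minimal sketch of that behaviour on a simplified input (the shortened string and the bracketed replacement are illustrative, not from the original post):

```r
# The group (\\d+,) repeats three times ("1,", "200,", "300,"),
# but the backreference \\1 holds only the final repetition.
res <- sub("^(\\d+,)*(\\d+)$", "[\\1][\\2]", "1,200,300,400", perl = TRUE)
print(res)
# [1] "[300,][400]"
```

Because the earlier repetitions are discarded, a single backreference cannot reassemble all of the comma-separated digit groups.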
> gsub("[,\"]", "", gsub("^\\D*(\\d.*)", "\\1",x, perl=TRUE) )
[1] "1200300400.50"
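If a single regular expression is the goal, one alternative (not the approach shown above) is to delete every character that is not a digit or a period. This sketch assumes the string contains no other digits or periods and that negative amounts do not need to be handled:

```r
x <- "\"$ 1,200,300,400.50\""

# Remove everything except digits and the decimal point.
cleaned <- gsub("[^0-9.]", "", x)
print(cleaned)
# [1] "1200300400.50"

# The cleaned string can then be converted to numeric.
value <- as.numeric(cleaned)
```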
I have my doubts about the "\"..." construction. I suspect it stems from your not understanding the convention R uses when printing escape characters.
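To illustrate the point about escapes: print() displays a string with surrounding quotes and backslash-escapes any embedded double quotes, whereas cat() writes the raw characters. The sketch below assumes the intended data was the currency text without literal quote characters; the variable y is hypothetical and only there for comparison:

```r
x <- "\"$ 1,200,300,400.50\""   # contains two literal double-quote characters
y <- "$ 1,200,300,400.50"       # hypothetical: the same text without them

print(x)      # shows the embedded quotes as \" escapes
cat(x, "\n")  # shows the characters actually stored in the string

n_x <- nchar(x)  # 20: the 18 characters of y plus the two literal quotes
n_y <- nchar(y)  # 18
```

If the backslash-quote pairs were copied from printed output, the underlying string probably looks like y, and the inner gsub() alone would suffice after removing commas.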
--
David Winsemius, MD
Alameda, CA, USA