[R] regular expression question
Romain Francois
romain.francois at dbmail.com
Tue Mar 3 10:18:27 CET 2009
Wacek Kusnierczyk wrote:
> markleeds at verizon.net wrote:
>
>> can someone show me how to use a regular expression to break the
>> string at the bottom up into its three components :
>>
>> (-0.791,-0.263]
>> (-38,-1.24]
>> (0.96,2.43]
>>
>> I tried to use strsplit because of my regexpitis ( it's not curable.
>> i've been to many doctors all over NYC. they tell me there's no cure
>> ) but it doesn't work because there are also dots inside the brackets.
>> Thanks.
>>
>> (-0.791,-0.263].(-38,-1.24].(0.96,2.43]
>>
>>
>
> here's one way to get a matrix of numeric values:
>
> text = "(-0.791,-0.263].(-38,-1.24].(0.96,2.43]"
> values = matrix(ncol=2, byrow=TRUE,
> as.numeric(
> grep(pattern='.', value=TRUE,
> x=strsplit(x=text, split=']\\.\\(|\\(|]|,')[[1]])))
>
> modify any of the steps according to your needs.
>
> vQ
>
Here is another way with the gsubfn package:
> require( gsubfn )
> strapply( text, "\\(.*?,.*?]", perl = TRUE )[[1]]
[1] "(-0.791,-0.263]" "(-38,-1.24]" "(0.96,2.43]"
Note that gregexpr would also help you here:
> g <- gregexpr( "\\(.*?,.*?]", text, perl = TRUE )[[1]]
> g
[1] 1 17 29
attr(,"match.length")
[1] 15 11 11
But (g)regexpr only returns the match positions and lengths, so you still
have to extract the matched substrings yourself, for example with substring:
> substring( text, g, g + attr(g, "match.length" ) - 1 )
[1] "(-0.791,-0.263]" "(-38,-1.24]" "(0.96,2.43]"
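For what it's worth, more recent versions of base R than the one current when this message was written provide regmatches(), which does that extraction step for you. The following is only a minimal sketch under that assumption; the names intervals, inner and bounds are illustrative, and the conversion to a numeric matrix is an optional extra in the spirit of the strsplit answer quoted above.

```r
text <- "(-0.791,-0.263].(-38,-1.24].(0.96,2.43]"

# Locate every "(...,...]" chunk; the lazy .*? stops at the first "," and "]"
m <- gregexpr("\\(.*?,.*?]", text, perl = TRUE)

# regmatches() combines the match data with the original string
# (assumes an R version that has regmatches() in base)
intervals <- regmatches(text, m)[[1]]

# Optional: drop the surrounding "(" and "]", then split each piece on the comma
inner  <- substring(intervals, 2, nchar(intervals) - 1)
bounds <- do.call(rbind, lapply(strsplit(inner, ",", fixed = TRUE), as.numeric))
```

Here intervals holds the three pieces as character strings, and bounds is a matrix with one row per interval (lower bound in the first column, upper bound in the second).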
Romain
--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr