[R] Splitting a character variable into a numeric one and a character one?

Barry Rowlingson B.Rowlingson at lancaster.ac.uk
Mon Sep 25 19:07:13 CEST 2006



>>>Now I want to do an operation that can split it into two variables:
>>>
>>>Column 1        Column 2         Column 3
>>>
>>>"123abc"         123                  "abc"
>>>"12cd34"         12                    "cd34"
>>>"1e23"             1                      "e23"
>>>...
>>>
>>>So basically, I want to split the original variable into a numeric one and a
>>>character one, while the splitting element is the first character in Column


My first thought on this was to apply the regexp "^([0-9]*)(.*)$" and 
get the two parts out. But I don't see a way to get both parenthesised 
matches out in one go.

In Python you just do:

  >>> re.findall('^([0-9]*)(.*)$',"123abc")
  [('123', 'abc')]

  >>> re.findall('^([0-9]*)(.*)$',"1e12")
  [('1', 'e12')]
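
As a rough sketch of how the Python route could be carried through to the numeric/character split asked for above: the helper name split_leading_number is made up for illustration, and returning None when there are no leading digits is an assumption, not something the original question specifies.

```python
import re

# Leading run of digits (possibly empty), then everything else.
# re.DOTALL is an addition to the regexp above so that strings containing
# newlines still match; for single-line input it changes nothing.
PATTERN = re.compile(r"^([0-9]*)(.*)$", re.DOTALL)

def split_leading_number(s):
    """Return (number, rest); number is None if s has no leading digits."""
    digits, rest = PATTERN.match(s).groups()
    return (int(digits) if digits else None, rest)

for s in ["123abc", "12cd34", "1e23"]:
    print(s, split_leading_number(s))
```

Using match() with groups() gives both parts of a single string directly, without the list wrapper that findall() returns.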

In R you can refer to the groups in the replacement and run gsub on them:

  > r="^([0-9]*)(.*)$"
  > gsub(r,"\\1","123abc")
  [1] "123"

But I don't see a way of getting the two values out except as part of 
one string in gsub - which is right back where you started - or doing 
gsub twice, once per group.
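
To make the "one string" idea concrete, here is a sketch in Python, with re.sub standing in for R's gsub. It writes both groups into one string around a separator and then splits on it. The separator "\x1f" and the name split_via_sub are assumptions for illustration; the trick only works if the separator never occurs in the data and the input is a single line.

```python
import re

PATTERN = r"^([0-9]*)(.*)$"
SEP = "\x1f"  # assumed never to appear in the input strings

def split_via_sub(s):
    """Substitute both groups into one string, then split it apart again."""
    joined = re.sub(PATTERN, r"\1" + SEP + r"\2", s, count=1)
    digits, rest = joined.split(SEP, 1)
    return digits, rest

print(split_via_sub("123abc"))
```

Both parts come back as strings here, so the numeric part would still need converting afterwards.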

Barry


