[R] help with regexpr in gsub

Kimpel, Mark William mkimpel at iupui.edu
Thu Jan 18 01:26:32 CET 2007


I have a very long vector of character strings of the format
"GO:0008104.ISS" and need to strip off the dot and anything that follows
it. There are always 10 characters before the dot. The characters after
the dot, and how many of them there are, vary.

So, I would like to return strings in the format "GO:0008104". I could do
this with substr and loop over the entire vector, but I thought there might
be a more elegant (and faster) way to do this.
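
For reference, substr itself appears to be vectorized over its first
argument, so the loop may not be needed at all. A minimal sketch, assuming
the 10-characters-before-the-dot rule always holds (the vector name go.ids
and its contents are made up for illustration):

```r
# Hypothetical sample data in the format described above
go.ids <- c("GO:0008104.ISS", "GO:0005515.IPI", "GO:0006810.TAS")

# substr() works on the whole character vector at once, no explicit loop;
# keep characters 1 through 10 of every element
go.trimmed <- substr(go.ids, 1, 10)

print(go.trimmed)
```

This relies on the fixed prefix length and would silently give wrong
results if an identifier ever had more or fewer than 10 characters before
the dot.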

I have tried gsub using regular expressions without success. The code 

gsub(pattern= "\.*?" , replacement="", x=character.vector)

correctly locates the positions in the vector that contain the dot, but
replaces all of the strings with "". Obviously not what I want. Is there
a regular expression for replacement that would accomplish what I want?
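
A likely cause, as far as I can tell: inside an R string the backslash
itself has to be escaped, so a literal dot is written "\\." rather than
"\.". With the single backslash the pattern probably ends up as ".*?",
where the dot matches any character, which would explain every string
becoming "". A sketch of a pattern that should do what is wanted, matching
the first literal dot and everything after it (go.ids is a made-up example
vector, not my real data):

```r
# Hypothetical sample data; the last element has no dot on purpose
go.ids <- c("GO:0008104.ISS", "GO:0005515.IPI", "GO:0006810")

# "\\." is a literal dot, ".*" is everything after it, "$" anchors at the end.
# sub() is enough here because only one replacement per string is needed.
go.trimmed <- sub("\\..*$", "", go.ids)

print(go.trimmed)
```

Strings without a dot are left unchanged, and nothing here depends on there
being exactly 10 characters before the dot.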

Or, does R have a better way to do this?

Thanks,

Mark

Mark W. Kimpel MD 

 



