[R] Parsing a Simple Chemical Formula
    Bryan Hanson 
    hanson at depauw.edu
       
    Mon Dec 27 00:29:52 CET 2010
    
    
  
Hello R Folks...
I've been looking around the 'net and I see many complex solutions in  
various languages to this question, but I have a pretty simple need  
(and I'm not much good at regex).  I want to use a chemical formula as  
a function argument.  The formula would be in "Hill order" which is to  
list C, then H, then all other elements in alphabetical order.  My  
example will have only a limited number of elements, few enough that  
one can search directly for each element.  So some examples would be  
C5H12, or C5H12O or C5H11BrO (note that O and Br have no following  
number here, meaning a count of 1 is implied).
Let's say
 > form <- "C5H11BrO"
I'd like to get the count of each element, so in this case I need to  
extract C and 5, H and 11, Br and 1, O and 1 (I want to calculate the  
molecular weight by multiplying each count by the atomic weight).  Sounds pretty simple, but my  
experiments with grep and strsplit don't immediately clue me into an  
obvious solution.  As I said, I don't need a general solution to the  
problem of calculating molecular weight from an arbitrary formula,  
that seems quite challenging, just a way to convert "form" into a list  
or data frame which I can then do the math on.
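To make the target concrete, here is a rough sketch of the kind of thing I
have in mind. It assumes only simple formulas (no parentheses, charges or
hydrates) and that every symbol is one uppercase letter optionally followed
by one lowercase letter. The name parse_formula and the atomic_weight lookup
are made up for illustration, and the weights are approximate values for just
the elements in my example.

```r
# Hypothetical helper: split a simple formula such as "C5H11BrO" into
# element symbols and counts. Not a general formula parser.
parse_formula <- function(form) {
  # Each token is an element symbol followed by an optional integer count
  tokens <- regmatches(form, gregexpr("[A-Z][a-z]?[0-9]*", form))[[1]]
  element <- sub("[0-9]+$", "", tokens)    # drop the trailing digits
  count <- sub("^[A-Za-z]+", "", tokens)   # drop the leading letters
  count[count == ""] <- "1"                # a missing number implies 1
  data.frame(element = element, count = as.integer(count),
             stringsAsFactors = FALSE)
}

form <- "C5H11BrO"
parsed <- parse_formula(form)

# Approximate atomic weights, limited to the elements in this example
atomic_weight <- c(C = 12.011, H = 1.008, Br = 79.904, O = 15.999)

# Multiply each count by its atomic weight and add up
mol_weight <- sum(parsed$count * atomic_weight[parsed$element])
```

With form as above, parsed should hold one row per element (C 5, H 11, Br 1,
O 1), which is the data frame I would then do the math on.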
Here's hoping this is a simple issue for more experienced R users!   
TIA,  Bryan
***********
Bryan Hanson
Professor of Chemistry & Biochemistry
    
    