[R] Find String Between Characters
Sparks, John James
jspark4 at uic.edu
Sun May 15 04:14:14 CEST 2011
Hi Jim,
Thanks for your note.
Unfortunately, when I attempt your solution in my exact setting, I get a
weird and slightly different answer.
First, let me be clearer. What I am attempting to do is pull the CIK
number out of the content of the web page itself after it has been loaded
into R (this may not be optimal, but I am new at this), not out of the URL
string (as you have done).
So, when I execute the following as per your suggestion:
require(scrapeR)
mmm<-scrape(url="http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000320193&owner=exclude&count=40")
num <- sub("^.*CIK=([0-9]+).*", "\\1", mmm)
I get
[1] "<pointer: 0x00000000001265c0>"
Is this just a hex representation of the same number, or is something else
going on here?
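A tentative note on the pointer output: it does not look like a hex form of
the CIK. As far as I can tell, scrape() returns a list of parsed documents
rather than text, so sub() coerces that list to a string showing a memory
address, finds no match, and hands that string back unchanged. Below is a
minimal sketch of one way around this, assuming the page source is first
obtained as plain text (for instance with saveXML() from the XML package
applied to mmm[[1]], which I have not verified in this exact setting). The
vector html is a made-up stand-in for the real page, and page, m and cik are
names chosen only for illustration.

```r
# Hypothetical stand-in for the page source as plain text. With the real
# page one might try (unverified assumption):
#   page <- XML::saveXML(mmm[[1]])
html <- c(
  "<html><body>",
  "<a href=\"/cgi-bin/browse-edgar?action=getcompany&amp;CIK=0000320193&amp;owner=exclude\">Filings</a>",
  "</body></html>"
)

# Collapse the lines into a single string so one search covers the whole page.
page <- paste(html, collapse = "\n")

# Locate the first occurrence of "CIK=" followed by digits and pull it out.
m <- regmatches(page, regexpr("CIK=[0-9]+", page))

# Drop the literal "CIK=" prefix, leaving only the digits.
cik <- sub("CIK=", "", m, fixed = TRUE)
cik
```

If the page carries several CIK= links, gregexpr() in place of regexpr()
should return all of them as a list.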
Comments from any and all would be much appreciated.
--John J. Sparks, Ph.D.
On Sat, May 14, 2011 7:57 pm, jim holtman wrote:
> Is this what you want:
>
>> mmm<-"http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000320193&owner=exclude&count=40"
>> num <- sub("^.*CIK=([0-9]+).*", "\\1", mmm)
>> num
> [1] "0000320193"
>>
>
>
> On Sat, May 14, 2011 at 8:20 PM, Sparks, John James <jspark4 at uic.edu> wrote:
>> Dear R Helpers,
>>
>> I am trying to isolate a set of characters between two other characters
>> in a long string file. I tried some of the examples on the R help pages
>> and elsewhere, but I am not able to get it. Your help would be much
>> appreciated.
>>
>> require(scrapeR)
>> mmm<-scrape(url="http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000320193&owner=exclude&count=40")
>> str(mmm)
>>
>> I want to get the number 0000320193 that is between the CIK= and the &.
>> I have tried
>>
>> g <- grep( "CIK=|&", mmm )
>> and
>> temp<-grep(mmm,\CIK=\&)
>>
>> and variations on these themes, but they all either won't run or come
>> back as an empty object. How can I grab this number?
>>
>> Best wishes,
>> --John J. Sparks, Ph.D.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
>
>