[R] About Creating a List by Parsing Text
Jim Holtman
jholtman at gmail.com
Tue Aug 5 13:34:24 CEST 2008
You need to read in the line with read.table to parse the string. The
solution assumes a data frame, not a character vector.
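
A minimal sketch of what that could look like when applied to the
character vector itself. It assumes comp.ll has the layout printed in
the original post (leading tab, trailing space); only three of the nine
lines are repeated here to keep it short. The last step, wrapping each
vector in a list element named ll.list, is a guess at the structure
asked for, and the name "wanted" is made up for illustration.

```r
# Assumed input: a subset of the comp.ll vector shown in the original post.
comp.ll <- c(
  "\tGene 11340 211952_at RANBP5 k= 1 LL= -970.692 ",
  "\tGene 11340 211952_at RANBP5 k= 2 LL= -965.35 ",
  "\tGene 12682 213301_x_at TRIM24 k= 1 LL= -948.527 "
)

# Let read.table split every element on whitespace into columns V1..V8.
# Leading tabs and trailing spaces are ignored with the default sep = "".
con <- textConnection(comp.ll)
x <- read.table(con, stringsAsFactors = FALSE)
close(con)

# Group the rows by probe ID (column V3) and extract the LL values (column 8).
y <- lapply(split(x, x$V3), "[[", 8)

# Hypothetical extra step: nest each vector under the name 'll.list'.
wanted <- lapply(y, function(v) list(ll.list = v))
str(wanted)
```

The call split(p, p[3]) on a plain string or on the output of scan has
no column V3 to group on, which is why the rows have to go through
read.table first.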
On Aug 5, 2008, at 7:16, "Henrique Dallazuanna" <wwwhsd at gmail.com>
wrote:
> I think that you need:
>
> p <- scan(textConnection(p), what = "")
>
> On Tue, Aug 5, 2008 at 7:41 AM, Gundala Viswanath
> <gundalav at gmail.com> wrote:
>> Thanks Jim,
>>
>> But how can I modify this line of yours
>>
>> y <- lapply(split(x, x$V3), "[[", 8)
>>
>> to suit my 'comp.ll'
>>
>> I tried this, but it fails:
>>> p <- "\tGene 11340 211952_at RANBP5 k= 1 LL= -970.692 "
>>> y <- lapply(split(p, p[3]), "[[", 8)
>>> y
>>> list()
>>
>>
>> - Gundala Viswanath
>> Jakarta - Indonesia
>>
>>
>>
>> On Tue, Aug 5, 2008 at 7:14 PM, jim holtman <jholtman at gmail.com>
>> wrote:
>>> Does this get you close to what you want:
>>>
>>>> x <- read.table(textConnection("Gene 11340 211952_at RANBP5 k= 1 LL= -970.692
>>> + Gene 11340 211952_at RANBP5 k= 2 LL= -965.35
>>> + Gene 11340 211952_at RANBP5 k= 3 LL= -963.669
>>> + Gene 12682 213301_x_at TRIM24 k= 1 LL= -948.527
>>> + Gene 12682 213301_x_at TRIM24 k= 2 LL= -947.275
>>> + Gene 12682 213301_x_at TRIM24 k= 3 LL= -947.379
>>> + Gene 13764 214385_s_at AI521646 k= 1 LL= -827.86
>>> + Gene 13764 214385_s_at AI521646 k= 2 LL= -777.756
>>> + Gene 13764 214385_s_at AI521646 k= 3 LL= -812.083 "))
>>>> y <- lapply(split(x, x$V3), "[[", 8)
>>>>
>>>> y
>>> $`211952_at`
>>> [1] -970.692 -965.350 -963.669
>>>
>>> $`213301_x_at`
>>> [1] -948.527 -947.275 -947.379
>>>
>>> $`214385_s_at`
>>> [1] -827.860 -777.756 -812.083
>>>
>>>
>>>
>>> On Tue, Aug 5, 2008 at 3:09 AM, Gundala Viswanath <gundalav at gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> I have the following data, which I want to parse and
>>>> store in a list:
>>>>
>>>> __DATA__
>>>>> print(comp.ll)
>>>> [1] "\tGene 11340 211952_at RANBP5 k= 1 LL= -970.692 "
>>>> [2] "\tGene 11340 211952_at RANBP5 k= 2 LL= -965.35 "
>>>> [3] "\tGene 11340 211952_at RANBP5 k= 3 LL= -963.669 "
>>>> [4] "\tGene 12682 213301_x_at TRIM24 k= 1 LL= -948.527 "
>>>> [5] "\tGene 12682 213301_x_at TRIM24 k= 2 LL= -947.275 "
>>>> [6] "\tGene 12682 213301_x_at TRIM24 k= 3 LL= -947.379 "
>>>> [7] "\tGene 13764 214385_s_at AI521646 k= 1 LL= -827.86 "
>>>> [8] "\tGene 13764 214385_s_at AI521646 k= 2 LL= -777.756 "
>>>> [9] "\tGene 13764 214385_s_at AI521646 k= 3 LL= -812.083 "
>>>> __END__
>>>>
>>>> I expect to get this kind of data structure:
>>>>
>>>>> wanted_output
>>>>
>>>> [['211952_at']]
>>>> $ll.list
>>>> [1] -970.692 -965.35 -963.669
>>>>
>>>> [['213301_x_at']]
>>>> $ll.list
>>>> [1] -948.527 -947.275 -947.379
>>>>
>>>> etc.
>>>>
>>>> How can I achieve that?
>>>>
>>>> I am stuck with the following construct
>>>>
>>>> __BEGIN__
>>>> comp.ll <- model_all[grep("Gene .* k=.*", model_all)]
>>>> print(comp.ll)
>>>>
>>>> patt <- "Gene \\d+ ([\\w-/]+) [\\w-]+ k= (\\d) LL= ([-]\\d+\\.\\d+)"
>>>> nresk <- unlist(strsplit(sub(patt, "\\1 \\2 \\3", comp.ll, perl = TRUE), " "))
>>>> __END__
>>>>
>>>>
>>>> - Gundala Viswanath
>>>> Jakarta - Indonesia
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> --
>>> Jim Holtman
>>> Cincinnati, OH
>>> +1 513 646 9390
>>>
>>> What is the problem that you are trying to solve?
>>>
>>
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O