[R] reading data
arun
smartpink111 at yahoo.com
Sun Feb 17 15:25:26 CET 2013
HI Vera,
No problem. I am cc:ing to r-help.
A.K.
________________________________
From: Vera Costa <veracosta.rt at gmail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Sunday, February 17, 2013 5:44 AM
Subject: Re: reading data
Hi. Thank you. It works now:-)
And yes, I use windows.
Thank you very much.
No dia 17 de Fev de 2013 00:44, "arun" <smartpink111 at yahoo.com> escreveu:
Hi Vera,
>
>Have you tried the suggestion?
>
>Are you using Windows?
>Thanks,
>Arun
>
>
>
>
>
>
>________________________________
>From: Vera Costa <veracosta.rt at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Saturday, February 16, 2013 7:10 PM
>Subject: Re: reading data
>
>
>Thank you.
>In mine, I have an error " 'what' must be a character string or a function".
>I need to do equivalent in my system.
>Thank you and sorry one more time.
>No dia 16 de Fev de 2013 23:53, "arun" <smartpink111 at yahoo.com> escreveu:
>
>Hi,
>>You didn't mention what the error message or whether you are reading file names which are not "mmmmm11kk.txt".
>>
>>It is workiing on my system as I run it again.
>>?c() combine values into a vector or list.
>>
>> sessionInfo()
>>R version 2.15.1 (2012-06-22)
>>Platform: x86_64-pc-linux-gnu (64-bit)
>>
>>locale:
>> [1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C
>> [3] LC_TIME=en_CA.UTF-8 LC_COLLATE=en_CA.UTF-8
>> [5] LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8
>> [7] LC_PAPER=C LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
>>
>>attached base packages:
>>[1] stats graphics grDevices utils datasets methods base
>>
>>other attached packages:
>>[1] stringr_0.6.2 reshape2_1.2.2
>>
>>loaded via a namespace (and not attached):
>>[1] plyr_1.8
>>
>>
>>#code
>>
>>
>>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})) #it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>> names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>res2<-split(res,names(res))
>>res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>#result
>>
>>res3
>>#$group_a
>>#$group_a$a1
>> Id M mm x b u k j y p v
>>1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>
>>$group_a$a2
>> Id M mm x b u k j y p v
>>1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>
>>$group_a$a3
>> Id M mm x b u k j y p v
>>1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>
>>
>>$group_b
>>$group_b$b1
>> Id M mm x b u k j y p v
>>1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>
>>$group_b$b2
>> Id M mm x b u k j y p v
>>1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>
>>
>>$group_c
>>$group_c$c1
>> Id M mm x b u k j y p v
>>1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>
>>
>>A.K.
>>
>>
>>
>>________________________________
>>From: Vera Costa <veracosta.rt at gmail.com>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Saturday, February 16, 2013 6:32 PM
>>Subject: Re: reading data
>>
>>
>>Sorry again... In:
>>res<-do.call(c,lapply(list.files(recursive=T)[grep("...
>>What is this c? In do.call(c, When I put this row im R, I have an error.
>>Thank you
>>No dia 15 de Fev de 2013 18:11, "arun" <smartpink111 at yahoo.com> escreveu:
>>
>>Hi,
>>>No problem.
>>>
>>>BTW, these questions are not stupid..
>>>Arun
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>________________________________
>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>To: arun <smartpink111 at yahoo.com>
>>>Sent: Friday, February 15, 2013 1:08 PM
>>>Subject: Re: reading data
>>>
>>>
>>>Thank you very much.
>>>
>>>I will try to apply and after I tell you if it is ok :-)
>>>
>>>Thank you and sorry about this questions (sometimes stupid questions).
>>>
>>>
>>>
>>>
>>>2013/2/15 arun <smartpink111 at yahoo.com>
>>>
>>>HI,
>>>>No problem.
>>>>?c() for concatenate to vector or list().
>>>>If I use do.call(cbind,..) or do.call(rbind,...)
>>>>
>>>>do.call(cbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>># [,1] [,2] [,3] [,4] [,5] [,6]
>>>>#a1 List,11 List,11 List,11 List,11 List,11 List,11
>>>>
>>>>
>>>> do.call(rbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>># a1
>>>>#[1,] List,11
>>>>#[2,] List,11
>>>>#[3,] List,11
>>>>#[4,] List,11
>>>>#[5,] List,11
>>>>#[6,] List,11
>>>>ie.
>>>>list within in a list
>>>>
>>>> restrial<-lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})
>>>> str(restrial)
>>>>#List of 6
>>>># $ :List of 1
>>>> #..$ a1:'data.frame': 6 obs. of 11 variables:
>>>> .#. ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>> #.. ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>> #. ..$ mm: int [1:6] 2 2 1 2 3 2
>>>> #. ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>> -----------------------------------------------------------------
>>>>str(res)
>>>>#List of 6
>>>># $ a1:'data.frame': 6 obs. of 11 variables:
>>>> # ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>> #..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>> # ..$ mm: int [1:6] 2 2 1 2 3 2
>>>> # ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>-----------------------------------------------------------------
>>>>
>>>>You mentioned about naming this to "group_a","group_b". etc..
>>>>
>>>> names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>res2<-split(res,names(res))
>>>>
>>>>res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>> res3$group_a
>>>>$a1
>>>>
>>>># Id M mm x b u k j y p v
>>>>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>>>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>>>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>>>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>>>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>>>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>>>
>>>>#$a2
>>>>
>>>># Id M mm x b u k j y p v
>>>>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>>>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>>>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>>>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>>>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>>>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>>>
>>>>#$a3
>>>>
>>>> # Id M mm x b u k j y p v
>>>>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>>>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>>>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>>>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>>>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>>>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>>>A.K.
>>>>
>>>>________________________________
>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>To: arun <smartpink111 at yahoo.com>
>>>>Sent: Friday, February 15, 2013 12:39 PM
>>>>Subject: Re: reading data
>>>>
>>>>
>>>>
>>>>Thank you very much and sorry my questions.
>>>>
>>>>But this code isn't grouping for letters sure? I mean, a1,a2,a3 is the same group, (the first letter give me the name of the group)
>>>>
>>>>Another question, in do.call, you did do.call (c,.....) .What is c?
>>>>
>>>>Sorry
>>>>
>>>>
>>>>
>>>>2013/2/15 arun <smartpink111 at yahoo.com>
>>>>
>>>>HI,
>>>>>
>>>>>Just to add:
>>>>>
>>>>>
>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})) #it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>>>
>>>>> names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>res[grep("group_b",names(res))]
>>>>>
>>>>>I am not sure how you want the grouped data to look like. If you want something like this:
>>>>>res1<-do.call(rbind,res)
>>>>>res2<-lapply(split(res1,gsub("[.0-9]","",row.names(res1))),function(x) {row.names(x)<-1:nrow(x);x})
>>>>>res2
>>>>>#$group_a
>>>>>
>>>>> # Id M mm x b u k j y p v
>>>>>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>>>>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>>>>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>>>>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>>>>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>>>>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>>>>#7 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>>>>#8 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>>>>#9 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>>>>#10 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>>>>#11 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>>>>#12 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>>>>#13 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>>>>#14 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>>>>#15 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>>>>#16 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>>>>#17 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>>>>#18 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>>>>
>>>>>
>>>>>#$group_b
>>>>> # Id M mm x b u k j y p v
>>>>>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>>>>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>>>>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>>>>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>>>>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>>>>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>>>>#7 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>>>>#8 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>>>>#9 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>>>>#10 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>>>>#11 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>>>>#12 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>>>>
>>>>>#$group_c
>>>>>
>>>>> # Id M mm x b u k j y p v
>>>>>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>>>>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>>>>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>>>>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>>>>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>>>>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>>>>
>>>>>
>>>>>#or if you want it like this:
>>>>>res2<-split(res,names(res))
>>>>>
>>>>>res2[["group_b"]]
>>>>>
>>>>>#$group_b
>>>>># Id M mm x b u k j y p v
>>>>>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>>>>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>>>>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>>>>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>>>>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>>>>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>>>>
>>>>>#$group_b
>>>>> # Id M mm x b u k j y p v
>>>>>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>>>>>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>>>>>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>>>>>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>>>>>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>>>>>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>>>>>
>>>>>Hope this helps.
>>>>>
>>>>>A.K.
>>>>>
>>>>>
>>>>>
>>>>>----- Original Message -----
>>>>>From: "veracosta.rt at gmail.com" <veracosta.rt at gmail.com>
>>>>>To: smartpink111 at yahoo.com
>>>>>Cc:
>>>>>Sent: Friday, February 15, 2013 9:15 AM
>>>>>Subject: reading data
>>>>>
>>>>>Hi,
>>>>>I post yesterday and you helped me. I have little problem.
>>>>>
>>>>>At first, I never worked with regular expressions...
>>>>>
>>>>>The code that you gave me it's ok, but my files are inside the folders a1,a2,a3. I try to explain better.
>>>>>
>>>>>I have one folder named "data". Inside this folder I have some other folders named "a1","a2","b1",b2",...and inside of each one of that I have some files. I want only the file "mmmmmm.txt" (in all folders I have One file with this name).
>>>>>The name of the folder give me the name of the group,but I need to read the file inside. And after, have "group_a", group_"b"...because I need to work with this data grouped (and know the name of the group).
>>>>>
>>>>>Thank you.
>>>>>
>>>>
>>>
>>
>
More information about the R-help
mailing list