[R] split the data.frame
Robert Citek
rwcitek at alum.calberkeley.org
Tue May 16 07:18:52 CEST 2006
On May 15, 2006, at 10:45 PM, YIHSU CHEN wrote:
> I wonder whether anyone has an elegant way of doing what I need to do.
>
> I have a data frame called with four columns: V1, V2, A1 and A2:
>
> V1 V2 A1 A2
> A B 1.2 2.0
> A D 1.2 4.0
> A C 2.4 2.2
>
> What I need to do is convert it into the following data frame
> with a new column x, where x is just A1 and A2 stacked, each value
> paired with its respective V1 and V2 in the first two columns:
>
> V1 V2 x
> A B 1.2
> A B 2.0
> A D 1.2
> A D 4.0
> A C 2.4
> A C 2.2
>
> I wonder whether there is an efficient way to do it, since I have
> a huge dataset.
How big is huge? Also, what operating system are you using?
If your data set is really big, i.e. bigger than R can handle in
memory, then you might want to write the data frame to disk,
manipulate it there, and then read it back in.
For example:
# Example data from the question
myDF <- data.frame(V1=rep("A",3), V2=c("B","D","C"),
                   A1=c(1.2,1.2,2.4), A2=c(2,4,2.2))
# Write V1, V2, A1, then append V1, V2, A2 to the same file
write.table(subset(myDF, select=c(V1,V2,A1)), file="foo.txt",
            row.names=FALSE, col.names=FALSE)
write.table(subset(myDF, select=c(V1,V2,A2)), file="foo.txt",
            row.names=FALSE, col.names=FALSE, append=TRUE)
# Read the stacked file back as three columns and show the first rows
newDF <- read.table("foo.txt", col.names=c("V1","V2","x"))
head(newDF, 10)
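If the data does fit in memory after all, the same reshaping can be done without touching the disk. Here is a minimal sketch in base R, which also yields the interleaved row order shown in the question. The helper name n and the indexing trick are my own illustration and have not been timed on a large data set:

```r
# Example data from the question
myDF <- data.frame(V1 = rep("A", 3), V2 = c("B", "D", "C"),
                   A1 = c(1.2, 1.2, 2.4), A2 = c(2, 4, 2.2))

n <- nrow(myDF)

# Repeat every row twice, keeping only the key columns
newDF <- myDF[rep(seq_len(n), each = 2), c("V1", "V2")]

# t() turns the A1/A2 columns into a 2 x n matrix; reading it out
# column by column gives A1[1], A2[1], A1[2], A2[2], ...
newDF$x <- as.vector(t(as.matrix(myDF[, c("A1", "A2")])))

# Drop the duplicated row names left over from the indexing
rownames(newDF) <- NULL
newDF
```

This makes at least one full copy of the data, so it is only an option when the data frame is comfortably smaller than available memory.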
There's also the operating system solution if you are using Linux or
Cygwin/Windows:
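myDF <- data.frame(V1=rep("A",3), V2=c("B","D","C"),
                   A1=c(1.2,1.2,2.4), A2=c(2,4,2.2))
# Write a tab-delimited file without quotes or headers so cut can split it
write.table(myDF, file="foo.txt", sep="\t", na="",
            quote=FALSE, row.names=FALSE, col.names=FALSE)
# Stack fields 1,2,3 on top of fields 1,2,4
system("{ cut -f1,2,3 foo.txt ; cut -f1,2,4 foo.txt ; } > bar.txt")
# Read back with the same tab separator so empty fields stay in place
newDF <- read.table("bar.txt", sep="\t", col.names=c("V1","V2","x"))
head(newDF, 10)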
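Note that the file-based route returns all of the A1 rows followed by all of the A2 rows, not the alternating order shown in the question. If that order matters, one possible sketch for interleaving the two halves after reading the file back is below. The small newDF here is only a stand-in for the result of read.table, and the code assumes the file holds exactly two blocks of equal size:

```r
# Stand-in for the data frame read back from the stacked file:
# first the three A1 rows, then the three A2 rows
newDF <- data.frame(V1 = rep("A", 6),
                    V2 = rep(c("B", "D", "C"), 2),
                    x  = c(1.2, 1.2, 2.4, 2, 4, 2.2))

n <- nrow(newDF) / 2

# rbind() pairs row i with row n + i; as.vector() reads the pairs
# out in order, giving 1, n+1, 2, n+2, ...
idx <- as.vector(rbind(seq_len(n), n + seq_len(n)))

interleaved <- newDF[idx, ]
rownames(interleaved) <- NULL
interleaved
```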
Please post back letting us know what worked for you.
Regards,
- Robert
http://www.cwelug.org/downloads
Help others get OpenSource software. Distribute FLOSS
for Windows, Linux, *BSD, and MacOS X with BitTorrent