[R] Any way to add to data frame saved as .rData file?
Ken Termiso
jerk_alert at hotmail.com
Mon Oct 24 18:29:08 CEST 2005
thx everyone for your help...for simplicity, i elected to stay with a text
file and transpose it so that each new row of data is really a column...in
this transposed file, the header is really the row labels. the first cell
has the name of the row labels ("RowID" in this case)...
here's code for what i ended up doing, in case anyone wants it (or wants to
improve it) :
outfile <- "mydata.txt"
zz <- file(outfile, "w")
rowlabels <- c(1:10000)
# make the first row of the file have the row labels
cat(c("RowID", rowlabels), file = zz, sep = "\t")
cat("\n", file = zz)
close(zz) # flush to disk; reopen with file(outfile, "a") to append more rows
# 's' is a unique string that is contained in the title(s) of the col or cols
# that you want; 'outfile' must already be defined in the calling environment
grep_text <- function(s)
{
  # first field of every line in outfile = the row's title
  temp_header <- scan(file = outfile, what = list("RowID"), flush = TRUE, quiet = TRUE)
  temp_header <- unlist(temp_header)
  # gives the row number(s) in outfile with the data you want
  g <- grep(toString(s), temp_header)
  if(length(g) == 0)
    stop("no row title in ", outfile, " matches '", s, "'")
  for(i in seq_along(g))
  {
    # temp_file = a character vector holding one line of outfile
    temp_file <- scan(file = outfile, what = character(), skip = g[i]-1,
                      nlines = 1, quiet = TRUE)
    temp_file <- temp_file[-1] # drop title
    temp_file <- as.numeric(temp_file) # now this is num vector
    if(i == 1)
      tf_df <- as.data.frame(temp_file)
    else
      tf_df[i] <- temp_file
  }
  names(tf_df) <- temp_header[g] # label each column with its title from the file
  return(tf_df)
}
you would use grep_text(s) to return a data frame of every column whose
title contains the string s...if i had a column named "Year05_population" in
the "mydata.txt" file, to return a data frame named 'df' with only the one
column titled "Year05_population" i would simply type :
outfile <- "mydata.txt"
df <- grep_text("Year05_population")
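to show the whole idea end to end, here's a minimal sketch (not the exact code i ran) of adding a new column to the transposed file by appending one line, then reading a single column back. append_column and read_column are hypothetical helper names, and the temp file, column names and values are made up for illustration:

```r
# hypothetical helpers, not part of the code above
append_column <- function(path, name, values)
{
  con <- file(path, "a")   # "a" = append; lines already in the file are untouched
  on.exit(close(con))      # always close (and flush) the connection
  cat(c(name, values), file = con, sep = "\t")
  cat("\n", file = con)
}

read_column <- function(path, name)
{
  # first field of every line = that line's title
  labels <- unlist(scan(path, what = list(""), flush = TRUE, quiet = TRUE))
  g <- which(labels == name)   # exact match instead of grep
  if(length(g) != 1)
    stop("expected exactly one line named ", name)
  fields <- scan(path, what = character(), skip = g - 1, nlines = 1, quiet = TRUE)
  as.numeric(fields[-1])       # drop the title, keep the values
}

demo_file <- tempfile(fileext = ".txt")
append_column(demo_file, "RowID", 1:5)   # first line = the row labels
append_column(demo_file, "Year05_population", c(10, 20, 30, 40, 50))
append_column(demo_file, "Year06_population", c(11, 21, 31, 41, 51))
pop06 <- read_column(demo_file, "Year06_population")
```

appending a line never rewrites what's already in the file, which is the reason for transposing in the first place. the cost is that every read scans the file twice (once for the titles, once for the line you want).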
>From: "Greg Snow" <greg.snow at ihc.com>
>To: jerk_alert at hotmail.com,murdoch at stats.uwo.ca
>CC: gunter.berton at gene.com,r-help at stat.math.ethz.ch
>Subject: Re: [R] Any way to add to data frame saved as .rData file?
>Date: Thu, 13 Oct 2005 12:53:10 -0600
>
>Have you looked at the g.data package? It might be useful
>(but may still require some redesign of your dataset).
>
>Greg Snow, Ph.D.
>Statistical Data Center, LDS Hospital
>Intermountain Health Care
>greg.snow at ihc.com
>(801) 408-8111
>
> >>> "Ken Termiso" <jerk_alert at hotmail.com> 10/13/05 08:14AM >>>
>
> >
> >I'd put the extra columns in their own data frame, and save that to disk
> >(use dates/times/process ids or some other unique identifier in the
> >filenames to distinguish them). When you need access to a mixture of
> >columns, load (or source, depending how you did the save) the columns you
> >need, and cbind them together into one big data frame.
> >
> >If you are concerned about memory requirements when producing the pieces,
> >watch out that you don't write out so much data that you'll never have
> >enough memory to load all you need at once.
> >
> >Duncan Murdoch
>
>
>hmm...maybe i should just be dumping to a text file instead of a data
>frame..is there any way (without using a real SQL database) in R to create a
>file that i can selectively load certain columns from?
>
>if not, maybe i should break the data frame up into pieces (as you
>suggested) and create a separate file that keeps track of which columns are
>stored in which files (like a hashtable) and just load the small file of
>keys each time i need to load something..
>
>whaddya think??
>