[R] Any way to add to data frame saved as .rData file?

Ken Termiso jerk_alert at hotmail.com
Mon Oct 24 18:29:08 CEST 2005


thanks everyone for your help. for simplicity, i elected to stay with a text 
file and transpose it, so that each row of the file holds what is really one 
column of data. in this transposed file, the header line holds the row labels, 
and its first cell has the name of the row labels ("RowID" in this case).

here's the code for what i ended up doing, in case anyone wants it (or wants 
to improve it):


outfile <- "mydata.txt"

zz <- file(outfile, "w")

rowlabels <- c(1:10000)

# make the first row of the file hold the row labels
cat(c("RowID", rowlabels, "\n"), file = zz, sep = "\t")
# (call close(zz) once you are done writing, so the data is flushed to disk)
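
The post does not show how the data lines get written after the label row. A minimal sketch of one way to do it, assuming tab-separated fields with the variable name in the first cell; `demo_file`, `append_row` and the sample values are hypothetical and not from the original code. It passes a file name to `cat()` with `append = TRUE`, so there is no open connection left to close.

```r
# Sketch only: 'demo_file', 'append_row' and the values are hypothetical.
demo_file <- tempfile(fileext = ".txt")

# Write the label row once, ending the line explicitly.
cat(paste(c("RowID", 1:5), collapse = "\t"), "\n", file = demo_file, sep = "")

# Append one variable as one tab-separated line: name first, then values.
append_row <- function(path, name, values) {
  line <- paste(c(name, values), collapse = "\t")
  cat(line, "\n", file = path, sep = "", append = TRUE)
}

append_row(demo_file, "Year05_population", c(10, 20, 30, 40, 50))
append_row(demo_file, "Year06_population", c(11, 21, 31, 41, 51))

# Inspect the resulting transposed file.
readLines(demo_file)
```

Building each line with `paste(..., collapse = "\t")` also avoids the trailing tab that `cat(..., "\n", sep = "\t")` leaves before the newline.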

# 's' is a unique string that is contained in the col or cols that you want
grep_text <- function(s)
{
	temp_header <- scan(file = outfile, what = list("RowID"), flush = TRUE)
	temp_header <- unlist(temp_header)
	# g gives the row number(s) in outfile with the data you want
	g <- grep(toString(s), temp_header)

	if(length(g)==0)
	{
		stop("no row label in ", outfile, " contains ", toString(s))
	}

	if(length(g)==1)
	{
		# temp_file = a character vector holding one line of outfile
		temp_file <- scan(file = outfile, what = character(), skip = g-1, nlines = 1)
		temp_file <- temp_file[2:length(temp_file)]  # drop title
		temp_file <- as.numeric(temp_file)  # now this is num vector
		tf_df <- as.data.frame(temp_file)
	}

	if(length(g)>1)
	{
		for(i in 1:length(g))
		{
			temp_file <- scan(file = outfile, what = character(), skip = g[i]-1, nlines = 1)
			temp_file <- temp_file[2:length(temp_file)]  # drop title
			temp_file <- as.numeric(temp_file)  # now this is num vector

			if(i==1)
			{
				tf_df <- as.data.frame(temp_file)
			}

			if(i!=1)
			{
				tf_df[i] <- temp_file
			}
		}
	}

	return(tf_df)
}
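
A possible variant of the function above, offered as a sketch rather than as the original method: it uses the same two-step idea (scan the first field of every line, then read only the matching lines) but takes the file path as an argument and names the returned columns after the labels found in the file. `read_rows` and `demo_file` are hypothetical names, and the sketch assumes tab-separated lines with no trailing tab.

```r
# Sketch only: 'read_rows', 'demo_file' and the sample data are hypothetical.
demo_file <- tempfile(fileext = ".txt")
writeLines(c("RowID\t1\t2\t3",
             "Year05_population\t10\t20\t30",
             "Year06_population\t11\t21\t31",
             "Year05_income\t5\t6\t7"), demo_file)

read_rows <- function(path, pattern) {
  # First field of every line; flush = TRUE skips the rest of each line.
  labels <- scan(path, what = list(""), sep = "\t", flush = TRUE,
                 quiet = TRUE)[[1]]

  # Literal (non-regex) match; line 1 holds the row labels, not data.
  hits <- setdiff(grep(pattern, labels, fixed = TRUE), 1L)
  if (length(hits) == 0) stop("no variable name contains '", pattern, "'")

  # Read each matching line on its own and drop the leading name.
  cols <- lapply(hits, function(i) {
    fields <- scan(path, what = character(), sep = "\t",
                   skip = i - 1, nlines = 1, quiet = TRUE)
    as.numeric(fields[-1])
  })
  names(cols) <- labels[hits]
  as.data.frame(cols)
}

pop <- read_rows(demo_file, "population")
```

Here `pop` has one column per matching line, so the names survive the round trip instead of coming back as `temp_file`, `V2`, and so on.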


you would use grep_text(s) to return a data frame of the columns whose titles 
contain the string s. for example, if i had a column named "Year05_population" 
in the "mydata.txt" file, then to get a data frame named 'df' holding only the 
data from that one column i would simply type:

outfile <- "mydata.txt"
df <- grep_text("Year05_population")



>From: "Greg Snow" <greg.snow at ihc.com>
>To: jerk_alert at hotmail.com,murdoch at stats.uwo.ca
>CC: gunter.berton at gene.com,r-help at stat.math.ethz.ch
>Subject: Re: [R] Any way to add to data frame saved as .rData file?
>Date: Thu, 13 Oct 2005 12:53:10 -0600
>
>Have you looked at the g.data package?  It might be useful
>(but may still require some redesign of your dataset).
>
>Greg Snow, Ph.D.
>Statistical Data Center, LDS Hospital
>Intermountain Health Care
>greg.snow at ihc.com
>(801) 408-8111
>
> >>> "Ken Termiso" <jerk_alert at hotmail.com> 10/13/05 08:14AM >>>
>
> >
> >I'd put the extra columns in their own data frame, and save that to disk
> >(use dates/times/process ids or some other unique identifier in the
> >filenames to distinguish them).  When you need access to a mixture of
> >columns, load (or source, depending how you did the save) the columns you
> >need, and cbind them together into one big data frame.
> >
> >If you are concerned about memory requirements when producing the pieces,
> >watch out that you don't write out so much data that you'll never have
> >enough memory to load all you need at once.
> >
> >Duncan Murdoch
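
The suggestion quoted above (save each batch of columns as its own data frame, then load and cbind the ones needed) might look roughly like the following. This is a sketch under assumptions: the directory, the file names, the column names and the `load_piece` helper are all hypothetical.

```r
# Sketch only: directory, file names and column names are hypothetical.
piece_dir <- tempfile("pieces")
dir.create(piece_dir)

# Save each batch of new columns as its own small data frame.
batch1 <- data.frame(Year05_population = c(10, 20, 30))
batch2 <- data.frame(Year06_population = c(11, 21, 31))
save(batch1, file = file.path(piece_dir, "batch1.RData"))
save(batch2, file = file.path(piece_dir, "batch2.RData"))
rm(batch1, batch2)

# Load one piece into a private environment so it cannot overwrite
# anything in the workspace, then hand back the data frame.
load_piece <- function(name) {
  env <- new.env()
  load(file.path(piece_dir, paste0(name, ".RData")), envir = env)
  get(name, envir = env)
}

# Later: load only the pieces that are needed and bind them column-wise.
combined <- cbind(load_piece("batch1"), load_piece("batch2"))
```

Only the pieces passed to `load_piece` are read from disk, which is the point of splitting the columns up in the first place.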
>
>
>hmm...maybe i should just be dumping to a text file instead of a data
>frame..is there any way (without using a real SQL database) in R to create a
>file that i can selectively load certain columns from?
>
>if not, maybe i should break the data frame up into pieces (as you
>suggested) and create a separate file that keeps track of which columns are
>stored in which files (like a hashtable) and just load the small file of
>keys each time i need to load something..
>
>whaddya think??
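
The key-file idea described above could be sketched as follows. Everything here is an assumption made for illustration: the names `store_dir`, `index_file`, `save_piece` and `load_columns` are hypothetical, the index is a plain tab-separated text file, and the pieces are stored with `saveRDS()` rather than `save()` so that each file holds exactly one object.

```r
# Sketch only: all file, function and column names here are hypothetical.
store_dir <- tempfile("store")
dir.create(store_dir)
index_file <- file.path(store_dir, "index.txt")

# Save a data frame as one piece and record its columns in the index.
save_piece <- function(df, piece) {
  saveRDS(df, file.path(store_dir, paste0(piece, ".rds")))
  is_new <- !file.exists(index_file)
  entry <- data.frame(column = names(df), piece = piece)
  write.table(entry, index_file, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = is_new, append = !is_new)
}

# Look up which pieces hold the requested columns and load only those.
load_columns <- function(columns) {
  index <- read.delim(index_file, stringsAsFactors = FALSE)
  unknown <- setdiff(columns, index$column)
  if (length(unknown) > 0) {
    stop("unknown column(s): ", paste(unknown, collapse = ", "))
  }
  pieces <- unique(index$piece[index$column %in% columns])
  loaded <- lapply(pieces, function(p) {
    readRDS(file.path(store_dir, paste0(p, ".rds")))
  })
  do.call(cbind, loaded)[, columns, drop = FALSE]
}

save_piece(data.frame(a = 1:3, b = 4:6), "piece1")
save_piece(data.frame(c = 7:9), "piece2")
result <- load_columns(c("a", "c"))
```

Only the small index is read on every request; the piece files are touched only when one of their columns is asked for.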
>



