[R] reorganizing a data frame
Jeff Miller
jdm at xnet.com
Tue Jul 11 00:58:32 CEST 2000
David,
Thanks for this info. David James was kind enough to send me the same solution in a private mail.
I hadn't realized that you can index into an array with a matrix, but I see
now that this is a very useful tool in R and S.
Duncan Murdoch sent me a more concise solution:
tapply(oldstock$close,list(oldstock$date,oldstock$ticker),mean)
which works well if the data frame "oldstock" has fewer than, say, 10,000 rows,
but which starts to bog down considerably for data frames with more rows.
The solution that you and David James sent is still quite fast for data frames
with 300,000 rows.
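For anyone comparing the two approaches, here is a small sketch that runs both on the same input and checks that they agree. The toy values in "oldstock" are invented for illustration, and the comparison assumes each date/ticker pair occurs at most once (so that mean() has nothing to average).

```r
# Toy data: values are made up purely for illustration.
oldstock <- data.frame(
  date   = c(20000703, 20000703, 20000705, 20000705),
  ticker = c("AAA", "BBB", "AAA", "CCC"),
  close  = c(10.5, 20.25, 11, 30.75),
  stringsAsFactors = FALSE
)

# tapply approach: one cell per (date, ticker); empty combinations become NA.
by_tapply <- tapply(oldstock$close, list(oldstock$date, oldstock$ticker), mean)

# Matrix-indexing approach: build an empty matrix, then fill it in one assignment.
dates   <- sort(unique(oldstock$date))
tickers <- sort(unique(oldstock$ticker))
by_index <- matrix(NA_real_, length(dates), length(tickers),
                   dimnames = list(as.character(dates), tickers))
idx <- cbind(match(oldstock$date, dates), match(oldstock$ticker, tickers))
by_index[idx] <- oldstock$close

# Compare the values only, ignoring dimnames.
same <- isTRUE(all.equal(unname(by_tapply), unname(by_index)))
print(same)
```

Wrapping each approach in system.time() on a larger data frame is one way to see the speed difference on your own machine.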
Thanks again to everyone for their insights.
Jeff Miller
----- Original Message -----
From: Brahm, David
To: 'Jeff Miller' ; r-help at stat.math.ethz.ch
Sent: Monday, July 10, 2000 11:44 AM
Subject: RE: [R] reorganizing a data frame
Jeff Miller wants to turn a data frame (stockdata) containing date, ticker, and close into a matrix (closedata) with one row per date and one column per ticker. Here's how I'd do it in S-Plus (sorry, I haven't tried this in R):
dates <- sort(unique(stockdata$date))
tickers <- sort(unique(stockdata$ticker))
closedata <- matrix(NA, length(dates), length(tickers), dimnames=list(as.character(dates), tickers))
idx <- cbind(match(stockdata$date, dates), match(stockdata$ticker, tickers))
closedata[idx] <- stockdata$close
The key here is knowing that you can index into a matrix (closedata) with an Nx2 matrix (idx), each row of which represents one element's coordinates. This method is especially efficient if your matrix "closedata" is sparse.
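To make the indexing step concrete, here is a minimal sketch of the same code run on a four-row toy data frame. The dates, tickers, and prices are invented for illustration, and the code was written for R rather than S-Plus.

```r
# Toy data: values are made up purely for illustration.
stockdata <- data.frame(
  date   = c(20000703, 20000703, 20000705, 20000705),
  ticker = c("AAA", "BBB", "AAA", "CCC"),
  close  = c(10.5, 20.25, 11, 30.75),
  stringsAsFactors = FALSE
)

dates   <- sort(unique(stockdata$date))
tickers <- sort(unique(stockdata$ticker))
closedata <- matrix(NA_real_, length(dates), length(tickers),
                    dimnames = list(as.character(dates), tickers))

# Each row of idx is one (row, column) coordinate in closedata,
# e.g. the fourth observation maps to row 2 (20000705), column 3 (CCC).
idx <- cbind(match(stockdata$date, dates), match(stockdata$ticker, tickers))
closedata[idx] <- stockdata$close

print(idx)
print(closedata)
```

Date/ticker combinations that never appear in stockdata (here BBB on 20000705 and CCC on 20000703) are simply left as NA.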
P.S. The "as.character" is there because S-Plus 5.1 allows for non-character dimnames, which seems foolish to me, and I use numbers for dates.
-- David Brahm
Fidelity Investments
(617)563-7438