[R] Reshaping data with xtabs giving me 'extra' data

Tony B tony.breyal at googlemail.com
Wed Jan 20 13:20:53 CET 2010


Dear all,

Let's say I have several data frames as follows:

> set.seed(42)
> dates <- as.Date(c("2010-01-19", "2010-01-20"))
> times <- c("09:30:00", "11:30:00", "13:30:00", "15:30:00")
> shows <- c("Red Dwarf", "Being Human", "Doctor Who")
>
> df1 <- data.frame(Date = dates[1], Time = times[1], Show = shows, Score = 1:3)
> df2 <- data.frame(Date = dates[1], Time = times[2], Show = shows, Score = 1:3)
> df3 <- data.frame(Date = dates[1], Time = times[4], Show = shows, Score = 1:3)
> df4 <- data.frame(Date = dates[2], Time = times[1], Show = shows, Score = 1:3)
> df5 <- data.frame(Date = dates[2], Time = times[2], Show = shows, Score = 1:3)
> df6 <- data.frame(Date = dates[2], Time = times[3], Show = shows, Score = 1:3)
> df7 <- data.frame(Date = dates[2], Time = times[4], Show = shows, Score = 1:3)
> df7
        Date     Time        Show Score
1 2010-01-20 15:30:00   Red Dwarf     1
2 2010-01-20 15:30:00 Being Human     2
3 2010-01-20 15:30:00  Doctor Who     3

I would like to reshape the data into a Date x Show x Time table:

> df.list <- list(df1, df2, df3, df4, df5, df6, df7)
> my.df <- Reduce(function(x, y) merge(x, y, all = TRUE), df.list, accumulate = FALSE)
> my.xtab <- xtabs(as.numeric(Score) ~ Date + Show + Time, data = my.df)

This is where my problem occurs. For Time = 13:30:00, the table now
contains a row for "2010-01-19", a Date/Time combination that was not
in any of my original data frames above:

> # I do not want the zeros below
> my.xtab[,,"13:30:00"]
            Show
Date         Being Human Doctor Who Red Dwarf
  2010-01-19           0          0         0
  2010-01-20           2          3         1

Perhaps I am missing something in the way I call the xtabs function?
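
To illustrate the kind of result I am after, here is a rough sketch
using tapply() instead of xtabs() on a cut-down copy of the data (the
name my.tab is only for illustration). As far as I can tell, xtabs()
cross-classifies every combination of factor levels and sums the empty
cells to 0, whereas tapply() leaves combinations that never occur as
NA. I am not sure this is the recommended approach:

```r
# Cut-down copy of the data: 2010-01-19 has no 13:30:00 slot.
shows <- c("Red Dwarf", "Being Human", "Doctor Who")
my.df <- rbind(
  data.frame(Date = as.Date("2010-01-19"), Time = "09:30:00",
             Show = shows, Score = 1:3),
  data.frame(Date = as.Date("2010-01-20"), Time = "09:30:00",
             Show = shows, Score = 1:3),
  data.frame(Date = as.Date("2010-01-20"), Time = "13:30:00",
             Show = shows, Score = 1:3)
)

# Sum Score within each Date x Show x Time cell; cells with no
# observations are left as NA rather than being reported as 0.
my.tab <- tapply(my.df$Score, my.df[c("Date", "Show", "Time")], sum)

# The 2010-01-19 row is now NA in this slice instead of 0.
my.tab[, , "13:30:00"]
```

With this, a genuine score of 0 could still be told apart from a
Date/Time slot that was never recorded.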

Thank you kindly for your time,
Tony Breyal

OS: Windows XP 64bit
> sessionInfo()
R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
    LC_CTYPE=English_United States.1252
    LC_MONETARY=English_United States.1252
    LC_NUMERIC=C
    LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.10.0


