[R] List of tables rather than an extra dimension in the table or (l)apply(xtabs)

Mulholland, Tom Tom.Mulholland at dpi.wa.gov.au
Tue Mar 22 07:18:25 CET 2005


I wrote a function that creates the crosstab and removes the extraneous (all-zero) rows, and then applied it with lapply:


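aestabs <- function(x){
   # Cross-tabulate population (psn) by subregion (lga) and year
   temp <- xtabs(psn ~ lga + year,x)
   # Drop all-zero rows; drop = FALSE keeps a one-row result two-dimensional
   temp <- temp[rowSums(temp) != 0, , drop = FALSE]
   return(temp)
   }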
   
eas2 <- lapply(split(ipi$eas,ipi$eas$RegionNum),aestabs)

It's not really reusable. I guess I could pass a formula and work out a better method of subsetting dimensions (where certain factor levels are not used). But maybe someone has an elegant method they could share.
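
A rough sketch of the formula idea, using the Titanic example from the original message below. The helper name split_xtabs and its "by" argument are made up for illustration, and it assumes a two-variable formula (a two-way table). It splits the data frame, runs xtabs on each piece, and drops rows and columns whose totals are zero:

```r
# Hypothetical helper (not from the thread): one two-way crosstab per
# level of the column named in `by`, with all-zero rows/columns removed.
split_xtabs <- function(formula, data, by) {
  # One data frame per group; drop = TRUE skips groups with no rows
  pieces <- split(data, data[[by]], drop = TRUE)
  lapply(pieces, function(d) {
    tab <- xtabs(formula, data = d)
    # This sketch only handles formulas with two classifying factors
    stopifnot(length(dim(tab)) == 2)
    # drop = FALSE keeps the result two-dimensional even when only
    # one row or one column survives
    tab[rowSums(tab) != 0, colSums(tab) != 0, drop = FALSE]
  })
}

# Titanic example: an Age by Sex crosstab for each Class
t1 <- as.data.frame(Titanic)
by_class <- split_xtabs(Freq ~ Age + Sex, t1, by = "Class")
by_class$Crew
```

For Crew this should leave only the Adult row, which is what the original question asked for. For the regional data the equivalent call would presumably be split_xtabs(psn ~ lga + year, ipi$eas, by = "RegionNum"), though that is untested here.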

Tom

> -----Original Message-----
> From: Mulholland, Tom 
> Sent: Tuesday, 22 March 2005 1:35 PM
> To: R-Help (E-mail)
> Subject: [R] List of tables rather than an extra dimension in 
> the table
> or (l)apply(xtabs)
> 
> 
> I'm not sure how best to explain what I am after, but here 
> goes. I have a data frame with 2 geographical factors: one is 
> the major region, the other is the component regions.
> 
> I am trying to process all the regions at the same time 
> without using "for". So I need (I think) a list of 
> matrices, each structured according to the number of 
> subregions within each region.
> 
> So is there a way of using lapply with xtabs or is there a 
> better way to achieve my desired output?
> 
> Using the Titanic data as an example
> 
> t1 <- as.data.frame(Titanic)
> t2 <- split(t1,t1$Class)
> 
> # I would then drop any unused levels in the factors for the 
> # geography, creating distinctly different data.frames (see end 
> # of message)
> 
> > xtabs(Freq ~ Age + Sex + Class,t1)
> , , Class = 1st
> 
>        Sex
> Age     Male Female
>   Child   5    1   
>   Adult 175  144   
> 
> , , Class = 2nd
> 
>        Sex
> Age     Male Female
>   Child  11   13   
>   Adult 168   93   
> 
> , , Class = 3rd
> 
>        Sex
> Age     Male Female
>   Child  48   31   
>   Adult 462  165   
> 
> , , Class = Crew
> 
>        Sex
> Age     Male Female
>   Child   0    0   
>   Adult 862   23   
> 
> Can I do something with t2 to produce a list which is in 
> effect an Age by Sex crosstab, with one item for each value of 
> Class? I would be wanting to drop.unused.levels, so that the 
> last part of the table is just 
> 
>        Sex
> Age     Male Female
>   Adult 862   23   
> 
> or in my case each item in the list has the same number of 
> rows as there are subregions for that region.
> 
> List of 9
>  $ 1:`data.frame':      4009 obs. of  7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ lga      : Factor w/ 23 levels "Carnamah (S)",..: 1 2 3 4 5 6 7 8 9 10 ...   # 23 subregions
>   ..$ psn      : num [1:4009] 71 336 26 84 30 133 904 385 99 110 ...
>   ..$ year     : num [1:4009] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ RegionNum: num [1:4009] 1 1 1 1 1 1 1 1 1 1 ...
>  $ 2:`data.frame':      720 obs. of  7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 2 2 2 2 3 3 ...
>   ..$ lga      : Factor w/ 4 levels "Broome (S)","De..",..: 1 2 3 4 1 2 3 4 1 2 ... # 4 subregions etc
>   ..$ psn      : num [1:720] 495 445 189 377 415 374 189 330 324 319 ...
>   ..$ year     : num [1:720] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 2 2 2 2 3 3 ...
>   ..$ RegionNum: num [1:720] 2 2 2 2 2 2 2 2 2 2 ...
> 
> So these two items would produce
> 
> > round(xtabs(psn ~ lga + agecomp,eas[[1]]),-2)
>                     agecomp
> lga                  0-4   5-9   10-14 15-19 20-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 60-64 65plus
>   Carnamah (S)         500   400   300   200   300   300   500   400   400   300   300   200   100   300
>   Carnarvon (S)       2800  3000  2600  2100  2400  2700  2800  2600  2400  2200  2000  1600  1300  2800
>   Chapman Valley (S)   300   400   300   200   200   300   300   300   300   400   400   300   200   300
>   Coorow (S)           700   700   600   200   300   600   700   600   500   500   400   400   300   500
>   Cue (S)              200   200   100   100   200   200   300   200   200   200   200   100   100   100
>   Exmouth (S)          900  1000   800   600   700  1100  1100  1100  1100   800   700   500   400   700
>   Geraldton (C)       7700  7700  8100  8200  7200  7400  7500  7200  6900  6100  5400  4600  4300 12400
>   Greenough (S)       4700  5400  5500  4400  3100  3700  4800  5100  5200  4200  3500  2600  1900  3200
>   Irwin (S)           1000  1100  1000   600   600   900  1000  1200  1000   900   800   900   800  1800
>   Meekatharra (S)      800   700   600   600   900  1000   900   700   600   500   400   300   200   400
>   Mingenew (S)         300   300   200   100   200   200   300   300   200   200   200   200   100   200
>   Morawa (S)           400   500   400   400   200   400   500   400   300   300   300   300   200   500
>   Mount Magnet (S)     500   400   300   200   400   500   400   400   300   300   200   200   100   200
>   Mullewa (S)          600   600   800   400   400   500   500   400   300   300   300   300   200   400
>   Murchison (S)        100   100   100   100     0   100   100     0     0     0   100     0     0     0
>   Northampton (S)     1300  1300  1200   700   700   900  1200  1300  1200  1200  1000  1000   900  2000
>   Perenjori (S)        300   300   300   100   200   200   300   300   300   200   200   200   100   300
>   Sandstone (S)          0     0     0     0   100   100   100   100   100   100   100   100     0   100
>   Shark Bay (S)        300   300   200   200   200   300   400   400   400   300   300   300   200   600
>   Three Springs (S)    300   300   300   100   200   300   400   300   300   200   300   200   200   400
>   Upper Gascoyne (S)   100   200   200   100   100   100   100   100   100   100   100   100   100   100
>   Wiluna (S)           200   200   200   300   600   700   600   400   300   300   300   200   100   100
>   Yalgoo (S)           100   100   100     0   200   200   200   100   200   200   100   100   100   100
> > round(xtabs(psn ~ lga + agecomp,eas[[2]]),-2)
>                             agecomp
> lga                          0-4  5-9  10-14 15-19 20-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 60-64 65plus
>   Broome (S)                 5600 5400 4500  3900  4900  5800  6100  5500  4500  3700  2800  2000  1500  2200
>   Derby-West Kimberley (S)   4000 3900 3400  3100  3800  4000  3800  3100  2500  1900  1500  1200   900  1800
>   Halls Creek (S)            2100 2100 1700  1600  1800  1600  1400  1100  1000   900   700   600   400   800
>   Wyndham-East Kimberley (S) 3500 3300 2800  2300  2900  3500  3500  3000  2600  2100  1800  1300   800  1200
> 
>  $ 3:`data.frame':      2130 obs. of  7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ lga      : Factor w/ 12 levels "Albany (C)","Br..",..: 1 2 3 4 5 6 7 8 9 10 ...
>   ..$ psn      : num [1:2130] 1107   21   63  167  115 ...
>   ..$ year     : num [1:2130] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ RegionNum: num [1:2130] 3 3 3 3 3 3 3 3 3 3 ...
>  $ 4:`data.frame':      5188 obs. of  7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ lga      : Factor w/ 29 levels "Beverley (S)",..: 1 2 3 4 5 6 7 8 9 10 ...
>   ..$ psn      : num [1:5188] 55 58 84 90 105 134 57 132 56 70 ...
>   ..$ year     : num [1:5188] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ RegionNum: num [1:5188] 4 4 4 4 4 4 4 4 4 4 ...
>  $ 5:`data.frame':      5400 obs. of  7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ lga      : Factor w/ 30 levels "Armadale (C)",..: 1 2 3 4 5 6 7 8 9 10 ...
>   ..$ psn      : num [1:5400] 2163  479 1824  865  749 ...
>   ..$ year     : num [1:5400] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ RegionNum: num [1:5400] 5 5 5 5 5 5 5 5 5 5 ...
>  $ 6:`data.frame':      720 obs. of  7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 2 2 2 2 3 3 ...
>   ..$ lga      : Factor w/ 4 levels "Ashburton (S)",..: 1 2 3 4 1 2 3 4 1 2 ...
>   ..$ psn      : num [1:720] 532 624 699 930 433 539 689 846 320 379 ...
>   ..$ year     : num [1:720] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 2 2 2 2 3 3 ...
>   ..$ RegionNum: num [1:720] 6 6 6 6 6 6 6 6 6 6 ...
>  $ 7:`data.frame':      1601 obs. of  7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 2 ...
>   ..$ lga      : Factor w/ 9 levels "Coolgardie ..",..: 1 2 3 4 5 6 7 8 9 1 ...
>   ..$ psn      : num [1:1601]  342  105  534 1352   85 ...
>   ..$ year     : num [1:1601] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 2 ...
>   ..$ RegionNum: num [1:1601] 7 7 7 7 7 7 7 7 7 7 ...
>  $ 8:`data.frame':      2880 obs. of  7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ lga      : Factor w/ 16 levels "Augusta-Mar..",..: 1 2 3 4 5 6 7 8 9 10 ...
>   ..$ psn      : num [1:2880]  294   66   85  188 1144 ...
>   ..$ year     : num [1:2880] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ RegionNum: num [1:2880] 8 8 8 8 8 8 8 8 8 8 ...
>  $ 9:`data.frame':      2694 obs. of  7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ lga      : Factor w/ 15 levels "Brookton (S)",..: 1 2 3 4 5 6 7 9 8 10 ...
>   ..$ psn      : num [1:2694] 49 67 38 46 67 51 104 214 44 69 ...
>   ..$ year     : num [1:2694] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ RegionNum: num [1:2694] 9 9 9 9 9 9 9 9 9 9 ...
> 
> platform i386-pc-mingw32
> arch     i386           
> os       mingw32        
> system   i386, mingw32  
> status                  
> major    2              
> minor    0.1            
> year     2004           
> month    11             
> day      15             
> language R
> 
> Tom Mulholland
> Senior Demographer
> Spatial Information and Research
> State and Regional Policy
> Department for Planning and Infrastructure
> Perth, Western Australia
> +61 (08) 9264 7936
> 



