[R] List of tables rather than an extra dimension in the table or (l)apply(xtabs)
Mulholland, Tom
Tom.Mulholland at dpi.wa.gov.au
Tue Mar 22 07:18:25 CET 2005
I wrote a function that creates the crosstab and removes the extraneous rows, and then used lapply:
aestabs <- function(x) {
  # Cross-tabulate population by LGA and year
  temp <- xtabs(psn ~ lga + year, x)
  # Drop LGAs with no population; drop = FALSE keeps a one-row result as a table
  temp[rowSums(temp) != 0, , drop = FALSE]
}
eas2 <- lapply(split(ipi$eas, ipi$eas$RegionNum), aestabs)
It's not really reusable. I guess I could pass a formula and work out a better method of subsetting dimensions (where certain factor levels are not used). But maybe someone has an elegant method they could share.
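
One possible way to make this reusable, as a minimal sketch rather than a definitive answer: pass the formula in and let xtabs() itself discard unused levels through its drop.unused.levels argument. The helper name xtabs_by and the toy data frame below are invented for illustration; they are not from the real ipi$eas data.

```r
# Hypothetical helper (not from the original post): one cross-tab per
# value of `by`, keeping only the factor levels present in each piece.
xtabs_by <- function(fml, data, by) {
  lapply(split(data, data[[by]]), function(d) {
    xtabs(fml, data = d, drop.unused.levels = TRUE)
  })
}

# Invented toy data standing in for ipi$eas
toy <- data.frame(
  RegionNum = c(1, 1, 1, 1, 2, 2),
  lga  = factor(c("A", "A", "B", "B", "C", "C")),
  year = c(1991, 1992, 1991, 1992, 1991, 1992),
  psn  = c(10, 12, 20, 22, 30, 33)
)

tabs <- xtabs_by(psn ~ lga + year, toy, "RegionNum")
sapply(tabs, nrow)   # number of subregions per region
```

Unlike the rowSums() filter, this keeps a level that has rows in the piece but happens to sum to zero, which may or may not be what is wanted.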
Tom
> -----Original Message-----
> From: Mulholland, Tom
> Sent: Tuesday, 22 March 2005 1:35 PM
> To: R-Help (E-mail)
> Subject: [R] List of tables rather than an extra dimension in the table or (l)apply(xtabs)
>
>
> I'm not sure how best to explain what I am after, but here
> goes. I have a data frame with two geographical factors: one is
> the major region, the other is its component subregions.
>
> I am trying to process all the regions at the same time
> without using "for". So I need (I think) a list of
> matrices, each sized according to the number of
> subregions within its region.
>
> So is there a way of using lapply with xtabs, or is there a
> better way to achieve my desired output?
>
> Using the Titanic data as an example
>
> t1 <- as.data.frame(Titanic)
> t2 <- split(t1,t1$Class)
>
> # I would then drop any unused levels in the geography factors,
> # creating distinctly different data.frames (see end of message)
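
A minimal sketch of that level-dropping step, assuming a current R in which droplevels() is available (it postdates the R 2.0.1 reported at the end of this message; in older versions, calling factor() on each factor column does the same job). In the Titanic data every Age and Sex level still has rows within each Class, so only the Class factor shrinks here.

```r
t1 <- as.data.frame(Titanic)
t2 <- split(t1, t1$Class)

# Each piece still carries all four Class levels after split()
nlevels(t2$Crew$Class)

# Drop the levels that no longer occur in each piece
t2d <- lapply(t2, droplevels)
nlevels(t2d$Crew$Class)

# Rows with a zero count still exist, so "Child" survives for Crew
levels(t2d$Crew$Age)
```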
>
> > xtabs(Freq ~ Age + Sex + Class,t1)
> , , Class = 1st
>
>        Sex
> Age     Male Female
>   Child    5      1
>   Adult  175    144
>
> , , Class = 2nd
>
>        Sex
> Age     Male Female
>   Child   11     13
>   Adult  168     93
>
> , , Class = 3rd
>
>        Sex
> Age     Male Female
>   Child   48     31
>   Adult  462    165
>
> , , Class = Crew
>
>        Sex
> Age     Male Female
>   Child    0      0
>   Adult  862     23
>
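
For comparison, the three-dimensional table can itself be cut into a list of Age by Sex slices, one per Class. This is only a sketch of that idea; the names tab3, classes and slices are made up here. Every slice keeps all levels, so the empty Child row for Crew is still present.

```r
t1 <- as.data.frame(Titanic)
tab3 <- xtabs(Freq ~ Age + Sex + Class, t1)

# One Age-by-Sex slice per Class level, named by level
classes <- dimnames(tab3)$Class
slices <- lapply(setNames(classes, classes), function(k) tab3[, , k])

names(slices)
slices$Crew
```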
> Can I do something with t2 to produce a list which is in
> effect an Age by Sex crosstab, with one item for each value of
> Class? I would want to drop unused levels, so that the
> last part of the table is just
>
>        Sex
> Age     Male Female
>   Adult  862     23
>
> or in my case each item in the list has the same number of
> rows as there are subregions for that region.
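
One way this might be done for the Titanic example, sketched under the assumption that a level counts as "unused" when all of its counts are zero: the subset argument of xtabs() removes the zero-count rows, and drop.unused.levels then discards any level left without rows.

```r
t1 <- as.data.frame(Titanic)
t2 <- split(t1, t1$Class)

# subset removes zero-count rows first; drop.unused.levels then
# discards any factor level that no longer has rows
tabs <- lapply(t2, function(d) {
  xtabs(Freq ~ Age + Sex, data = d,
        subset = Freq > 0, drop.unused.levels = TRUE)
})

tabs$Crew        # only the Adult row remains
sapply(tabs, nrow)
```

Where the zero rows are simply absent from the data, as with subregions that belong to another region, the subset argument should not be needed.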
>
> List of 9
>  $ 1:`data.frame': 4009 obs. of 7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ lga      : Factor w/ 23 levels "Carnamah (S)",..: 1 2 3 4 5 6 7 8 9 10 ... # 23 subregions
>   ..$ psn      : num [1:4009] 71 336 26 84 30 133 904 385 99 110 ...
>   ..$ year     : num [1:4009] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ RegionNum: num [1:4009] 1 1 1 1 1 1 1 1 1 1 ...
>  $ 2:`data.frame': 720 obs. of 7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 2 2 2 2 3 3 ...
>   ..$ lga      : Factor w/ 4 levels "Broome (S)","De..",..: 1 2 3 4 1 2 3 4 1 2 ... # 4 subregions etc
>   ..$ psn      : num [1:720] 495 445 189 377 415 374 189 330 324 319 ...
>   ..$ year     : num [1:720] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 2 2 2 2 3 3 ...
>   ..$ RegionNum: num [1:720] 2 2 2 2 2 2 2 2 2 2 ...
>
> So these two items would produce
>
> > round(xtabs(psn ~ lga + agecomp,eas[[1]]),-2)
>                       agecomp
> lga                       0-4    5-9  10-14  15-19  20-24  25-29  30-34  35-39  40-44  45-49  50-54  55-59  60-64 65plus
>   Carnamah (S)            500    400    300    200    300    300    500    400    400    300    300    200    100    300
>   Carnarvon (S)          2800   3000   2600   2100   2400   2700   2800   2600   2400   2200   2000   1600   1300   2800
>   Chapman Valley (S)      300    400    300    200    200    300    300    300    300    400    400    300    200    300
>   Coorow (S)              700    700    600    200    300    600    700    600    500    500    400    400    300    500
>   Cue (S)                 200    200    100    100    200    200    300    200    200    200    200    100    100    100
>   Exmouth (S)             900   1000    800    600    700   1100   1100   1100   1100    800    700    500    400    700
>   Geraldton (C)          7700   7700   8100   8200   7200   7400   7500   7200   6900   6100   5400   4600   4300  12400
>   Greenough (S)          4700   5400   5500   4400   3100   3700   4800   5100   5200   4200   3500   2600   1900   3200
>   Irwin (S)              1000   1100   1000    600    600    900   1000   1200   1000    900    800    900    800   1800
>   Meekatharra (S)         800    700    600    600    900   1000    900    700    600    500    400    300    200    400
>   Mingenew (S)            300    300    200    100    200    200    300    300    200    200    200    200    100    200
>   Morawa (S)              400    500    400    400    200    400    500    400    300    300    300    300    200    500
>   Mount Magnet (S)        500    400    300    200    400    500    400    400    300    300    200    200    100    200
>   Mullewa (S)             600    600    800    400    400    500    500    400    300    300    300    300    200    400
>   Murchison (S)           100    100    100    100      0    100    100      0      0      0    100      0      0      0
>   Northampton (S)        1300   1300   1200    700    700    900   1200   1300   1200   1200   1000   1000    900   2000
>   Perenjori (S)           300    300    300    100    200    200    300    300    300    200    200    200    100    300
>   Sandstone (S)             0      0      0      0    100    100    100    100    100    100    100    100      0    100
>   Shark Bay (S)           300    300    200    200    200    300    400    400    400    300    300    300    200    600
>   Three Springs (S)       300    300    300    100    200    300    400    300    300    200    300    200    200    400
>   Upper Gascoyne (S)      100    200    200    100    100    100    100    100    100    100    100    100    100    100
>   Wiluna (S)              200    200    200    300    600    700    600    400    300    300    300    200    100    100
>   Yalgoo (S)              100    100    100      0    200    200    200    100    200    200    100    100    100    100
> > round(xtabs(psn ~ lga + agecomp,eas[[2]]),-2)
>                             agecomp
> lga                             0-4    5-9  10-14  15-19  20-24  25-29  30-34  35-39  40-44  45-49  50-54  55-59  60-64 65plus
>   Broome (S)                   5600   5400   4500   3900   4900   5800   6100   5500   4500   3700   2800   2000   1500   2200
>   Derby-West Kimberley (S)     4000   3900   3400   3100   3800   4000   3800   3100   2500   1900   1500   1200    900   1800
>   Halls Creek (S)              2100   2100   1700   1600   1800   1600   1400   1100   1000    900    700    600    400    800
>   Wyndham-East Kimberley (S)   3500   3300   2800   2300   2900   3500   3500   3000   2600   2100   1800   1300    800   1200
>
>  $ 3:`data.frame': 2130 obs. of 7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ lga      : Factor w/ 12 levels "Albany (C)","Br..",..: 1 2 3 4 5 6 7 8 9 10 ...
>   ..$ psn      : num [1:2130] 1107 21 63 167 115 ...
>   ..$ year     : num [1:2130] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ RegionNum: num [1:2130] 3 3 3 3 3 3 3 3 3 3 ...
>  $ 4:`data.frame': 5188 obs. of 7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ lga      : Factor w/ 29 levels "Beverley (S)",..: 1 2 3 4 5 6 7 8 9 10 ...
>   ..$ psn      : num [1:5188] 55 58 84 90 105 134 57 132 56 70 ...
>   ..$ year     : num [1:5188] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ RegionNum: num [1:5188] 4 4 4 4 4 4 4 4 4 4 ...
>  $ 5:`data.frame': 5400 obs. of 7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ lga      : Factor w/ 30 levels "Armadale (C)",..: 1 2 3 4 5 6 7 8 9 10 ...
>   ..$ psn      : num [1:5400] 2163 479 1824 865 749 ...
>   ..$ year     : num [1:5400] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ RegionNum: num [1:5400] 5 5 5 5 5 5 5 5 5 5 ...
>  $ 6:`data.frame': 720 obs. of 7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 2 2 2 2 3 3 ...
>   ..$ lga      : Factor w/ 4 levels "Ashburton (S)",..: 1 2 3 4 1 2 3 4 1 2 ...
>   ..$ psn      : num [1:720] 532 624 699 930 433 539 689 846 320 379 ...
>   ..$ year     : num [1:720] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 2 2 2 2 3 3 ...
>   ..$ RegionNum: num [1:720] 6 6 6 6 6 6 6 6 6 6 ...
>  $ 7:`data.frame': 1601 obs. of 7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 2 ...
>   ..$ lga      : Factor w/ 9 levels "Coolgardie ..",..: 1 2 3 4 5 6 7 8 9 1 ...
>   ..$ psn      : num [1:1601] 342 105 534 1352 85 ...
>   ..$ year     : num [1:1601] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 2 ...
>   ..$ RegionNum: num [1:1601] 7 7 7 7 7 7 7 7 7 7 ...
>  $ 8:`data.frame': 2880 obs. of 7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ lga      : Factor w/ 16 levels "Augusta-Mar..",..: 1 2 3 4 5 6 7 8 9 10 ...
>   ..$ psn      : num [1:2880] 294 66 85 188 1144 ...
>   ..$ year     : num [1:2880] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ RegionNum: num [1:2880] 8 8 8 8 8 8 8 8 8 8 ...
>  $ 9:`data.frame': 2694 obs. of 7 variables:
>   ..$ sex      : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
>   ..$ age      : Factor w/ 18 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ lga      : Factor w/ 15 levels "Brookton (S)",..: 1 2 3 4 5 6 7 9 8 10 ...
>   ..$ psn      : num [1:2694] 49 67 38 46 67 51 104 214 44 69 ...
>   ..$ year     : num [1:2694] 1991 1991 1991 1991 1991 ...
>   ..$ agecomp  : Factor w/ 14 levels "0-4","5-9","10-14",..: 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ RegionNum: num [1:2694] 9 9 9 9 9 9 9 9 9 9 ...
>
> platform i386-pc-mingw32
> arch i386
> os mingw32
> system i386, mingw32
> status
> major 2
> minor 0.1
> year 2004
> month 11
> day 15
> language R
>
> Tom Mulholland
> Senior Demographer
> Spatial Information and Research
> State and Regional Policy
> Department for Planning and Infrastructure
> Perth, Western Australia
> +61 (08) 9264 7936