[R] R code for overlapping variables -- count
Rui Barradas
Sun Jun 2 19:40:51 CEST 2024
At 18:34 on 02/06/2024, Leo Mada via R-help wrote:
> Dear Shadee,
>
> If you have a data.frame with the following columns:
>
> n = 100  # population size
> x = data.frame(
>   Sex = sample(c("M","F"), n, replace = TRUE),
>   Country = sample(c("AA", "BB", "US"), n, replace = TRUE),
>   Income = as.factor(sample(1:3, n, replace = TRUE))
> )
>
> # Dummy variable
> ONE = rep(1, nrow(x))
>
> r = aggregate(ONE ~ Sex + Income + Country, length, data = x)
> r = r[, c("Country", "Income", "Sex")]
> print(r)
>
> It is possible to write simpler code if you need only the particular combination of variables (which you specified in your mail). But this is the more general approach.
>
> Note: you may want to use "sum" instead of "length", e.g. if you have a column specifying the number of individuals in that category.
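As a rough illustration of the "sum" variant mentioned above, here is a minimal sketch. The data frame `x2` and its count column `N` are made up for the example and do not come from the original question.

```r
# Hypothetical pre-aggregated data: each row already carries a count N
x2 <- data.frame(
  Sex     = c("M", "M", "F", "F", "M"),
  Country = c("AA", "AA", "AA", "US", "US"),
  N       = c(10, 5, 7, 3, 2)
)

# Add up the counts within each Sex x Country combination
r_sum <- aggregate(N ~ Sex + Country, data = x2, FUN = sum)
print(r_sum)
```

With "length" the result would instead be the number of rows per combination, which is only the right answer when every row stands for one individual.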
>
>
> Hope this helps,
>
> Leonard
Hello,
The following is simpler.
r2 <- xtabs(~ ., x) |> as.data.frame()
r2[-4L] # or r2[names(r2) != "Freq"]
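In case the counts themselves are needed, here is a self-contained sketch of the same idea that keeps the `Freq` column. The seed and the smaller sample size are assumptions made only to keep the example reproducible and short. One difference worth knowing: `xtabs` lists every combination of the observed levels, including those with a count of zero, whereas `aggregate` only returns combinations that occur in the data. The native pipe `|>` requires R 4.1 or later.

```r
set.seed(1)  # assumption: fixed seed, for reproducibility only
n <- 20      # assumption: small sample so empty cells are likely
x <- data.frame(
  Sex     = sample(c("M", "F"), n, replace = TRUE),
  Country = sample(c("AA", "BB", "US"), n, replace = TRUE),
  Income  = as.factor(sample(1:3, n, replace = TRUE))
)

# Cross-tabulate all columns; Freq holds the count per combination
r2 <- xtabs(~ ., x) |> as.data.frame()

# Same counts via aggregate(), using a helper column of ones
r <- aggregate(ONE ~ ., data = cbind(x, ONE = 1L), FUN = length)

# Drop the empty cells from the xtabs result to match aggregate()
r2_nonzero <- r2[r2$Freq > 0, ]

print(r2_nonzero)
```

If only the list of combinations is wanted, the `Freq` column can be dropped afterwards as shown above.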
Hope this helps,
Rui Barradas