[R] R code for overlapping variables -- count

Leo Mada <leo.mada using syonic.eu>
Sun Jun 2 19:41:00 CEST 2024


Correcting a small glitch - see new code.

________________________________
From: Leo Mada <leo.mada using syonic.eu>
Sent: Sunday, June 2, 2024 8:34 PM
To: Shadee Ashtari <shadee.ashtari using gmail.com>
Cc: r-help using r-project.org <r-help using r-project.org>
Subject: [R] R code for overlapping variables -- count

Dear Shadee,

Suppose you have a data.frame with the following columns:

n = 100 # population size
x = data.frame(
      Sex     = sample(c("M", "F"), n, replace = TRUE),
      Country = sample(c("AA", "BB", "US"), n, replace = TRUE),
      Income  = as.factor(sample(1:3, n, replace = TRUE))
)

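# Dummy variable: one entry per row, so length() counts the rows in each group
ONE = rep(1, nrow(x))

# corrected: count the rows for each Sex / Income / Country combination
r = aggregate(ONE ~ Sex + Income + Country, data = x, FUN = length)
# Reorder the columns and rename the count column
r = r[, c("Country", "Income", "Sex", "ONE")]
names(r)[4] = "Count"
print(r)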

It is possible to write simpler code if you need only the particular combination of variables that you specified in your mail, but this is the more general approach.
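As a rough sketch of what the simpler code could look like: the combination below (Sex "F", Country "US", Income "3") is only a hypothetical example, since the original question is not quoted in this message, and the fixed seed is added purely for reproducibility. The second part uses table() as an alternative way to get all counts at once.

```r
set.seed(1)  # assumption: fixed seed, only so that the example is reproducible
n = 100
x = data.frame(
      Sex     = sample(c("M", "F"), n, replace = TRUE),
      Country = sample(c("AA", "BB", "US"), n, replace = TRUE),
      Income  = as.factor(sample(1:3, n, replace = TRUE))
)

# Count for one specific (hypothetical) combination:
# summing a logical vector counts its TRUE values
n_sel = sum(x$Sex == "F" & x$Country == "US" & x$Income == "3")
print(n_sel)

# All combinations at once: cross-tabulate, then flatten to long format
r2 = as.data.frame(table(x[, c("Country", "Income", "Sex")]),
                   responseName = "Count")
print(r2)
```

Note that, unlike aggregate(), the table() version also lists combinations that do not occur in the data, with a Count of 0.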

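Note: you may want to use "sum" instead of "length", e.g. if your data are already aggregated and have a column specifying the number of individuals in each category. In that case, put that column on the left-hand side of the formula instead of the dummy variable.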
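A minimal sketch of the "sum" variant, assuming a pre-aggregated data.frame; the data frame y, its values and the column name N are made up for illustration:

```r
# Hypothetical pre-aggregated data: each row already carries a group size N
y = data.frame(
      Sex     = c("M", "F", "M", "F", "M"),
      Country = c("US", "US", "US", "AA", "US"),
      Income  = factor(c(1, 1, 1, 2, 3)),
      N       = c(10, 5, 7, 3, 2)
)

# Add up the group sizes instead of counting the rows
r = aggregate(N ~ Sex + Income + Country, data = y, FUN = sum)
r = r[, c("Country", "Income", "Sex", "N")]
names(r)[4] = "Count"
print(r)
```

Here the two rows for M / 1 / US are combined into a single row with Count = 17, whereas "length" would have reported 2 for that group.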


Hope this helps,

Leonard

