[R] Finding a mean value of a variable holding a dummy variable fixed
Bill.Venables at csiro.au
Mon Mar 31 05:50:13 CEST 2008
Here is another way, starting with similar dummy data:
____________
PMData <-
  data.frame(PM = c("Thatcher", "Thatcher", "Thatcher", "Thatcher",
                    "Thatcher", "Thatcher", "Major", "Major", "Major",
                    "Major", "Major", "Major"),
             approval = c(3, 4, 5, 6, 7, 8, 6, 5, 4, 3, 2, 1))
PMData <- transform(PMData, Month = 1:nrow(PMData))  ## add the time variable
PM_average <- with(PMData, tapply(approval, PM, mean))
PM_span <- with(PMData, sapply(tapply(Month, PM, range),
                               function(x) structure(approval[Month[x]],
                                                     names = c("First", "Last"))))
____________
> rbind(mean = PM_average, PM_span)
      Major Thatcher
mean    3.5      5.5
First   6.0      3.0
Last    1.0      8.0
(I don't recall any Prime Minister called Johnson or Nixon, by the way...)
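
For comparison, here is a minimal sketch of the same summary built with split(), which sorts each group by Month rather than relying on Month matching the row number. The name PM_summary is only illustrative, and the data are the same dummy values as above.

```r
# Same dummy data as above, rebuilt so the sketch is self-contained.
PMData <- data.frame(
  PM = rep(c("Thatcher", "Major"), each = 6),
  approval = c(3, 4, 5, 6, 7, 8, 6, 5, 4, 3, 2, 1)
)
PMData$Month <- seq_len(nrow(PMData))

# One column of summary statistics per Prime Minister.
PM_summary <- sapply(split(PMData, PMData$PM), function(d) {
  d <- d[order(d$Month), ]            # sort within the group by time
  c(mean  = mean(d$approval),
    First = d$approval[1],            # rating in the earliest month
    Last  = d$approval[nrow(d)])      # rating in the latest month
})
PM_summary
```

This should work unchanged if the rows arrive in a different order, provided Month still records the true time sequence.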
Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary): +61 7 3826 7304
Mobile: +61 4 8819 4402
Home Phone: +61 7 3286 7700
mailto:Bill.Venables at csiro.au
http://www.cmis.csiro.au/bill.venables/
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Daniel Malter
Sent: Monday, 31 March 2008 1:22 PM
To: 'Alexander Ovodenko'; r-help at r-project.org
Subject: Re: [R] Finding a mean value of a variable holding a dummy variable fixed
I found a solution. It's probably not the easiest one, but it works. It
assumes that your data frame is ordered from earliest to latest record for
each president, but it can easily be adjusted if you want the order to
depend on a third column. The final vector "index" gives you the row
indices of the first record for each president. If you replace "min" by
"max" you get the last record instead of the first. You can then look up
the values by indexing with that vector:
## Sample data
president <- c("Johnson", "Johnson", "Johnson", "Johnson", "Johnson", "Johnson",
               "Nixon", "Nixon", "Nixon", "Nixon", "Nixon", "Nixon")
approval <- c(3, 4, 5, 6, 7, 8, 6, 5, 4, 3, 2, 1)
tapply(approval, president, mean)
## Find the index of the first row for each president; assumes ascending order
## of observations; change "min" to "max" to find the last record
presidents <- unique(president)
index <- integer(length(presidents))
for (i in seq_along(presidents))
  index[i] <- min(which(president == presidents[i]))
index
## Generate a table with the first approvals; building the data frame directly
## (rather than through cbind) keeps Approval numeric
first.approval <- data.frame(Index = index, President = president[index],
                             Approval = approval[index])
first.approval
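
As a possible alternative to the loop, duplicated() can pick out the first and last rows in one step each. This is only a sketch: it assumes, like the code above, that each president's records are contiguous and in time order, and the names first_idx, last_idx and first_last are made up for illustration.

```r
president <- rep(c("Johnson", "Nixon"), each = 6)
approval  <- c(3, 4, 5, 6, 7, 8, 6, 5, 4, 3, 2, 1)

# A row is the first for its president if that name has not occurred before it.
first_idx <- which(!duplicated(president))
# Scanning from the end instead marks the last row for each president.
last_idx  <- which(!duplicated(president, fromLast = TRUE))

# Assumes contiguous groups, so first_idx and last_idx list the
# presidents in the same order.
first_last <- data.frame(
  President = president[first_idx],
  First     = approval[first_idx],
  Last      = approval[last_idx]
)
first_last
```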
Cheers,
Daniel
-------------------------
cuncta stricte discussurus
-------------------------
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Alexander Ovodenko
Sent: Sunday, March 30, 2008 9:47 PM
To: r-help at r-project.org
Subject: [R] Finding a mean value of a variable holding a dummy variable fixed
I have time-series data on approval ratings of British Prime Ministers. The
prime ministers from Macmillan onward to today are coded as dummy
variables, and the approval ratings are entered for each month. I want to
know the mean approval rating of each Prime Minister in the
dataset and the approval rating during his/her first month and last month as
PM. What R code should I enter for these data? In other words, I want to hold
the dummy corresponding to each Prime Minister fixed at one and find
the first rating that PM has, the last rating s/he has, and the mean rating
s/he has. Thanks.
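
If the data really are stored as one 0/1 dummy column per Prime Minister, one way to get started might be to collapse the dummies into a single factor and summarise by that. The sketch below is only an illustration: the data frame ratings, its column names and its values are all hypothetical, and it assumes exactly one dummy equals one in each row.

```r
# Hypothetical layout: one 0/1 column per Prime Minister plus the rating.
ratings <- data.frame(
  Thatcher = c(1, 1, 1, 0, 0, 0),
  Major    = c(0, 0, 0, 1, 1, 1),
  approval = c(3, 4, 5, 6, 5, 4)
)
dummy_cols <- c("Thatcher", "Major")

# max.col() gives, for each row, the column index of the largest entry,
# which is the dummy that equals one.
which_pm <- max.col(as.matrix(ratings[dummy_cols]), ties.method = "first")
ratings$PM <- factor(dummy_cols[which_pm], levels = dummy_cols)

# Mean approval with each dummy "held fixed at one".
pm_mean <- tapply(ratings$approval, ratings$PM, mean)
pm_mean
```

Once the PM factor exists, the first and last ratings can be found by grouping on it in the same way.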
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.