[R] evaluating a function on an array of values
Faheem Mitha
faheem at email.unc.edu
Wed May 3 01:11:23 CEST 2000
Dear R people,
I recently spent some time trying to do the following. I wrote a function
called meanvar. This function has four arguments and returns a vector
of two values.
It looks like meanvar(r,t,type=c("interval","nominal"),n = 10000).
I wanted to write a routine to return to me the values of this function
for a range of values of r and t, specifically r from 1 to 5, t from 2 to
6, for fixed values of type (say "int"), and n the default. However, even
for(r in 1:5) meanvar(r,2,"int")
does not work, and I have no idea why.
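
For illustration, here is a minimal sketch of evaluating a two-argument function over the grid r = 1..5, t = 2..6. The function f is a hypothetical stand-in for meanvar (it simply returns two numbers), and the names grid, vals and res are likewise made up for the example. Note that at top level a call inside a for loop is evaluated but its value is not auto-printed, so the sketch either stores the results or wraps the call in print().

```r
# Hypothetical stand-in for meanvar: any function of (r, t) returning two numbers.
f <- function(r, t) c(r + t, r * t)

r_vals <- 1:5
t_vals <- 2:6

# One row per (r, t) combination; r varies fastest.
grid <- expand.grid(r = r_vals, t = t_vals)

# mapply calls f once per row; the result is a 2 x 25 matrix.
vals <- mapply(f, grid$r, grid$t)

# Keep each pair of returned values next to the (r, t) that produced it.
res <- cbind(grid, first = vals[1, ], second = vals[2, ])

# Inside a for loop the value must be printed (or stored) explicitly.
for (r in r_vals) print(f(r, 2))
```

Storing the results in a data frame keeps each pair of values next to the (r, t) that produced it.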
I include the function below.
Some words of explanation may be in order. Suppose there are t people in a
room. Each gives one of the values 0, 1, ..., r, with equal probability. Call
these votes. Call these random variables C_i, where i = 1, ..., t. We write
DR (Disagreement Rate) = \sum_{i < j} |C_i - C_j| / (r {t \choose 2}),
i.e. the scaled sum over all possible pairwise differences of the votes. Call
AR (Agreement Rate) = 1 - DR. This is the "interval" case (not my name).
The nominal case is similar. In this case AR = \sum_{i < j}
\delta(C_i,C_j) / (r {t \choose 2}), where \delta(x,y) is 1 if x = y and 0
otherwise.
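
As a small worked example of the two definitions as written above (including the factor r in the nominal case), take t = 3 votes (1, 1, 0) with r = 2. The pairwise absolute differences are 0, 1, 1, so DR = 2/(2*3) = 1/3 and the interval AR is 2/3; exactly one pair matches, so the nominal AR is 1/(2*3) = 1/6. The helper ar_one below is hypothetical and only meant to mirror the formulas; it is not part of meanvar.

```r
# Hypothetical helper: AR for a single vector of votes v with scores 0..r,
# following the formulas as they are written in the text.
ar_one <- function(v, r, type = c("interval", "nominal")) {
  type <- match.arg(type)
  idx <- combn(length(v), 2)      # columns are the index pairs i < j
  a <- v[idx[1, ]]
  b <- v[idx[2, ]]
  denom <- r * ncol(idx)          # r * choose(t, 2)
  switch(type,
         interval = 1 - sum(abs(a - b)) / denom,
         nominal  = sum(a == b) / denom)
}

v <- c(1, 1, 0)                   # t = 3 votes, r = 2
ar_interval <- ar_one(v, r = 2, type = "interval")
ar_nominal  <- ar_one(v, r = 2, type = "nominal")
print(c(interval = ar_interval, nominal = ar_nominal))
```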
What the function meanvar does is estimate E(AR) and Var(AR) by
simulation.
In the unlikely event you have actually waded through all this turgid
nonsense, I would also like to ask whether the function below can be
improved. It appears to work and agrees with the special cases I have
managed to work out (Case 1: t=2; Case 2: r=1). However, the sub-functions
delta and sumdiff use loops, and these loops are evaluated n times, so
this can greatly affect the speed. These sub-functions could probably be
written better. Any other suggestions for general improvement would be greatly
appreciated.
Sincerely, Faheem Mitha.
***************************************************************************
#Functions for estimating mean and variance of AR using simulation:
meanvar <- function(r, t, type = c("interval", "nominal"), n = 10000)
{
    # Possible score values are 0,1,...,r (r should be at least 1).
    # Number of coders is t (note this must be at least 2, else fn will not work).
    # n is number of trials. The higher n, the better the precision of mean and var.
    # Taking n less than 1 of course makes no sense. At least 10000 recommended.
    type <- match.arg(type)

    # is.integer() tests the storage type rather than whether the value is a
    # whole number (it is TRUE for r taken from 1:5 but FALSE for r = 1), so
    # the checks below compare each argument with its rounded value instead.
    if((r != round(r)) || (r < 1)) #check for sensible values of r
        stop("r must be a positive integer")
    if((t != round(t)) || (t < 2)) #check for sensible values of t
        stop("t must be a positive integer and at least 2")
    if(missing(n)) #if n is not given, the default n=10000 is used
    {
        warning("value of n missing, using default n=10000")
    }
    if((n != round(n)) || (n < 1)) #check for sensible values of n
        stop("n must be a positive integer")

    x <- 0:r #x is possible values of votes
    votes <- sample(x, n*t, replace=TRUE) #creating random vector of votes
    dim(votes) <- c(t, n) #making the votes vector into a matrix (one column per trial)

    # Need function to compute sum of all pairwise absolute differences of
    # the elements of a vector. Note: I am scaling sum appropriately to
    # give sample values of DR (Disagreement Rate) before returning
    # result. This function is used for the interval scale case.
    sumdiff <- function(v)
    {
        llen <- length(v) - 1
        summer <- 0 #initialising sum
        for(i in 1:llen) #using diff with lag i to make sum (using loop)
        {
            summer <- sum(abs(diff(v, lag = i))) + summer
        }
        return(summer/(r*choose(length(v), 2))) #scaling sum before returning result
    }

    # Need function to compute sum of all pairwise deltas of the elements
    # of a vector. Recall that delta(x,y) is defined as 1 if x=y and 0 otherwise.
    # Note: as written, the ifelse() below scores a pair 0 when the two votes
    # are equal and 1 when they differ.
    delta <- function(v)
    {
        llen <- length(v) - 1
        summer <- 0 #initialising sum
        for(i in 1:llen) #using diff with lag i to make sum (using loop)
        {
            summer <- sum(ifelse(diff(v, lag = i) == 0, 0, 1)) + summer
        }
        return(summer/(r*choose(length(v), 2))) #scaling sum before returning result
    }

    # One value per trial; apply() is called once and reused for mean and var.
    # (var(1 - x) equals var(x), so the interval variance is unchanged.)
    ar <- switch(type,
                 interval = 1 - apply(votes, 2, sumdiff),
                 nominal  = apply(votes, 2, delta))
    result <- c(mean(ar), var(ar))
    result # Returning values of mean and variance as vector
}
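
One possible way to avoid the per-column loops, sketched under the assumption that votes is the t x n matrix built inside meanvar: form all index pairs i < j once with combn(), subtract the corresponding rows of the whole matrix, and sum each column. The helper pair_sums and its score argument are hypothetical names, and only the interval case is shown.

```r
# Hypothetical helper, not part of the original function.
# votes: t x n matrix with one column per simulated trial.
# score: function applied elementwise to the pairwise differences.
pair_sums <- function(votes, score) {
  idx <- combn(nrow(votes), 2)                  # all index pairs i < j
  d <- votes[idx[1, ], , drop = FALSE] -
       votes[idx[2, ], , drop = FALSE]          # choose(t, 2) x n matrix
  colSums(score(d))                             # one sum per trial
}

set.seed(1)
r <- 3; t <- 4; n <- 1000
votes <- matrix(sample(0:r, n * t, replace = TRUE), nrow = t)

# Interval-case DR for every trial, scaled as in sumdiff.
dr <- pair_sums(votes, abs) / (r * choose(t, 2))
print(c(mean = 1 - mean(dr), var = var(dr)))
```

The index pairs are computed once rather than once per trial, and the remaining work is done on whole matrices, which is usually where R is fastest.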