[R] Random Relabelling
I KNEW there was a better way!
The code below follows John's suggestion, but without the loop. It runs quickly for me.
Jeremy
## Generate sample data
n <- 4000
rep <- 1000
rate <- rnorm(n, mean = 15, sd = 2) / 100000 # Mortality rates around 15/100k
## Create an empty matrix with appropriate dimensions
permutations <- matrix(ncol = n, nrow = rep)
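## Use apply() to resample: each call draws one random permutation of the rates
permutations <- apply(permutations, 1, function(x) {
  ## The function body is missing from the post; sample() is an assumed
  ## reconstruction, following the sample() call in John's loop
  sample(rate, size = n, replace = FALSE)
})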
## Look at the matrix
dim(permutations)
head(permutations)
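## Find the mean at each location across the permutations
## (apply() above returns one permutation per column, so these are row means)
means <- rowMeans(permutations)
means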
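
A quick sanity check of the approach above, as a minimal sketch rather than a definitive implementation. It assumes each permutation is drawn with sample(rate); the names n_loc, n_perm, perms and loc_means are illustrative, the sizes are kept small so it runs instantly, and the seed is only there for reproducibility. It uses replicate() in place of apply() on an empty matrix, confirms that every column is a reordering of the observed rates, and compares the spread of the per-location means with the spread of the rates themselves.

```r
## Illustrative sizes; the real problem would use 4000 locations and 1000 permutations
set.seed(1)                      # assumption: seed chosen only for reproducibility
n_loc  <- 50
n_perm <- 200
rate <- rnorm(n_loc, mean = 15, sd = 2) / 100000

## One permutation per column: the result is an n_loc x n_perm matrix
perms <- replicate(n_perm, sample(rate))

## Every column should contain exactly the observed rates, just reordered
all_permuted <- all(apply(perms, 2, function(x) identical(sort(x), sort(rate))))
all_permuted

## Mean across permutations at each location
loc_means <- rowMeans(perms)

## Spread of the raw rates versus spread of the per-location means
c(sd_rate = sd(rate), sd_loc_means = sd(loc_means))
```

Because every location is equally likely to receive any of the observed rates, the per-location means all settle near mean(rate) as the number of permutations grows, so they vary far less than the rates do.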
> > There is probably a better way to do this but a for loop like this should
> > work. You would just need to change the numbers to yours and then add on
> > the locations
> >
> > =========================================================
> >
> > scores <- 1:5
> > mydata <- matrix(data = NA, nrow = 5, ncol = 10)
> >
> > for (i in 1:10) {
> >   mydata[, i] <- sample(scores, 5, replace = FALSE)
> > }
> >
> > =========================================================
> >
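
One way the loop above might be adapted by "changing the numbers" and "adding on the locations", as a sketch under stated assumptions. The data frame obs, its loc_id column, and the simulated rates are placeholders for the real data, and the sizes are reduced so the example runs instantly.

```r
## Sketch only: obs, loc_id and the simulated rates stand in for the real data
set.seed(42)                     # assumption: seed chosen only for reproducibility
n_loc  <- 20                     # would be 4000 in the real problem
n_perm <- 10                     # would be 1000 in the real problem
obs <- data.frame(loc_id = seq_len(n_loc),
                  rate   = rnorm(n_loc, mean = 15, sd = 2) / 100000)

## One row per location, one column per permutation
relabelled <- matrix(NA_real_, nrow = n_loc, ncol = n_perm)
for (i in seq_len(n_perm)) {
  relabelled[, i] <- sample(obs$rate, n_loc, replace = FALSE)
}

## Attach the fixed location IDs to the permuted rates
result <- cbind(loc_id = obs$loc_id, relabelled)
dim(result)
```

The locations stay fixed in the first column while each remaining column holds one random relabelling of the observed rates.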
> > I have a map of Iowa with 4000 locations. At each location, I have a
> > cancer mortality rate. I need to test my null hypothesis: that the spatial
> > distribution of the mortality rates is random. For this test, I need to
> > establish a spatial reference distribution.
> >
> > My reference distribution will be created by some random relabelling
> > algorithm. The 4000 locations would remain fixed, but the observed
> > mortality rates would be randomly redistributed. Then, I want 1000
> > permutations of the same algorithm. For each of those 1000 times, I would
> > record the redistributed mortality rate at each location. Then, I would
> > calculate the mean of the 1000 points. The result would be a spatial
> > reference distribution with a mean value of the random permutations at each
> > of the 4000 locations.
> >
> > Can you explain this a bit more? At the moment I don't see what you are
> > trying to achieve. "calculate the mean of the 1000 values at each of the
> > 4000 points" does not seem to make sense.
