[R] Odp: how to generate data set with different length and calculate the mean?
Petr PIKAL
petr.pikal at precheza.cz
Mon Feb 1 13:44:17 CET 2010
Hi
I have no idea how you could do exactly what you want. I only recommend that
you use a list instead of a matrix, as a list can hold objects of different
sizes.
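A minimal sketch of the list idea, not a definitive solution. It assumes every target size is even and at least 8, and that new values continue the alternating N(0,1), N(5,1) pattern of the original code. The posted 'size' has only 8 entries for 10 data sets, so the last two values below are hypothetical; 'datalist' and the seed are also just illustrative choices.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
th <- c(0, 5, 1)                 # means of the two normals and the common sd

# Original 10 x 8 matrix, as in the question: values alternate N(0,1), N(5,1)
data <- matrix(0, 10, 8)
for (i in 1:10) {
  data[i, ] <- rnorm(8, mean = rep(th[1:2], 8 / 2), sd = th[3])
}

# ASSUMPTION: the last two sizes (8, 8) are made up, because the posted
# vector has only 8 entries for 10 data sets.
size <- c(20, 10, 8, 14, 16, 12, 8, 80, 8, 8)

# Keep the first 8 values of each row and append (size - 8) new draws,
# continuing the alternating pattern. Nothing is added when size is 8.
datalist <- lapply(seq_len(nrow(data)), function(i) {
  extra <- size[i] - ncol(data)
  new <- if (extra > 0) {
    rnorm(extra, mean = rep(th[1:2], extra / 2), sd = th[3])
  } else {
    numeric(0)
  }
  c(data[i, ], new)
})

sapply(datalist, length)         # one length per data set, matching 'size'
```

Each element of 'datalist' is a plain numeric vector, so data sets of different length can sit side by side without padding.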
I am not sure if this is the most elegant way, but you can make your matrix
a data frame
ddd <- as.data.frame(t(data))  # transpose first: each data set is a row of 'data'
and then use this
lapply(ddd, function(x) unlist(list(x)))
to get a list of vectors, one per data set.
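Once each data set is a vector in a list, the two samples can be separated by position, since the values alternate between the two distributions. The following is only a sketch under that assumption: the helper name 'mean_diff_pooled', the example lengths, and the sign of the difference (N(5,1) mean minus N(0,1) mean) are illustrative choices, not something taken from the question.

```r
# Difference of sample means and pooled SD for one vector whose values
# alternate between two distributions (odd positions = first sample,
# even positions = second sample). Hypothetical helper for illustration.
mean_diff_pooled <- function(x) {
  idx <- seq_along(x)
  x1 <- x[idx %% 2 == 1]         # odd positions, here N(0,1)
  x2 <- x[idx %% 2 == 0]         # even positions, here N(5,1)
  n1 <- length(x1)
  n2 <- length(x2)
  # Pooled SD: weighted average of the two sample variances, then sqrt
  sp <- sqrt(((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2))
  c(diff = mean(x2) - mean(x1), pooled_sd = sp)
}

set.seed(2)                      # arbitrary seed
sizes <- c(20, 10, 8)            # hypothetical lengths, all even
datalist <- lapply(sizes, function(n) {
  rnorm(n, mean = rep(c(0, 5), n / 2), sd = 1)
})

# One row per data set, columns 'diff' and 'pooled_sd'
res <- t(sapply(datalist, mean_diff_pooled))
res
```

As a quick hand check, mean_diff_pooled(c(1, 4, 3, 6)) splits the vector into (1, 3) and (4, 6), giving a difference of 3 and a pooled SD of sqrt(2).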
Regards
Petr
r-help-bounces at r-project.org wrote on 01.02.2010 03:46:34:
>
> Hello,
>
> This may be a rare question. I am struggling to solve it. I really
> appreciate any help or suggestions. Thanks a lot in advance!
>
>
> I put my questions between the code to make it clear. The problem I have is:
> I generated 10 data sets with 8 data for each set. Now I want to change the
> number of data in each dataset according to a vector 'size' (as follows),
> that is, each new dataset contains a different number of data. How can I do
> it? After generating the new datasets, how can I separate the data from the
> two distributions and calculate the sample mean? Thanks a lot.
>
>
>
> # generate 10 data sets; each data set includes 8 samples: 4 from N(0, 1)
> # and 4 from N(5, 1)
> data<- matrix(0,10,8)
> th <- c(0, 5, 1)
> for(i in 1:10){
> data[i,] <- rnorm(8,mean= rep(th[1:2],8/2),sd=th[3])
> }
>
> # change the number of samples for each data set. e.g. the first dataset
> # needs to increase to 20: the first 8 keep the same, add another 12 samples
> # (6 from N(0,1) and the other 6 from N(5,1)); the second dataset needs to
> # increase to 10: keep the first 8 the same, generate another 2 (one from
> # N(0,1) and the other one from N(5,1)); the third data set does not need to
> # change, etc.
>
> size=c(20, 10, 8, 14, 16, 12, 8, 80)
>
>
> # Since each data set changes to a different size and adds a different number
> # of data, for each dataset how can I calculate the difference of the sample
> # mean from N(0,1) and the sample mean from N(5,1), and the pooled standard
> # deviation of the two samples? Two difficulties: each new dataset includes a
> # different number of data; another difficulty: when I generated the data, two
> # successive values are from different normal distributions. How can I
> # separate them and calculate the average for each sample and the pooled
> # standard deviation?
>
>
>