Ding, Yuan Chun
ycding at coh.org
Fri Sep 5 02:19:47 CEST 2014
Hi Sarah,
Thank you very much for your quick response.
I checked the dist() function. It calculates the distance between two samples, each measured on a number of variables:
     variable1  variable2  variable3  variable4  ....
X            3          5          6          7
Y            4          8          9         10
So it is easy to calculate the distance between X and Y.
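For the toy table above, a minimal sketch of that calculation could look like this (the object names `samples` and `d_xy` are only illustrative):

```r
# Two samples measured on four variables (toy numbers from the table above)
samples <- rbind(X = c(3, 5, 6, 7),
                 Y = c(4, 8, 9, 10))
colnames(samples) <- paste0("variable", 1:4)

# dist() computes distances between the ROWS of a matrix;
# "euclidean" is the default method and is spelled out here for clarity
d_xy <- dist(samples, method = "euclidean")
print(d_xy)
```

Here the result is the square root of the summed squared differences, (3-4)^2 + (5-8)^2 + (6-9)^2 + (7-10)^2.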
But in my study, X is a group of 20 samples and Y is another group of 30 samples, so I need to calculate the distance between group X and group Y.
I think I need to calculate a mean for each group and then use the dist() function. I tried to find an R package that does this.
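One possible sketch of that idea in base R, without an extra package, using the data excerpt quoted below. It assumes that genes are in rows, samples are in columns, and that the population label is the column name without its ".<sample number>" suffix; the names `expr`, `pop`, `pop_means` and `pop_dist` are only illustrative.

```r
# Excerpt of the expression data quoted below: genes in rows, samples in columns
expr <- matrix(
  c(5.38194, 4.06191, 4.88044, 5.60383, 6.23101, 6.53738, 4.80336, 5.86136,
    5.15155, 4.29441, 4.59131, 4.90026, 4.62908, 4.48712, 4.73039, 4.46208,
    4.22396, 4.14451, 4.41465, 3.93179, 4.89638, 4.66109, 4.20918, 4.48107,
    12.1969, 12.4179, 10.9786, 11.7659, 11.405,  11.7594, 11.1757, 11.8128),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("7A5", "A1BG", "A1CF", "A26C3"),
                  c(paste0("pop1.", 1:4), paste0("pop2.", 1:4)))
)

# Assumption: "pop1.3" means sample 3 of population "pop1"
pop <- sub("\\.[0-9]+$", "", colnames(expr))

# Mean expression of each gene within each population (genes x populations)
pop_means <- sapply(split(seq_len(ncol(expr)), pop),
                    function(idx) rowMeans(expr[, idx, drop = FALSE]))

# dist() works on rows, so transpose to populations x genes.
# Dividing by sqrt(number of genes) turns sqrt(sum(d^2)) into sqrt(sum(d^2) / n),
# which matches the formula quoted below.
n_genes <- nrow(expr)
pop_dist <- dist(t(pop_means)) / sqrt(n_genes)
print(pop_dist)
```

With all 12 populations in the matrix, the same code should return the full table of pairwise distances, and the groups do not need to have the same number of samples.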
Thanks,
Ding
I'd probably start with ?dist
Sarah
>
>
>
> I want to calculate the Euclidean distance between 12 populations. Each population has 20 samples, and each sample is measured for 100 genes (these are microarray data; the numbers here are just examples).
> The equation I found is:
> distance = sqrt( [ sum_{i=1..n} (mean(xi) - mean(yi))^2 ] / n )
> where xi and yi are the expression of gene i over two populations with p and q samples, (x1, x2, ..., xp) and (y1, y2, ..., yq), and n is the number of genes.
> Part of the data is pasted below:
> row.names   pop1.1    pop1.2    pop1.3    pop1.4    pop2.1    pop2.2    pop2.3    pop2.4
> 7A5         5.38194   4.06191   4.88044   5.60383   6.23101   6.53738   4.80336   5.86136
> A1BG        5.15155   4.29441   4.59131   4.90026   4.62908   4.48712   4.73039   4.46208
> A1CF        4.22396   4.14451   4.41465   3.93179   4.89638   4.66109   4.20918   4.48107
> A26C3       12.1969   12.4179   10.9786   11.7659   11.405    11.7594   11.1757   11.8128
> How might one calculate these distances in R with this data structure?
>
>
> Thanks,
>
> Ding
>
