[R] Beginner help with retrieving frequency and transforming a matrix

Sean MacEachern sean.maceachern at ARS.USDA.GOV
Fri Mar 28 15:20:33 CET 2008


Hi All,

Just hoping someone can give me a hand with a problem...

I have a dataframe (DF) with about 5 million entries that looks something
like the following:

>DF
    ID  Cl Co  Brd    Ind A AB  AB
1  S-3 IND  A BR_F BR_F01 1  0   0
2  S-3 IND  A BR_F BR_F01 1  0   0
3  S-3 IND  A BR_F BR_F01 1  0   0
4  S-3 IND  A BR_F BR_F01 1  0   0
5  S-3 IND  A BR_F BR_F01 1  0   0
6  S-3 IND  A BR_F BR_F01 0  1   0
7  S-3 IND  A BR_F BR_F02 0  0   1
8  S-3 IND  A BR_F BR_F02 0  1   0
9  S-3 IND  A BR_F BR_F02 1  0   0
10 S-3 IND  A BR_F BR_F02 1  0   0
11 S-3 IND  A BR_F BR_F02 0  1   0
12 S-3 IND  A BR_F BR_F02 1  0   0

I am interested in retrieving the frequency of A for everything with the
same Ind code.

I have initially created a column called 'frq' that holds the A frequency
for each individual row:


>DF$frq=apply(DF,1,function(x) if(x[6]==1)1 else if (x[7]==1)0.5 else 0)

>DF

    ID  Cl Co  Brd    Ind A AB  AB  frq
1  S-3 IND  A BR_F BR_F01 1  0   0   1
2  S-3 IND  A BR_F BR_F01 1  0   0   1
3  S-3 IND  A BR_F BR_F01 1  0   0   1
4  S-3 IND  A BR_F BR_F01 1  0   0   1
5  S-3 IND  A BR_F BR_F01 1  0   0   1
6  S-3 IND  A BR_F BR_F01 0  1   0  0.5
7  S-3 IND  A BR_F BR_F02 0  0   1   0
8  S-3 IND  A BR_F BR_F02 0  1   0  0.5
9  S-3 IND  A BR_F BR_F02 1  0   0   1
10 S-3 IND  A BR_F BR_F02 1  0   0   1
11 S-3 IND  A BR_F BR_F02 0  1   0  0.5
12 S-3 IND  A BR_F BR_F02 1  0   0   1
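
With around 5 million rows, the row-wise apply() above is likely to be slow, since it converts the whole data frame to a character matrix first. A minimal vectorised sketch of the same rule follows. The toy data and the column names AA, AB and BB are assumptions made for illustration; they stand in for columns 6 to 8 of the real DF.

```r
# Toy data mirroring the twelve rows shown above.
# AA, AB and BB are hypothetical names for columns 6, 7 and 8 of DF.
DF <- data.frame(
  ID  = "S-3",
  Ind = rep(c("BR_F01", "BR_F02"), each = 6),
  AA  = c(1, 1, 1, 1, 1, 0,  0, 0, 1, 1, 0, 1),
  AB  = c(0, 0, 0, 0, 0, 1,  0, 1, 0, 0, 1, 0),
  BB  = c(0, 0, 0, 0, 0, 0,  1, 0, 0, 0, 0, 0)
)

# Same rule as the apply() call: 1 if column 6 is 1, 0.5 if column 7 is 1,
# otherwise 0. ifelse() works on whole columns at once.
DF$frq <- ifelse(DF$AA == 1, 1, ifelse(DF$AB == 1, 0.5, 0))
```

On the real data the same expression could be written with positions, for example DF[[6]] and DF[[7]], if the columns do not have unique names.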

I've created a new DF that contains the info I'm interested in:

>DF2 = cbind(DF[1],DF[5],DF[9])

>DF2

    ID    Ind frq
1  S-3 BR_F01 1 
2  S-3 BR_F01 1 
...
...
...
11 S-3 BR_F02 0.5
12 S-3 BR_F02 1 


Is there a function I can call to calculate the mean frequency of A (i.e. the
mean of frq) for all individuals with the same Ind code, so that the DF
(matrix) looks something like the following? (I saw something in a tutorial
based on t-tests that I thought would work, but no joy...)


>NewDF

    ID    Ind frq
1  S-3 BR_F01 0.9167
2  S-3 BR_F02 0.6667
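
One possible way to get this kind of summary is aggregate() from base R, taking the mean of frq within each ID/Ind combination. This is only a sketch on toy data rebuilt from the twelve rows shown earlier; it has not been tried on the full 5 million rows.

```r
# Toy version of DF2, using the frq values from the table above.
DF2 <- data.frame(
  ID  = "S-3",
  Ind = rep(c("BR_F01", "BR_F02"), each = 6),
  frq = c(1, 1, 1, 1, 1, 0.5,  0, 0.5, 1, 1, 0.5, 1)
)

# Mean of frq for every ID/Ind combination.
# The names given in 'by' become the column names of the result.
NewDF <- aggregate(DF2["frq"],
                   by  = list(ID = DF2$ID, Ind = DF2$Ind),
                   FUN = mean)
print(NewDF)
```

The result is a data frame with one row per ID/Ind pair and the columns ID, Ind and frq.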
 

Further, is there a way to then transform the matrix to look something like
the following?


>FinalDF

Ind        S-3  S-4  S-5 ...  S-1000000
BR_F01  0.9167  0.5    1         0.6667
BR_F02  0.6667  0.2    1            0.5
...
...
...
BR_Z98     0.5    1  0.3              1
BR_Z99       1  0.6    1            0.5
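
A layout like this, with one row per Ind and one column per ID, can be sketched with tapply(), which computes the group means and arranges them as a matrix in one step. The S-4 rows and all frq values below are invented toy values, used only to show the shape of the result.

```r
# Toy long-format data: two IDs, two Ind codes, invented frq values.
DF2 <- data.frame(
  ID  = rep(c("S-3", "S-4"), each = 4),
  Ind = rep(c("BR_F01", "BR_F02"), times = 4),
  frq = c(1, 0.5, 1, 1,  0.5, 0, 0.5, 1)
)

# Mean frq per Ind (rows) and ID (columns), returned as a matrix.
FinalDF <- tapply(DF2$frq, list(Ind = DF2$Ind, ID = DF2$ID), mean)
print(FinalDF)
```

Any Ind/ID combination with no rows in the data comes out as NA in the matrix. With a very large number of distinct IDs the matrix could become big, so memory use may need checking on the real data.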



Thanks in advance for any help you can offer, and please let me know if
there is any further information I can provide.

Sean


> sessionInfo()
R version 2.6.0 (2007-10-03)
i386-apple-darwin8.10.1

locale:
en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


