[R] Beginner help with retrieving frequency and transforming a matrix
Sean MacEachern
sean.maceachern at ARS.USDA.GOV
Fri Mar 28 15:20:33 CET 2008
Hi All,
Just hoping someone can give me a hand with a problem...
I have a dataframe (DF) with about 5 million entries that looks something
like the following:
>DF
ID Cl Co Brd Ind A AB AB
1 S-3 IND A BR_F BR_F01 1 0 0
2 S-3 IND A BR_F BR_F01 1 0 0
3 S-3 IND A BR_F BR_F01 1 0 0
4 S-3 IND A BR_F BR_F01 1 0 0
5 S-3 IND A BR_F BR_F01 1 0 0
6 S-3 IND A BR_F BR_F01 0 1 0
7 S-3 IND A BR_F BR_F02 0 0 1
8 S-3 IND A BR_F BR_F02 0 1 0
9 S-3 IND A BR_F BR_F02 1 0 0
10 S-3 IND A BR_F BR_F02 1 0 0
11 S-3 IND A BR_F BR_F02 0 1 0
12 S-3 IND A BR_F BR_F02 1 0 0
I am interested in retrieving the frequency of A for everything with the
same Ind code.
I have initially created a column called 'frq' that holds the A frequency
for each individual row:
> DF$frq = apply(DF, 1, function(x) if (x[6] == 1) 1 else if (x[7] == 1) 0.5 else 0)
>DF
ID Cl Co Brd Ind A AB AB frq
1 S-3 IND A BR_F BR_F01 1 0 0 1
2 S-3 IND A BR_F BR_F01 1 0 0 1
3 S-3 IND A BR_F BR_F01 1 0 0 1
4 S-3 IND A BR_F BR_F01 1 0 0 1
5 S-3 IND A BR_F BR_F01 1 0 0 1
6 S-3 IND A BR_F BR_F01 0 1 0 0.5
7 S-3 IND A BR_F BR_F02 0 0 1 0
8 S-3 IND A BR_F BR_F02 0 1 0 0.5
9 S-3 IND A BR_F BR_F02 1 0 0 1
10 S-3 IND A BR_F BR_F02 1 0 0 1
11 S-3 IND A BR_F BR_F02 0 1 0 0.5
12 S-3 IND A BR_F BR_F02 1 0 0 1
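With about 5 million rows, a row-by-row apply() is likely to be slow, so a vectorised version of the same rule may be worth trying. The following is only a sketch on a cut-down copy of the twelve example rows: the Cl, Co and Brd columns are left out, and the name B for the third genotype column is an assumption.

```r
# Toy copy of the twelve example rows (Cl, Co and Brd omitted).
# The column name B for the third genotype column is an assumption.
DF <- data.frame(
  ID  = "S-3",
  Ind = rep(c("BR_F01", "BR_F02"), each = 6),
  A   = c(1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1),
  AB  = c(0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0),
  B   = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0)
)

# Same rule as the apply() call, applied to whole columns at once:
# 1 if A is set, 0.5 if AB is set, otherwise 0.
DF$frq <- ifelse(DF$A == 1, 1, ifelse(DF$AB == 1, 0.5, 0))
```

Referring to the columns by name rather than by position also avoids the conversion of every row to character that apply() performs on a data frame with mixed column types.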
I've created a new DF that contains the info I'm interested in:
>DF2 = cbind(DF[1],DF[5],DF[9])
>DF2
ID Ind frq
1 S-3 BR_F01 1
2 S-3 BR_F01 1
...
...
...
11 S-3 BR_F02 0.5
12 S-3 BR_F02 1
I am wondering whether there is a function I can call to calculate the mean
of frq (the frequency of A) over all rows with the same Ind code, so that the
DF (matrix) looks something like the following? (I saw something in a
tutorial based on t-tests that I thought would work, but no joy...)
>NewDF
ID Ind frq
1 S-3 BR_F01 0.9167
2 S-3 BR_F02 0.6667
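One possible way to get a table of this shape is aggregate() with mean() as the summary function, grouping on both ID and Ind. This is a minimal sketch on the twelve example frq values, not a tested solution for the full 5 million rows; the rounding to four decimals is only there to match the layout above.

```r
# The twelve example rows, reduced to the three columns of DF2.
DF2 <- data.frame(
  ID  = "S-3",
  Ind = rep(c("BR_F01", "BR_F02"), each = 6),
  frq = c(1, 1, 1, 1, 1, 0.5, 0, 0.5, 1, 1, 0.5, 1)
)

# Mean frq for every ID/Ind combination; the value column comes back
# with a generic name, so it is renamed to frq afterwards.
NewDF <- aggregate(DF2$frq, by = list(ID = DF2$ID, Ind = DF2$Ind), FUN = mean)
names(NewDF)[3] <- "frq"
NewDF$frq <- round(NewDF$frq, 4)
```

For the example data this gives one row per Ind code, with the means 5.5/6 and 4/6 for BR_F01 and BR_F02.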
Further, is there a way to then transform the matrix to look something like
the following?
>FinalDF
Ind S-3 S-4 S-5.... S-1000000
BR_F01 0.9167 0.5 1 0.6667
BR_F02 0.6667 0.2 1 0.5
...
...
...
BR_Z98 0.5 1 0.3 1
BR_Z99 1 0.6 1 0.5
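A layout like this, with one row per Ind and one column per ID, can probably be produced with tapply() and two grouping factors. The sketch below uses only the S-3 and S-4 values for BR_F01 and BR_F02 from the tables above as toy input; whether it copes with the full number of IDs in memory is untested.

```r
# Long-format toy input: one frq value per ID/Ind pair,
# taken from the S-3 and S-4 columns of the example layout.
NewDF <- data.frame(
  ID  = rep(c("S-3", "S-4"), each = 2),
  Ind = rep(c("BR_F01", "BR_F02"), times = 2),
  frq = c(0.9167, 0.6667, 0.5, 0.2)
)

# Rows are Ind codes, columns are IDs; each cell is the mean frq
# of the rows that fall into that Ind/ID combination.
FinalDF <- tapply(NewDF$frq, list(Ind = NewDF$Ind, ID = NewDF$ID), mean)
```

Because tapply() computes the mean within each cell, the same call applied directly to the row-level frq column should do the averaging and the reshaping in one step. Combinations that never occur come back as NA.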
Thanks in advance for any help you can offer, and please let me know if
there is any further information I can provide.
Sean
> sessionInfo()
R version 2.6.0 (2007-10-03)
i386-apple-darwin8.10.1
locale:
en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base