[R] Vertical subtraction in dataframes
David Winsemius
dwinsemius at comcast.net
Sat Mar 13 03:59:00 CET 2010
On Mar 12, 2010, at 5:27 PM, Sam Albers wrote:
> Hello all,
>
> I have not been able to find an answer to this problem. I feel like it
> might be so simple though that it might not get a response.
>
> Suppose I have a dataframe like the one I have copied below (minus the
> 'calib' column). I wish to create a column like calib where I am
> subtracting the 'Count' when 'stain' is 'none' from all other 'Count'
> data for every value of 'rep'. This is sort of analogous to putting a $
> in front of the number that identifies a cell in a spreadsheet
> environment. Specifically I need something like this:
>
> mydataframe$calib <- Count - (Count when stain = none for each value rep)
>
> Any thoughts on how I might accomplish this?
>
> Thanks in advance.
>
> Sam
>
> Note: I've already calculated the calib column in gnumeric for clarity.
>
> rep Count stain calib
> 1 1522 none 0
> 1 147 syto -1375
> 1 544.8 sytolec -977.2
> 1 2432.6 sytolec 910.6
> 1 234.6 sytolec -1287.4
> 2 5699.8 none 0
> 2 265.6 syto -5434.2
> 2 329.6 sytolec -5370.2
> 2 383 sytolec -5316.8
> 2 968.8 sytolec -4731
> 3 2466.8 none 0
> 3 1303 syto -1163.8
> 3 1290.6 sytolec -1176.2
> 3 110.2 sytolec -2356.6
> 3 15086.8 sytolec 12620
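This method does not require the stain == "none" row to come first
within each rep, which I believe both solutions so far do. It does
assume the rows are sorted by rep, because unlist() returns the results
in group order, and it may fail if more than one row in a rep satisfies
the stain == "none" test. It is an example of what Spector calls
split-apply-bind logic. See below: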
> dfrm$calib2 <- unlist( lapply(split(dfrm, dfrm$rep),
+     function(x) x$Count - x[x$stain == "none", "Count"]) )
> dfrm
   rep   Count   stain   calib  calib2
1    1  1522.0    none     0.0     0.0
2    1   147.0    syto -1375.0 -1375.0
3    1   544.8 sytolec  -977.2  -977.2
4    1  2432.6 sytolec   910.6   910.6
5    1   234.6 sytolec -1287.4 -1287.4
6    2  5699.8    none     0.0     0.0
7    2   265.6    syto -5434.2 -5434.2
8    2   329.6 sytolec -5370.2 -5370.2
9    2   383.0 sytolec -5316.8 -5316.8
10   2   968.8 sytolec -4731.0 -4731.0
11   3  2466.8    none     0.0     0.0
12   3  1303.0    syto -1163.8 -1163.8
13   3  1290.6 sytolec -1176.2 -1176.2
14   3   110.2 sytolec -2356.6 -2356.6
15   3 15086.8 sytolec 12620.0 12620.0
> dfrm[3,3] <-"none"
> dfrm$calib2 <- unlist( lapply(split(dfrm, dfrm$rep),
+     function(x) x$Count - x[x$stain == "none", "Count"]) )
Warning message:
In x$Count - x[x$stain == "none", "Count"] :
longer object length is not a multiple of shorter object length
>
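
The warning above comes from R recycling two baseline values across the
five rows of rep 1. A minimal sketch of a variant that stops with an
error in that case, and that looks up each row's baseline by rep so the
row order does not matter, might look like the following. The helper
name subtract_baseline is hypothetical, and the sketch assumes that
every rep should have exactly one stain == "none" row; the data frame is
rebuilt from the values in the question.

```r
# Rebuild the example data from the question (values copied from the post).
dfrm <- data.frame(
  rep   = rep(1:3, each = 5),
  Count = c(1522, 147, 544.8, 2432.6, 234.6,
            5699.8, 265.6, 329.6, 383, 968.8,
            2466.8, 1303, 1290.6, 110.2, 15086.8),
  stain = rep(c("none", "syto", "sytolec", "sytolec", "sytolec"), times = 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: subtract each rep's "none" Count from every Count
# in that rep. Assumes exactly one "none" row per rep.
subtract_baseline <- function(d) {
  base <- d[d$stain == "none", c("rep", "Count")]
  # Stop with an error, rather than recycle silently, when a rep has
  # several baseline rows or none at all.
  stopifnot(!anyDuplicated(base$rep), all(d$rep %in% base$rep))
  # match() finds each row's baseline by rep value, not by position.
  d$Count - base$Count[match(d$rep, base$rep)]
}

dfrm$calib2 <- subtract_baseline(dfrm)
```

Because the lookup goes through match(), the same call gives the same
per-row result on a shuffled copy such as dfrm[sample(nrow(dfrm)), ].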
--
David Winsemius, MD
West Hartford, CT