[R] quantiles and dataframe

jim holtman jholtman at gmail.com
Fri Sep 14 14:45:53 CEST 2007


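I think this does what you want. For each B column it takes the values
of A on the rows where that column is not NA (so the 9 at row 9 is
dropped for B1) and computes the min, 5%, 95% and max with quantile():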

> RQ
     A B1  B2   B3
1    1 NA 112   12
2    2 NA 123  123
3    3 NA 324   13
4    4  3  21  535
5    5  4  12   33
6    6  7   1  335
7    7  4  NA 3535
8    8  4  NA   NA
9    9 NA  NA   NA
10  10  5  NA   NA
11  12  4  NA   NA
12  15  2  NA   NA
13  17  3  NA    1
14  63  1  NA    1
15  75 NA  NA   NA
16 100 NA  NA   NA
17 123 NA  NA   NA
18 155 NA  NA   NA
19 166 NA  NA   NA
20 177 NA  NA   NA
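> # for each B column, keep the A values on rows where that column is not NA
> x <- lapply(RQ[-1], function(.col){
+     quantile(RQ[!is.na(.col), 1], probs=c(0, 0.05, 0.95, 1))
+ })
> # bind the per-column quantile vectors into one matrix
> do.call('cbind', x)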
        B1   B2   B3
0%    4.00 1.00  1.0
5%    4.45 1.25  1.4
95%  42.30 5.75 44.6
100% 63.00 6.00 63.0
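
The same idea as a self-contained sketch that can be pasted into a fresh
session. The data frame is retyped from the post, sapply() is used in
place of lapply() plus cbind (it simplifies to the same matrix), and the
names probs, q_non_na and q_span are only illustrative. The second
variant is an assumption about what the original code was after: it takes
A over the whole span from the first to the last non-NA row, so it keeps
the 9 for B1.

```r
# Rebuild the example data frame from the post (values retyped by hand)
RQ <- data.frame(
  A  = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 17, 63,
         75, 100, 123, 155, 166, 177),
  B1 = c(NA, NA, NA, 3, 4, 7, 4, 4, NA, 5, 4, 2, 3, 1, rep(NA, 6)),
  B2 = c(112, 123, 324, 21, 12, 1, rep(NA, 14)),
  B3 = c(12, 123, 13, 535, 33, 335, 3535, rep(NA, 5), 1, 1, rep(NA, 6))
)

# min, 5%, 95% and max
probs <- c(0, 0.05, 0.95, 1)

# Variant 1: A on exactly the rows where the B column is not NA
q_non_na <- sapply(RQ[-1], function(b) {
  quantile(RQ$A[!is.na(b)], probs = probs)
})

# Variant 2 (assumed intent of the original s1 code): A over the span
# from the first to the last non-NA row, NA rows in between included
q_span <- sapply(RQ[-1], function(b) {
  idx <- which(!is.na(b))
  quantile(RQ$A[min(idx):max(idx)], probs = probs)
})

print(q_non_na)
print(q_span)
```

For B1 the first variant gives a 5% quantile of 4.45 and the second gives
4.5; the only difference is the extra 9 from row 9.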


On 9/14/07, Anders Bjørgesæter <anders.bjorgesater at bio.uio.no> wrote:
> Hi
>
> I have a dataframe, RQ, like this:
>
> A    B1    B2    B3
> 1    NA    112    12
> 2    NA    123   123
> 3    NA    324    13
> 4    3     21    535
> 5    4     12    33
> 6    7     1     335
> 7    4     NA    3535
> 8    4     NA    NA
> 9    NA    NA    NA
> 10    5    NA    NA
> 12    4    NA    NA
> 15    2    NA    NA
> 17    3    NA    1
> 63    1    NA    1
> 75    NA   NA    NA
> 100   NA   NA    NA
> 123   NA   NA    NA
> 155   NA   NA    NA
> 166   NA   NA    NA
> 177   NA   NA    NA
>
> I want to extract min, max, 5% and 95% from A based on the range of the Bs.
>
> Using this:
>
> s1<-A[min(which(!is.na(B1))):max(which(!is.na(B1)))]
> q1<-quantile(s1,probs=c(0,5,95,100,NA)/100)
>
> I manage to get this by changing the B1 manually for each B
>
> B1      B2      B3
> 4.0     1.00    1.00     (min)
> 63.0    6.00    63.00    (max)
> 4.5     4.5     1.65     (5%)
> 40.0    6.00    63.00    (95%)
>
> I tried to use apply like this: s1<-apply(RQ,2,function(x)
> {A[min(which(!is.na(RQ[,2:4]))):max(which(!is.na(RQ[,2:4])))] })
>
> to get the range of each B but that doesn't work.
>
> Also as you see, s1 includes the A where the B's are NA, e.g. for B1 I
> get the 9 at row 9 (4,5,6,7,8,9,10,12,15,17,63) and not
> (4,5,6,7,8,10,12,15,17,63), which I would prefer.
>
> BUT the main question is how can I extract min, max etc. from each B in
> dataframe RQ without using a loop?
>
> Any help is greatly appreciated!
>
> Best Regards
> Anders


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


