[R] Mann-Whitney by group

arun smartpink111 at yahoo.com
Mon Sep 9 20:57:48 CEST 2013



Hi David:
Try:
unlist(lapply(sara.data[, 4:length(sara.data)], function(x) {
  # First half of the rows is Group 1, second half is Group 2 (same Pairs order)
  indx1 <- 1:(length(x)/2)
  indx2 <- ((length(x)/2) + 1):length(x)
  x1 <- data.frame(FH = x[indx1], LH = x[indx2],
                   Group1 = sara.data$Groups[indx1],
                   Group2 = sara.data$Groups[indx2])
  # Keep only the pairs in which both values are present
  x2 <- x1[!(is.na(x1$FH) | is.na(x1$LH)), ]
  if (nrow(x2) == 0) NULL else
    with(x2, wilcox.test(FH, LH, paired = TRUE, alternative = "two.sided")$p.value)
}))
#  Bcl2   Bcl6   Ccl5   Ccr7   Cd27   Cd28 
#0.1250 0.2500 0.1875 0.2500 0.8125 0.8125 


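# Cd40 is absent because all of its Group 2 values are NA, so it was not tested.
# Actb (column 3) is absent because the index starts at column 4, as in your code;
# use 3:length(sara.data) to include it.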
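
If you would rather not depend on the row order, and also want the test statistic next to the p-value, something along these lines might work. This is only a sketch: `toy`, `GeneA`, `GeneB` and `paired_wilcox` are made-up names for illustration, the values are invented, and it assumes every Pairs id occurs exactly once in each of the two groups.

```r
# Small stand-in with the same layout as sara.data (values are invented).
toy <- data.frame(
  Groups = rep(1:2, each = 5),
  Pairs  = rep(1:5, times = 2),
  GeneA  = c(1.2, 1.5, 1.1, 1.9, 1.4,  0.7, 0.9, NA, NA, 0.5),
  GeneB  = c(0.3, 0.2, 0.5, 0.4, 0.6,  NA, NA, NA, NA, NA)
)

# Hypothetical helper: paired Wilcoxon test on the complete pairs only.
# Assumes two groups and that each Pairs id appears once per group.
paired_wilcox <- function(values, groups, pairs) {
  g <- sort(unique(groups))
  # Align the two groups by pair id instead of by row position
  first  <- values[groups == g[1]][order(pairs[groups == g[1]])]
  second <- values[groups == g[2]][order(pairs[groups == g[2]])]
  # Drop every pair in which either member is NA
  ok <- complete.cases(first, second)
  if (!any(ok)) {
    # Nothing left to test: return NA rather than NULL so the row is kept
    return(c(n = 0, V = NA, p.value = NA))
  }
  res <- wilcox.test(first[ok], second[ok], paired = TRUE,
                     alternative = "two.sided")
  c(n = sum(ok), V = unname(res$statistic), p.value = res$p.value)
}

# One row per measured column: pairs used, V statistic, p-value
out <- t(sapply(toy[, 3:ncol(toy)], paired_wilcox,
                groups = toy$Groups, pairs = toy$Pairs))
out
```

Returning NA instead of NULL keeps a row for columns such as Cd40, so it stays visible that they were not tested.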


A.K.

----- Original Message -----
From: David Chertudi <david.chertudi at gmail.com>
To: arun <smartpink111 at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Monday, September 9, 2013 2:29 PM
Subject: Re: [R] Mann-Whitney by group

Hello Arun,

Thanks so much--while I haven't tried it yet, this seems as though it
will be an excellent way to skip the categories (Actb, etc) that have
missing values (NAs).

The second part of my question, which I didn't ask before:  instead of
skipping, is there a way to continue with the Wilcoxon tests even if there
are NAs?  In this dataset, Groups is the two-level factor that sets
up the pairwise comparison, and Pairs is a 5-level factor that pairs
together each instance within the groups.  For Actb, for example, the
3rd and 4th instances of Group 2 are missing.  How would I automate the
procedure of excluding the 3rd and 4th instances in Group 1, and then
running wilcox.test on the remaining three instances (1, 2, and 5)?
The exclusions will vary by category (sara.data[,4:10]) in the sample I
provided.

Many thanks.

David


--
I drink your milkshake.


On Mon, Sep 9, 2013 at 5:42 AM, arun <smartpink111 at yahoo.com> wrote:
> Hi,
>
> You may try:
> unlist(lapply(sara.data[, 4:length(sara.data)], function(x) {
>   # Count the NAs in each group; skip the column if the counts differ
>   x1 <- tapply(is.na(x), list(sara.data$Groups), FUN = sum)
>   if (x1[1] != x1[2]) NULL else
>     wilcox.test(x ~ sara.data$Groups, paired = TRUE, alternative = "two.sided")$p.value
> }))
> #  Bcl2   Ccl5   Cd27   Cd28
> #0.1250 0.1875 0.8125 0.8125
>
> A.K.
>
>
>
> ----- Original Message -----
> From: David Chertudi <david.chertudi at gmail.com>
> To: R. Michael Weylandt <michael.weylandt at gmail.com>
> Cc: "r-help at r-project.org" <r-help at r-project.org>
> Sent: Sunday, September 8, 2013 11:13 PM
> Subject: Re: [R] Mann-Whitney by group
>
> The time has come to shake the cobwebs off of this analysis.  I have
> more data now and need to run the same tests, the same way as above.
> My question is this--some of the pairs include NAs, and so are gumming
> up the works.  I'm not sure how to exclude them using the lhs ~ rhs
> syntax.  Any ideas here?
>
> Many thanks, as usual.  Data and syntax below.
>
> David
>
>
> sara.data=structure(list(Groups = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
> 2L), Pairs = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L), Actb =
> c(2.2734065552504,
> 1.69621901296377, 1.07836251830772, 1.46314001007756, 1.76537848566894,
> 0.689064098855446, 0.462820758081676, NA, NA, 2.22119254577143
> ), Bcl2 = c(0.12954440593121, 0.0902306425601895, 0.219044589401239,
> 0.103793432483774, 0.119463699676088, 0.112179645963861, 0.136910739776212,
> 0.433953247043377, 0.401539702575691, 0.352218179109105), Bcl6 =
> c(1.78964109252879,
> 1.56011379020288, 0.750029838175481, 1.80189108290585, 1.09372632818505,
> 0.275815381178548, 0.785680035605173, NA, NA, 0.311865838414934
> ), Ccl5 = c(0.140676314771846, 0.103227179167928, 0.210718001043218,
> 0.101548390950462, 0.140625579216236, 0.218846310909471, 0.132902076760262,
> 0.35763207205821, 0.320733407260836, 0.0983004520984843), Ccr7 =
> c(0.116274608274044,
> 0.0623582657156311, 0.111654418769019, 0.110221412062233, 0.0646423645035265,
> 0.0924168984762384, 0.0322085814124609, NA, NA, 0.0315246913534493
> ), Cd27 = c(0.599332581326994, 0.536313800392409, 0.776647646561188,
> 0.511624999868611, 0.481254858629634, 0.365428233004039, 0.30446734845483,
> 0.880574935388197, 1.19362122336861, 0.121581553928565), Cd28 =
> c(0.8476006082089,
> 0.976603410250505, 0.976783190446247, 0.8288118647421, 0.854672311976977,
> 0.576719839424659, 0.4221908111396, 1.22864113852622, 5.19562728663742,
> 0.401610355554234), Cd40 = c(0.209298226865743, 0.0680133680665235,
> 0.0233440779283003, 0.191986570448918, 0.128784506152115, NA,
> NA, NA, NA, NA)), .Names = c("Groups", "Pairs", "Actb", "Bcl2",
> "Bcl6", "Ccl5", "Ccr7", "Cd27", "Cd28", "Cd40"), class = "data.frame",
> row.names = c(NA,
> -10L))
>
> results=apply(sara.data[,4:length(sara.data)], 2,
>               function(x)
>                 wilcox.test(x~sara.data$Groups,paired=TRUE,alternative="two.sided"))
>
> # Extract p-values from saved results
> lapply(results, function(x) x[['p.value']])
>
>
> --
> I drink your milkshake.
>
>
> On Tue, Jul 10, 2012 at 3:13 PM, R. Michael Weylandt
> <michael.weylandt at gmail.com> wrote:
>> Untested, but I think you need to lapply() over thing with some sort of extractor:
>>
>> lapply(thing, function(x) x[['p.value']])
>>
>> Michael
>>
>> On Jul 10, 2012, at 3:45 PM, Oxenstierna <david.chertudi at gmail.com> wrote:
>>
>>> This works very well--thanks so much.
>>>
>>> By way of extension:  how would one extract elements from the result object?
>>>
>>> For example:
>>>
>>> thing<-apply(Dtb[,3:10], 2, function(x) wilcox.test(x~Dtb$Group))
>>>
>>> summary(thing)$p.value
>>>
>>> Does not provide a list of p-values as it would in a regression object.
>>> Ideally, I would like to be able to extract the W score and p-value by
>>> A,B,C,...
>>>
>>> Any ideas greatly appreciated!
>>>
>>>
>>> --
>>> View this message in context: http://r.789695.n4.nabble.com/Mann-Whitney-by-group-tp4635618p4636055.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.



