[R] Correlation matrix for pearson correlation (r,p,BH(FDR))
Peter Langfelder
peter.langfelder at gmail.com
Thu Jun 18 20:52:18 CEST 2015
You have multiple options. I will advertise my own solution - install
the package WGCNA, installation instructions at
http://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/#cranInstall
then you can use the function
cp = corAndPvalue(t(genes), t(features))
You need to transpose both because the function expects variables in
columns and samples in rows.
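If genes and features are data frames like the ones printed in the quoted message, their first column holds names rather than numbers, and t() on such a data frame returns a character matrix. A minimal sketch of one way to prepare them first, using a hypothetical miniature of the two tables (values rounded from the thread; geneMat, featMat, geneT and featT are names made up for this illustration):

```r
# Hypothetical miniature of the two tables (3 genes, 2 features, 4 cell lines);
# values are rounded from the example in the thread.
genes <- data.frame(
  Genes      = c("KCNAB3", "KCNB1", "KHDC1L"),
  Cell.line1 = c(12.02, 0.02, 2.32),
  Cell.line2 = c(11.14, 1.30, 2.83),
  Cell.line3 = c(15.60, 0.82, 5.29),
  Cell.line4 = c(13.44, 0.59, 7.44)
)
features <- data.frame(
  Cell.line  = c("Growth rate", "Drug sensitivity"),
  Cell.line1 = c(NA, 5.03),
  Cell.line2 = c(NA, 6.57),
  Cell.line3 = c(NA, 8.00),
  Cell.line4 = c(51.41, 1.26)
)

# Drop the name column, keep the numbers, and store the names as row names.
geneMat <- as.matrix(genes[, -1])
rownames(geneMat) <- as.character(genes$Genes)
featMat <- as.matrix(features[, -1])
rownames(featMat) <- as.character(features$Cell.line)

# Transpose so that samples (cell lines) are rows and variables are columns.
geneT <- t(geneMat)
featT <- t(featMat)

# Both matrices must list the cell lines in the same order.
stopifnot(identical(rownames(geneT), rownames(featT)))
```

geneT and featT could then be passed to corAndPvalue() in place of t(genes) and t(features).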
This will give you a list whose components include 'cor' (matrix of
the correlation values) and 'p' (matrix of the Student p-values). To
get a matrix of the corresponding FDR (Benjamini-Hochberg; "fdr" is an
alias for "BH" in p.adjust), adjusted separately for each feature, use
fdr = apply(cp$p, 2, p.adjust, method = "fdr")
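For readers who would rather not install a package, the same three quantities (r, Student p-value, BH-adjusted FDR) can be sketched in base R. This is an illustration on made-up toy data, not the WGCNA implementation; expr, feat and the other object names are assumptions introduced here, and the p-values come from the usual t statistic for a Pearson correlation with n - 2 degrees of freedom.

```r
# Toy data (made up): 5 samples in rows, 3 genes and 2 features in columns.
set.seed(1)
samples <- paste0("S", 1:5)
expr <- matrix(rnorm(15), nrow = 5,
               dimnames = list(samples, paste0("gene", 1:3)))
feat <- matrix(rnorm(10), nrow = 5,
               dimnames = list(samples, c("growth", "drug")))
feat["S2", "growth"] <- NA  # a missing measurement, as in the growth-rate row

# Pearson correlation of every gene with every feature,
# using all samples available for each pair.
r <- cor(expr, feat, use = "pairwise.complete.obs")

# Number of complete observations for each gene/feature pair.
n <- crossprod(1 - is.na(expr), 1 - is.na(feat))

# Two-sided Student p-value from t = r * sqrt((n - 2) / (1 - r^2)).
tstat <- r * sqrt((n - 2) / (1 - r^2))
p <- 2 * pt(-abs(tstat), df = n - 2)

# Benjamini-Hochberg adjustment across genes, separately for each feature.
fdr <- apply(p, 2, p.adjust, method = "BH")
```

Each entry of p should agree with cor.test() applied to the corresponding pair of columns, which is a quick way to check the result on a few genes.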
Hope this helps,
Peter
On Thu, Jun 18, 2015 at 1:19 AM, Sarah Bazzocco <sarah.bazzocco at vhir.org> wrote:
> This post was called "help" before, I changed the Subject.
> Thanks for the comments.
> Here the example: (I have the two lists saved as .csv and I can open them in R)
>
> Sheet one - Genes (expression of 10 genes, not binary, measured in 10 cell lines)
>> genes
> Genes Cell.line1 Cell.line2 Cell.line3 Cell.line4 Cell.line5
> 1 KCNAB3 12.02005181 11.1400910 15.60381163 13.44151596 25.37161030
> 2 KCNB1 0.02457449 1.3028535 0.81538294 0.59318327 0.15332321
> 3 KCNB2 0.44791862 0.1060137 0.09864136 0.00000000 0.00000000
> 4 KERA 0.06090217 0.0000000 0.03352993 0.03634781 0.04190912
> 5 KGFLP1 0.02450101 0.0000000 0.00000000 0.00000000 0.00000000
> 6 KGFLP2 0.00000000 0.0000000 0.00000000 0.00000000 0.00000000
> 7 KHDC1 0.00000000 0.0000000 0.00000000 0.00000000 0.00000000
> 8 KHDC1L 2.31894450 2.8252262 5.29099724 7.44183228 1.94629741
> 9 KHDC3L 0.00000000 0.0000000 0.00000000 0.00000000 0.00000000
> 10 KHDRBS1 0.00000000 0.0000000 0.00000000 0.00000000 0.00000000
> Cell.line6 Cell.line7 Cell.line8 Cell.line9 Cell.line10
> 1 8.12373424 7.67506261 24.43776341 18.33244818 9.224225
> 2 4.18181234 1.65268403 5.98346320 1.51423807 0.000000
> 3 0.05857207 0.05945414 0.20733924 0.05830982 0.000000
> 4 0.00000000 0.00000000 0.07752608 0.01585643 16.664245
> 5 0.02563099 0.03902548 0.00000000 0.00000000 0.000000
> 6 0.00000000 0.00000000 0.00000000 0.00000000 0.000000
> 7 0.00000000 0.00000000 0.00000000 0.00000000 0.000000
> 8 8.56022436 7.50838343 7.17964645 3.28602729 0.000000
> 9 0.00000000 0.00000000 0.00000000 0.00000000 3.598534
> 10 0.00000000 0.03081180 0.00000000 0.00000000 2.600173
>
> Sheet two - features (2 features (growth rate, drug sensitivity) for 10 cell lines)
>> features
> Cell.line Cell.line1 Cell.line2 Cell.line3 Cell.line4 Cell.line5
> 1 Growth rate NA NA NA 51.41 NA
> 2 Drug sensitivity 5.03 6.57 8 1.26 3
> Cell.line6 Cell.line7 Cell.line8 Cell.line9 Cell.line10
> 1 41.33 26.76 24.19 NA NA
> 2 1.40 1.88 1.33 5.05 9.12
>
> What I found:
> corr.test {psych}
> corr.test(x, y = NULL, use = "pairwise",method="pearson",adjust="BH",alpha=.01)
> --> I adjusted the original command to what I need (BH instead of holm, and alpha=.01 instead of 0.05).
>
> I would be very happy, if someone could show me how to use this command, in particular how to refer as x and y to the two sheets I have (Genes and Features). I would take it from there.
>
> Thanks a lot in advance.
>
> Sarah
>
> ----- Original Message -----
> From: "Rainer Schuermann" <Rainer.Schuermann at gmx.net>
> To: "Sarah Bazzocco" <sarah.bazzocco at vhir.org>
> Sent: Thursday, 18 June, 2015 8:14:56 AM
> Subject: Re: [R] help
>
> Hi Sarah,
>
> Not an answer to your question but a piece of well-intended advice:
>
> 1. Don't post HTML but plain text. Not only will people tell you this in a sometimes not very friendly manner; using HTML actually does make posts illegible on this mailing list. Code, and R _is_ code, is always plain text.
>
> 2. Don't pose an abstract problem - this looks too much like "Can you please do my work for me". Show us what you have tried already, and people will happily jump in and provide their thoughts and advice.
>
> 3. Always make sure that you have a reproducible example in your mail, and a set of data of the same type and structure you are using - ideally using dput().
>
> See further advice here
>
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> and here:
>
> http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
>
> For your problem, R has an immense wealth of ideas and solutions.
>
> Rgds,
> Rainer
>
> On Wed June 17 2015 16:57:24 Sarah Bazzocco wrote:
>
>> Hello,
>>
>> I am an R beginner and I need some help. The question is very simple: I need to compute Pearson correlations (r, p-value and FDR with BH) between an expression array (with several thousand genes for, let's say, 20 cell lines) and some features of those cell lines.
>>
>> My problem is the organization of the Excel sheets and how to get the data into R and run the script. I thought the easiest and most organized way for me would be two Excel sheets:
>>
>> 1- Only expression data (genes in rows and cell lines in columns)
>>
>> 2- Only the features (features in rows, e.g. a) growth rate, b) sensitivity to some drugs, and cell lines in columns).
>>
>> --> That would create two sheets with 20 columns each.
>>
>> Now I would like to get the correlation for gene 1: its expression across all lines with the growth rate.
>> The same for gene 2, and so forth. I should obtain as many r, p and BH (FDR) values as there are genes.
>> I would need to do the same for the sensitivity, and so on.
>>
>> Do you think this is doable? I am not at all a bioinformatics expert, so all help is very welcome.
>>
>> Thank you very much!
>>
>> Kind regards,
>>
>> Sarah
>
> --
>
>
> Sarah Bazzocco, PhD student
> Group of Molecular Oncology,
> CIBBIM-Nanomedicine,
> Vall d'Hebron Hospital Research Institute,
> Passeig Vall d'Hebron 119-129,
> Barcelona 08035, Spain.
> Tel: +34-93-489-4056
>
> Fax: +34-93-489-3893
> Email: sarah.bazzocco at vhir.org