[BioC] finding differential expression pattern between paired columns of data frame.

Federico Lasa felasa at gmail.com
Wed Aug 6 19:29:07 CEST 2014


You could create a statistic that captures what you want. i.e:

s = LineAD / (abs(LineBC) )

this number would be big if lineAD foldchange is big and/or line BC is
small. It's not perfect but may somewhat capture what you want. You
can alter it as well to prioritize something particular, like:

s = LineAD^2 / (abs(LineBC)+1)

would give more 'weight' to big AD changes and avoid the stat
'exploding' when BC change is near zero.

p.s. If data is log measured ¿shouldn't you compute FC by substraction?
p.p.s. you may want to rank on the absolute value of the fold change
as well, and later look at the sign.




On Wed, Aug 6, 2014 at 11:10 AM, Jakub Stanislaw Nowak
<jakub.nowak at ed.ac.uk> wrote:
> Hi Jim,
>
> Thanks for your reply. It become clear for me.
>
> I computed the fold changes and added them to the data frame
>
>> LineA.LineD <- data.selected.list$LineA / data.selected.list$LineD
>> LineB.LineC <- data.selected.list$LineB/data.selected.list$LineC
>> data.merged <- cbind(data.selected.list, LineAD=LineA.LineD, LineBC=LineB.LineC)
>> View(data.merged)
>> data.merged
>                 ID       LineA       LineB      LineC      LineD       LineAD     LineBC
> 1   mmu-miR-23a-5p -2.28461513 -2.75356783 -2.0132004  0.1397216 -16.35119386 1.36775647
> 2    mmu-miR-22-3p  1.65924854  0.03717303  0.4909068  2.4100528   0.68846978 0.07572319
> 3     mmu-miR-5113 -0.91337275 -0.87022608 -1.3224488  1.3906263  -0.65680676 0.65804140
> 4   mmu-miR-324-3p  1.44133874 -0.07767235 -0.2187064 -0.2201993  -6.54561061 0.35514432
> 5 mmu-miR-365-1-5p -0.03644659  0.72755490  1.0445618  1.3071159  -0.02788321 0.69651688
> 6  mmu-miR-3068-5p  0.49474466 -0.28506641 -0.2995997  1.3754639   0.35969294 0.95149110
>>
>
> I want to ask for suggestion what would be the best way to sort this data frame now by:
> 1. Descending by fold changes in LineAD column
> 2. In the same time removing these miRNAs that fold change do not differ so much in LineBC column
>
> Many thanks for your great help,
>
> Jakub
>
>
> On 6 Aug 2014, at 15:26, James W. MacDonald <jmacdon at uw.edu> wrote:
>
>> Hi Jakub,
>>
>> The problem here is that you don't have any replication, so you cannot fit a model. The only thing you can do is compute fold changes, which are not very reliable.
>>
>> Best,
>>
>> Jim
>>
>>
>>
>>
>> On Wed, Aug 6, 2014 at 10:02 AM, Jakub Stanislaw Nowak <jakub.nowak at ed.ac.uk> wrote:
>> Hello Bioconductor list members,
>>
>> I am trying to analyse output of my smallRNA sequencing data. My starting point is a table with differentially expressed miRs between two stages of different cell types.
>> My goal is to identify miRs that are changing in LineA - Line D and not changing in LineB - LineC and reciprocal situation.
>>
>> Below is a head of my data.frame that I work on.
>>
>> > data.selected.list
>>                 ID                      Line A        Line B        Line C     Line D
>> 1   mmu-miR-23a-5p -2.28461513 -2.75356783 -2.0132004  0.1397216
>> 2    mmu-miR-22-3p  1.65924854  0.03717303  0.4909068  2.4100528
>> 3     mmu-miR-5113 -0.91337275 -0.87022608 -1.3224488  1.3906263
>> 4   mmu-miR-324-3p  1.44133874 -0.07767235 -0.2187064 -0.2201993
>> 5 mmu-miR-365-1-5p -0.03644659  0.72755490  1.0445618  1.3071159
>> 6  mmu-miR-3068-5p  0.49474466 -0.28506641 -0.2995997  1.3754639
>>
>> As this very much looked for me as microarray experiment I originally was thinking to apply some tools for microarray analysis to get this done.
>>
>> So i did as below
>> 1. Constructed matrix
>> > design <- model.matrix(~-1+factor(c(1,2,3,4)))
>> > colnames(design) <- c("LineA","LineB","LineC","LineD")
>> > design
>>   LineA LineB LineC LineD
>> 1     1     0     0     0
>> 2     0     1     0     0
>> 3     0     0     1     0
>> 4     0     0     0     1
>> attr(,"assign")
>> [1] 1 1 1 1
>> attr(,"contrasts")
>> attr(,"contrasts")$`factor(c(1, 2, 3, 4))`
>> [1] “contr.treatment”
>>
>> 2. Constructed comparison matrix
>> > contrast.matrix <- makeContrasts(LineA-LineD, LineB-LineC, levels=design)
>> > contrast.matrix
>>        Contrasts
>> Levels  LineA - LineD LineB - LineC
>>   LineA             1             0
>>   LineB             0             1
>>   LineC             0            -1
>>   LineD            -1             0
>>
>> 3. Perform fitting and of course got benched for my naive thinking that this is going to work
>>
>> > fit <- lmFit(data.selected[,(2:5)],design)
>> > fit2 <- contrasts.fit(fit, contrast.matrix)
>> > fit2 <- eBayes(fit2)
>> Error in ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim,  :
>>   No residual degrees of freedom in linear model fits
>>
>> I thought that maybe there is something wrong with the input for my fitting and I checked that limma is using normalised log data so I followed my very naive thinking and try to normalise the table
>>
>> > eset<-rma(data.selected.list[,2:5])
>> Error in (function (classes, fdef, mtable)  :
>>   unable to find an inherited method for function ‘probeNames’ for signature ‘"data.frame”’
>>
>> As it hasn’t work I have two questions:
>> 1. Can I somehow edit my data.frame that it will become a better fit for limma package and perform comparison I want.
>> 2. I won’t be surprised if you say that this approach is not valid therefore I will welcome any suggestion about other models/methods I can use to make this comparison for my initial table.
>>
>> Thank you very much for help,
>>
>> Jakub
>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
>
>
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list