[BioC] tagwise parameters for negative binomial distribution in edgeR
Davide Cittaro
cittaro.davide at hsr.it
Fri Mar 21 09:36:27 CET 2014
On 21/mar/2014, at 01:47, Gordon K Smyth <smyth at wehi.EDU.AU> wrote:
>>
>> Mmm, actually I would like to identify the sample that is an outlier for
>> a specific gene, that's why I thought I could focus on tagwise
>> distribution.
>
> See Mark Robinson's post.
>
> It depends on your purpose however. Do you want to downweight/ignore
> outliers, or do you want to identify them because they are interesting?
In this case outliers may be relevant, especially the less represented ones. I'm running the estimateGLMRobustDisp approach (although it takes a long time).
>>> Any tag with a small prior.df is considered an outlier. You can sort tags
>>> by their prior.df values to select the most significant outliers.
>>
>> Does this identify a tag that is an outlier over all samples?
>
> Basically yes. We distinguish dispersion outliers and observation
> outliers. An observation outlier is an individual count that is an
> outlier (relative to other counts for the same gene). A dispersion
> outlier is a gene that shows much more variability between replicates than
> other genes at the same cpm level. A dispersion outlier may arise from
> one or more observation outliers, but not necessarily. It could also
> arise from systematically larger variability.
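
To make the two kinds of outlier concrete, here is a minimal sketch in plain Python (not edgeR code). It first ranks genes by prior.df to pick a candidate dispersion outlier, then checks which sample in that gene is the observation outlier using negative binomial Pearson residuals. The gene names, prior.df values, counts, library sizes and the dispersion of 0.05 are all made-up illustrative values, and the single-group mean model and the helper nb_pearson_residuals are assumptions of this sketch rather than anything edgeR does internally.

```python
import math

def nb_pearson_residuals(counts, lib_sizes, dispersion):
    """Pearson residuals for one gene under a negative binomial model.

    Assumes all samples belong to one group, so the fitted mean is
    lib_size * (pooled relative abundance). The NB variance is
    mu + dispersion * mu^2.
    """
    rate = sum(counts) / sum(lib_sizes)
    residuals = []
    for y, n in zip(counts, lib_sizes):
        mu = rate * n
        var = mu + dispersion * mu * mu
        residuals.append((y - mu) / math.sqrt(var))
    return residuals

# Hypothetical per-gene prior.df values, standing in for the prior.df
# component reported after robust dispersion estimation.
prior_df = {"geneA": 9.8, "geneB": 1.2, "geneC": 10.0}

# Smallest prior.df first: the strongest dispersion outliers lead the list.
ranked = sorted(prior_df, key=prior_df.get)
candidate = ranked[0]

# Hypothetical counts for the candidate gene; the fourth sample is extreme.
counts = {"geneB": [52, 48, 50, 400, 55, 47]}
lib_sizes = [1e6] * 6

# Assumed dispersion (e.g. a trended value at this abundance), not the
# gene's own tagwise estimate, which the outlier itself would inflate.
res = nb_pearson_residuals(counts[candidate], lib_sizes, dispersion=0.05)

# Index of the sample with the largest absolute residual.
outlier_sample = max(range(len(res)), key=lambda i: abs(res[i]))
print(candidate, outlier_sample, [round(r, 2) for r in res])
```

Note that the pooled mean is itself pulled upwards by the extreme count, so the residuals of the remaining samples come out negative. A more careful version would refit the mean with the suspect sample left out or downweighted.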
Thanks for the explanation.
Best,
d