[BioC] edgeR GLM to adjust for batch effect
Ryan Basom
rbasom at fhcrc.org
Thu Mar 27 22:51:12 CET 2014
Thanks for this advice. I have a follow-up question, though. The edgeR
User's Guide says the following about adjusting for batch effects: "In
this type of analysis, the treatments are compared only within each
batch. The analysis is corrected for baseline differences between the
batches." If some of the batches don't have samples for both treatments,
how is this compensated for? Though this design isn't ideal, I'd like to
get a better sense of what's going on in this scenario.
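
One way to get a feel for this is to check which comparisons the additive
model ~ batch + group can estimate at all. What follows is only a minimal
sketch in plain Python, not edgeR code. It assumes the sample counts from
my original post (quoted below), with batch 1 and group pos as reference
levels. The helper names rank, design_rows and is_estimable are made up
for illustration. A contrast is treated as estimable when appending it as
an extra row does not increase the rank of the design matrix.

```python
from fractions import Fraction

# Samples per group in batches 1-4, copied from the table in the original post.
counts = {"pos": [3, 5, 9, 0], "neg": [5, 4, 7, 0], "nc": [0, 0, 5, 8]}
GROUPS = ["pos", "neg", "nc"]  # "pos" as reference level is an assumption


def rank(rows):
    """Matrix rank by exact Gaussian elimination on fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                f = m[i][col] / m[rk][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def design_rows(cells):
    """One row per occupied (batch, group) cell for the model ~ batch + group.

    Replicates within a cell repeat the same row, so they do not change
    which contrasts are estimable and are left out here.
    """
    batches = sorted({b for b, _ in cells})
    rows = []
    for b, g in cells:
        row = [1]                                  # intercept
        row += [int(b == x) for x in batches[1:]]  # batch indicators
        row += [int(g == x) for x in GROUPS[1:]]   # group indicators
        rows.append(row)
    return rows, len(batches)


def is_estimable(cells, plus, minus):
    """True if the group difference plus - minus lies in the row space."""
    rows, n_batches = design_rows(cells)
    contrast = [0] * n_batches  # intercept plus (n_batches - 1) batch columns
    contrast += [int(plus == x) - int(minus == x) for x in GROUPS[1:]]
    return rank(rows + [contrast]) == rank(rows)


all_cells = [(b + 1, g) for g in GROUPS
             for b, n in enumerate(counts[g]) if n > 0]
without_batch3 = [(b, g) for b, g in all_cells if b != 3]

print(is_estimable(all_cells, "pos", "nc"))        # True
print(is_estimable(all_cells, "neg", "nc"))        # True
print(is_estimable(without_batch3, "pos", "nc"))   # False
print(is_estimable(without_batch3, "pos", "neg"))  # True
```

Under these assumptions, pos-nc and neg-nc are both estimable with all
four batches, but only because batch 3 links nc to the other groups.
Without batch 3, the nc effect cannot be separated from the batch 4
effect, while pos-neg is still estimable from batches 1 and 2.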
Thanks,
Ryan
On 03/26/2014 04:36 PM, Ryan C. Thompson wrote:
> You don't necessarily need every condition in every batch for the
> comparison to be effective, but having only one batch in common is not
> good. If I understand correctly, batch 3 would be the dominant
> contributor to the estimates of fold changes in the comparisons that
> you care about, since any other change would be mostly absorbed into
> the batch effects. I think the first step you should take is to fit
> the full model with conditions and batch effect and find out whether
> the batch effects appear to be significant enough to warrant inclusion
> in the model, and if not, then drop them.
>
> -Ryan
>
> On Wed 26 Mar 2014 03:47:42 PM PDT, Ryan Basom [guest] wrote:
>>
>>
>> Hi,
>>
>> I'd like to use a GLM in edgeR to adjust for a batch effect, though
>> only one of my four batches has samples from both groups in the
>> comparisons that I'd like to conduct (pos-nc & neg-nc):
>>
>>        1  2  3  4
>> pos    3  5  9  0
>> neg    5  4  7  0
>> nc     0  0  5  8
>>
>> I suspect that using a GLM in edgeR to adjust for batch will only
>> work properly if both groups from a given comparison are represented
>> in every batch, though I would like to know if this is not the case.
>> I see a batch effect using PVCA on just the pos and neg samples, and
>> would like to try to adjust for it somehow. Please advise.
>>
>> Thanks,
>> Ryan
>>
>> -- output of sessionInfo():
>>
>> R version 3.0.3 (2014-03-06)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
>> LC_COLLATE=en_US.UTF-8
>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>> LC_PAPER=en_US.UTF-8 LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
>> LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] splines parallel stats graphics grDevices utils datasets methods
>> base
>>
>> other attached packages:
>> [1] pvca_1.2.0 beadChipCoreTools_0.49 beadAnno_1.0 lumi_2.14.1
>> [5] Biobase_2.22.0 BiocGenerics_0.8.0 genefilter_1.44.0
>> arrayQualityMetrics_3.18.0
>> [9] edgeR_3.4.2 limma_3.18.12
>>
>> --
>> Sent via the guest posting facility at bioconductor.org.
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor