[BioC] QCReport: specifying alt CDF (MoGene-1_0-st-v1)?

James W. MacDonald jmacdon at med.umich.edu
Tue Sep 21 15:49:38 CEST 2010


Hi Harry,

On 9/20/2010 6:20 PM, Harry Mangalam wrote:
> Hi BioC
>
> (sessionInfo() at bottom)
>
> I'm trying to help a group here do some QC on their affy datasets
> derived from the mogene10stv1 array set.  This array is not in
> mainstream BioC support but I've created and installed the CDF
> environment for that array:

That is not correct: a CDF package for this array is already available from BioC.

biocLite("mogene10stv1cdf")

will get you the same package you build below.

>
>>   make.cdf.package("MoGene-1_0-st-v1.r3.cdf", species = "Mus_mus")
> (completes, and I've installed the generated CDF env)
>
> but when I try to run  QCReport on this dataset (even explicitly
> specifying the mogene10stv1 CDF env), I get the errors:

In future, please mention the package you are using. I happen to know 
that QCReport() is part of the affyQCReport package, but by neglecting 
to include this bit of information you seriously degrade your chances of 
an answer.

Now on to the answer. ;-D

You are not going to be very satisfied with affyQCReport for this chip, 
as it uses the simpleaffy package for much of the quality control 
output, a good portion of which is based on MAS5 calls. Since the MoGene 
chip is a PM-only chip, you won't be able to compute MAS5 calls, as they 
rely on the matching MM probes, which don't exist. Hence the NA values 
below.
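
To make the MM dependence concrete, here is a toy sketch in base R. It is not the simpleaffy code, and the intensities are made up for illustration. It assumes the usual MAS5 detection recipe: a discrimination score R = (PM - MM) / (PM + MM) per probe pair, tested against a small threshold tau (0.015 by default) with a one-sided Wilcoxon signed rank test. With no MM values, every score is NA, so there is nothing to test.

```r
# Toy illustration only: made-up intensities, not data from this thread.
tau <- 0.015  # default threshold in the MAS5 detection algorithm

# Discrimination score for each PM/MM probe pair.
discrimination_score <- function(pm, mm) (pm - mm) / (pm + mm)

# A probe set on a PM/MM chip: every PM probe has a partner MM probe.
pm <- c(1200, 950, 1430, 800)
mm <- c(300, 410, 520, 390)
r_pm_mm <- discrimination_score(pm, mm)

# The same probe set on a PM-only chip: the MM intensities do not exist.
mm_missing <- rep(NA_real_, length(pm))
r_pm_only <- discrimination_score(pm, mm_missing)

# One-sided Wilcoxon signed rank test of the scores against tau.
# This only makes sense for the PM/MM case; the PM-only scores are all NA.
p_pm_mm <- wilcox.test(r_pm_mm, mu = tau, alternative = "greater")$p.value

print(r_pm_mm)
print(r_pm_only)
print(p_pm_mm)
```

The all-NA scores in the PM-only case are the toy analogue of the NA values in the error output quoted below.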

I believe you will be better off using the arrayQualityMetrics package, 
which is more general.
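
A minimal sketch of what that might look like, assuming the affy, arrayQualityMetrics and mogene10stv1cdf packages are installed. The function name run_pm_only_qc and its two arguments are hypothetical, introduced only for this example. No cdfname is passed, on the assumption that ReadAffy() can work out the CDF package name from the CEL file headers once the package is installed.

```r
# Sketch only: wraps the QC run in a function so nothing happens until it is called.
# 'cel_dir' (directory of CEL files) and 'report_dir' (output directory) are
# hypothetical arguments, not names from this thread.
run_pm_only_qc <- function(cel_dir, report_dir) {
  # Fail early with a clear message if the required packages are missing.
  stopifnot(requireNamespace("affy", quietly = TRUE),
            requireNamespace("arrayQualityMetrics", quietly = TRUE))

  # Read the raw CEL files into an AffyBatch.
  raw <- affy::ReadAffy(celfile.path = cel_dir)

  # Build the HTML quality report; raw intensities are log-transformed first.
  arrayQualityMetrics::arrayQualityMetrics(
    expressionset   = raw,
    outdir          = report_dir,
    force           = TRUE,   # overwrite an existing report directory
    do.logtransform = TRUE
  )
}
```

You would then call something like run_pm_only_qc("path/to/cel/files", "QC_report") and open the index.html file written to the output directory.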

Best,

Jim



>
>> QCReport(ReadAffy(widget=TRUE,cdfname="mogene10stv1cdf"))
> #   or
>> QCReport(ReadAffy(widget=TRUE,cdfname="mogene10stv1"))
> #   (get same error)
>
> Error: NAs in foreign function call (arg 1)
> In addition: Warning messages:
> 1: In data.row.names(row.names, rowsi, i) :
>    some row.names duplicated:
> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,52,53,54,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,102,103,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,141,142,147,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169,170,171,173,175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202,206,207,210,219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250,251,252,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290,291,292,296,297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324,334,337,338,339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373,376,378,382,383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403,405,406,407,409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443,445,447,449,450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493,494,495,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51
> [... truncated]
> 2: In qc.affy(unnormalised, ...) :
>    CDF Environment name ' hgu95av2cdf ' does not match cdfname '
> mogene10stv1cdf '
> Error in plot(qc(object)) :
>    error in evaluating the argument 'x' in selecting a method for
> function 'plot'
>
>
> This: /Error: NAs in foreign function call (arg 1)/
>   seems to imply that:
>
> - there's an error in the '(arg 1)'  but which (arg 1)?
>    If this refers to the arg
> /ReadAffy(widget=TRUE,cdfname="mogene10stv1cdf")/
>    then that part of the command seems to complete fine and returns an
> AffyBatch object as it should
>
>> str(rawdata)
> Formal class 'AffyBatch' [package "affy"] with 10 slots
>    ..@ cdfName          : chr "mogene10stv1cdf"
>    ..@ nrow             : int 1050
>    ..@ ncol             : int 1050
> /etc/
>
>
> - or I have NAs in the data, but doesn't point to where or whether I
> should address them.
> If this is the critical error, I'm guessing I have to choose a
> transform that removes or floor-shifts the NAs into a computational
> form?
>
> - the Warnings:
> 1: In data.row.names(row.names, rowsi, i) :
>    some row.names duplicated: 4,8,9,13,14,15,16,24,25,26,27,28,29,
>    <almost every intervening # omitted>
>    ,513,515,516,51 [... truncated]
>
>
> Would this be related to warning 2 below?
>
>
> 2: In qc.affy(unnormalised, ...) :
>    CDF Environment name ' hgu95av2cdf ' does not match cdfname '
> mogene10stv1cdf '
>
> but if so, what is the proper way to tell QCReport that I'm using a
> non-default CDF?
> the help section for QCReport doesn't describe any params for telling
> it that the CDF env is not 'hgu95av2cdf' and I've tried including that
> info in the ReadAffy() fn as noted:
>
> ie:
>> QCReport(ReadAffy(widget=TRUE,cdfname="mogene10stv1"))
>
>
>
>
>
>> sessionInfo()
> R version 2.11.1 (2010-05-31)
> i486-pc-linux-gnu
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] tools     tcltk     stats     graphics  grDevices utils
> datasets
> [8] methods   base
>
> other attached packages:
>   [1] makecdfenv_1.26.0     tkWidgets_1.26.0      DynDoc_1.26.0
>   [4] widgetTools_1.26.0    hgu95av2cdf_2.6.0     affydata_1.11.10
>   [7] affyQCReport_1.26.0   lattice_0.19-11       RColorBrewer_1.0-2
> [10] affyPLM_1.24.1        preprocessCore_1.10.0 xtable_1.5-6
> [13] simpleaffy_2.24.0     gcrma_2.20.0          genefilter_1.30.0
> [16] mogene10stv1cdf_2.6.2 affy_1.26.1           Biobase_2.8.0
>
> loaded via a namespace (and not attached):
>   [1] affyio_1.16.0        annotate_1.26.1      AnnotationDbi_1.10.2
>   [4] Biostrings_2.16.9    DBI_0.2-5            grid_2.11.1
>   [7] IRanges_1.6.17       RSQLite_0.9-2        splines_2.11.1
> [10] survival_2.35-8
>
>
> Thanks for your consideration.
>

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826


