[BioC] xps - error in normalize - "Error: Length of non-varying units is zero."

Wed Jul 23 22:10:23 CEST 2014

Dear Matt,

When you do piecewise processing that does not follow the usual rma() or 
mas5() steps, it is important to read the vignette 'xpsPreprocess.pdf'. 
Even then, which option(s) you can use for which function depends on the 
type of array. It is also important to check the verbose output and to do 
some quality control after each step.

For example your script for background correction, i.e.:

 > # Background correct
 > data_bg_pm <- bgcorrect(data.genome, "Background_Step1",
 >                         filedir=outdir, tmpdir="",
 >                         method="sector", select="pmonly",
 >                         option="correctbg", params=c(0.02, 4, 4, 0),
 >                         exonlevel="all", verbose=TRUE)

results in the following output (I show only the most important part):

 >       background statistics:
 >          2598544 cells with minimal intensity 0
 >          2598544 cells with maximal intensity 0

This means that no background was subtracted. Creating an image of the 
background would have revealed this.

In this case you need to use select="all" instead of select="pmonly":

 > data_bkgrd <- bgcorrect(data.genome, "Background_All",
 >                         filedir=outdir, tmpdir="",
 >                         method="sector", select="all",
 >                         option="correctbg", params=c(0.02, 4, 4, 0),
 >                         exonlevel="all", verbose=TRUE)

Now you get the following output:

 >       background statistics:
 >          162409 cells with minimal intensity 25.1364
 >          162409 cells with maximal intensity 26.4606
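
If you capture the verbose output as text, a check like this can be automated. The following is only a minimal sketch in base R: the helper names parse_bg_stats() and bg_was_subtracted() are hypothetical and not part of xps, and the code assumes that the statistics lines have exactly the format shown above.

```r
# Hypothetical helpers (not part of xps): read the "background statistics"
# lines from captured verbose output and test whether any background
# was actually estimated.
parse_bg_stats <- function(lines) {
  pat <- paste0("^[[:space:]]*([0-9]+) cells with (minimal|maximal) ",
                "intensity ([0-9.eE+-]+)[[:space:]]*$")
  hits <- regmatches(lines, regexec(pat, lines))
  hits <- hits[lengths(hits) == 4]   # keep only lines that matched
  vals <- vapply(hits, function(h) as.numeric(h[4]), numeric(1))
  names(vals) <- vapply(hits, function(h) h[3], character(1))
  vals
}

# A background that is zero everywhere means nothing was subtracted.
bg_was_subtracted <- function(stats) any(stats != 0)

# The two outputs quoted in this mail, typed in as plain text:
out_pmonly <- c("      background statistics:",
                "         2598544 cells with minimal intensity 0",
                "         2598544 cells with maximal intensity 0")
out_all    <- c("      background statistics:",
                "         162409 cells with minimal intensity 25.1364",
                "         162409 cells with maximal intensity 26.4606")

bg_was_subtracted(parse_bg_stats(out_pmonly))
bg_was_subtracted(parse_bg_stats(out_all))
```

The first call flags the select="pmonly" run, the second one accepts the select="all" run.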

Regarding the 'Probe-level Normalization' step, I am currently not sure 
what causes the error that you get. It does work for ivt-arrays, and 
'quantile' normalization also works for whole genome arrays; I have just 
tested this again. However, for 'mean' normalization I can reproduce your 
error. I have to investigate; it may simply be a problem of finding the 
right parameters.
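
To make clear what a 'mean' normalization does in general, here is a small sketch in base R that scales one array to the trimmed mean of a reference. This is only an illustration under my own assumptions and not the code used inside xps: the function name trimmed_mean_scale() is hypothetical, the toy intensities are made up, and reading the first value of params (0.02) as a trim fraction is an assumption.

```r
# Hypothetical illustration (not the xps implementation): scale array x so
# that its trimmed mean equals the trimmed mean of the reference array.
trimmed_mean_scale <- function(x, ref, trim = 0.02) {
  sf <- mean(ref, trim = trim) / mean(x, trim = trim)   # scaling factor SF
  list(sf = sf, scaled = x * sf)
}

# Toy intensities: x is exactly twice as bright as the reference.
ref <- seq(50, 5000, by = 50)
x   <- 2 * ref

res <- trimmed_mean_scale(x, ref)
res$sf
```

In this toy case the scaling factor is 0.5, and after scaling x has the same trimmed mean as the reference.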

If you skip this step, you can do the Summarization as follows:

 > # Summarization
 > data_sum <- summarize(data_bkgrd, "Summary_Step3", filedir=outdir,
 >                       tmpdir="", update = FALSE,
 >                       select="pmonly", method = "medianpolish",
 >                       option = "transcript",
 >                       logbase="log2", params=c(10, 0.01, 1.0),
 >                       exonlevel="core+affx", verbose=TRUE)

As a last step you could even do 'Probeset-level Normalization':

 > data_norm <- normalize(data_sum, "Sum_Norm_Mean", filedir=outdir,
 >                        tmpdir="", update = FALSE,
 >                        select = "separate", exonlevel="core+affx",
 >                        method="mean", option = "transcript:all",
 >                        logbase = "0", refindex = 0, refmethod = "mean",
 >                        params = list(0.02, 0), verbose=TRUE)

Best regards,
Christian

On 7/22/14 12:23 AM, Thornton, Matthew wrote:
> Hello!
>
> I am trying to optimize my data processing with xps. I am getting an error when using the normalize function. It could be due to improper switches.
>
> here is the error:
>
>> data_norm <- normalize(data_bkgrd, "Normalize_Step2", filedir=outdir, tmpdir="", update = FALSE, select = "pmonly", exonlevel="all", method="mean", option = "transcript:all", logbase = "0", refindex = 0, refmethod = "mean", params = list(0.02, 0), verbose=TRUE)
> Opening file </data/met/scmdir/scheme_RaGene20stv1.root> in <READ> mode...
> Creating new file </data/met/RA/21July14/strat/Normalize_Step2.root>...
> Opening file </data/met/RA/21July14/strat/Background_Step1.root> in <READ> mode...
> Preprocessing data using method <preprocess>...
>     Normalizing raw data...
>        normalizing data using method <mean>...
>           filling array <Reference>...
>        normalizing <CTR1_Mix2_25Apr14.int>...
>        setting selector mask for typepm <16316>
>        normalization <Mean>: Scaling factor SF is <0.859736>
>        normalizing <CTR2_Mix2_25Apr14.int>...
>        setting selector mask for typepm <16316>
> Error: Length of non-varying units is zero.
> An error has occured: Need to abort current process.
> Error in .local(object, ...) : error in rwrapper function ‘Normalize’
>
> Here are the lines in my Rscript for piecewise processing. I am using the default settings but it would be nice to know more about how to optimize them.
>
> # Background correct
> data_bkgrd <- bgcorrect(data_raw, "Background_Step1", filedir=outdir, tmpdir="", method="sector", select="pmonly", option="correctbg", params=c(0.02, 4, 4, 0), exonlevel="all", verbose=TRUE)
>
> png(file="Background_Correction_Density_Plot.png", width=600, height=600)
> par(mar=c(6,3,1,1));
> hist(data_bkgrd, add.legend=TRUE)
> dev.off()
>
> # Normalization
> data_norm <- normalize(data_bkgrd, "Normalize_Step2", filedir=outdir, tmpdir="", update = FALSE, select = "all", exonlevel="all", method="mean", option = "transcript:all", logbase = "0", refindex = 0, refmethod = "mean", params = list(0.02, 0), verbose=TRUE)
>
> png(file="Normalization_Density_Plot.png", width=600, height=600)
> par(mar=c(6,3,1,1));
> hist(data_norm, add.legend=TRUE)
> dev.off()
>
> # Summarization
> data_sum <- summarize(data_norm, "Summary_Step3", filedir=outdir, tmpdir="", update = FALSE, select="pmonly", method = "medianpolish", option = "transcript", exonlevel="core+affx", verbose=TRUE)
>
> png(file="Summary_Density_Plot.png", width=600, height=600)
> par(mar=c(6,3,1,1));
> hist(data_sum, add.legend=TRUE)
> dev.off()
>
> Any comments or advice are greatly appreciated!
>
> Thanks!
>
> Matt
>
>
> matthew.thornton at med.usc.edu