[BioC] c version of VSN ? and Wolfgang's answer

Wolfgang Huber huber at ebi.ac.uk
Wed May 2 20:44:36 CEST 2007


Dear Roger,

normalization (in the sense of vsn, and of most other people) is a 
procedure between arrays that have the same features, but different 
samples. What you have is different features but the same samples.

Still, you have not answered the questions that I asked in response to yours: 
- what version are you using?
- have you seen and used the "subsample" option?
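
To illustrate what normalizing each array separately with subsampling can look like, here is a minimal Python sketch. It is not the vsn algorithm: vsn fits its parameters with a robust likelihood procedure, whereas this stand-in uses the median and the median absolute deviation. The function names and the default subsample size are hypothetical. It only shows the pattern: estimate the parameters from a random subset of one array's probes, apply an arsinh transform to all probes of that array, then move on to the next array so that only one array is processed at a time.

```python
import math
import random
import statistics


def fit_on_subsample(intensities, subsample, seed=0):
    """Estimate an offset and a scale from a random subset of one array."""
    rng = random.Random(seed)
    if 0 < subsample < len(intensities):
        sample = rng.sample(intensities, subsample)
    else:
        # subsample <= 0 (or larger than the array) means: use all probes.
        sample = list(intensities)
    offset = statistics.median(sample)
    # Median absolute deviation: a simple robust spread estimate.
    # This is a stand-in, NOT the estimator that vsn uses.
    scale = statistics.median(abs(v - offset) for v in sample)
    return offset, scale or 1.0  # guard against a zero scale


def transform_array(intensities, subsample=1000, seed=0):
    """Fit on a subsample, then transform every probe of the array."""
    offset, scale = fit_on_subsample(intensities, subsample, seed)
    return [math.asinh((v - offset) / scale) for v in intensities]


def normalize_separately(arrays, subsample=1000):
    """Yield one transformed array at a time instead of combining them."""
    for intensities in arrays:
        yield transform_array(intensities, subsample)
```

Because `normalize_separately` is a generator, the caller can read, transform and write out each array in turn rather than holding all 30 in memory at once.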

  Best wishes
  Wolfgang

------------------------------------------------------------------
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber

Roger Liu wrote:
> As suggested by Wolfgang, I am sending this question to the Bioconductor
> mailing list, and I would like to hear other experts' feedback. The message
> is below, slightly modified:
> 
> 
> Another question:
> Which do you think is the better way to normalize with vsn? You suggested
> normalizing each array separately, but the 30 arrays in fact form one
> whole-genome data set. Would normalizing them as a whole data set be
> better?
> 
>> I have been using the vsn package of R to normalize NimbleGen's ENCODE
>> tiling arrays (383000 probes), and vsn gave us very good results. Now we
>> are trying data analysis for NimbleGen whole genome tiling arrays, which
>> is a very large dataset (30 arrays, more than 22000000 probes in total).
>> I combined all of these array intensity data together and tried doing
>> within-array vsn normalization on this combined large dataset, since in
>> fact the 30 arrays should be considered a whole array set. Apparently,
>> because of the size of the data set, I cannot run the vsn package of R
>> (memory problem). Therefore, I am wondering if you have a C version of
>> vsn or any other method to deal with such huge datasets.
>>
>> Thank you very much.
>>
>> roger
> 
> 
> 
> Which version are you using? Please consider version >= 2.0.35 (from
> www.bioconductor.org), the function "vsn2" is already written in C. Also
> please consult the function parameter "subsample".
> 
> I think you should normalize each array separately. I have very
> successfully used vsn for tiling arrays with 6.5 million probes.
> 
> Also, for this type of question of general interest, please use the
> bioconductor mailing list! (See link on http://www.bioconductor.org)
> That way you get feedback from many other experts as well, other users
> also benefit, and the discussion remains searchable in the archives.
> 
> Best wishes
>    Wolfgang


