[BioC] c version of VSN ? and Wolfgang's answer
Wolfgang Huber
huber at ebi.ac.uk
Wed May 2 20:44:36 CEST 2007
Dear Roger,
normalization (in the sense of vsn, and of most other people) is a
procedure between arrays that have the same features but different
samples. What you have is different features but the same samples.
Still, you have not answered the questions that I asked in response to yours:
- what version are you using?
- have you seen and used the "subsample" option?
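
For readers unfamiliar with the idea behind "subsample", here is a rough sketch in Python rather than R: estimate the transformation parameters from a random subset of probes, then apply the fitted transformation to every probe. Everything in it is an assumption made for illustration: the function names fit_on_subsample and transform are hypothetical, the data are synthetic, and the median / median-absolute-deviation estimator is a simple stand-in, not the likelihood fit that vsn actually performs.

```python
import math
import random
import statistics


def fit_on_subsample(rows, subsample, seed=0):
    """Estimate one (offset, scale) pair per column from a random subset of rows.

    Stand-in estimator (median and median absolute deviation); NOT vsn's fit.
    subsample <= 0 means "use all rows".
    """
    rng = random.Random(seed)
    if 0 < subsample < len(rows):
        chosen = rng.sample(rows, subsample)
    else:
        chosen = rows
    params = []
    for column in zip(*chosen):
        offset = statistics.median(column)
        spread = statistics.median([abs(v - offset) for v in column])
        params.append((offset, spread if spread > 0 else 1.0))
    return params


def transform(rows, params):
    """Apply the fitted per-column calibration, then arsinh, to every row."""
    return [
        [math.asinh((v - offset) / scale) for v, (offset, scale) in zip(row, params)]
        for row in rows
    ]


# Synthetic intensities: 20000 probes, 2 columns (purely illustrative data).
rng = random.Random(1)
rows = [[rng.lognormvariate(7, 1), rng.lognormvariate(7.2, 1)] for _ in range(20000)]

params = fit_on_subsample(rows, subsample=2000)  # estimation looks at 2000 rows only
normalized = transform(rows, params)             # transformation covers all 20000 rows
```

The point of the design is that the expensive estimation step sees only the subsample, while the cheap transformation step still covers the full data set.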
Best wishes
Wolfgang
------------------------------------------------------------------
Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber
Roger Liu wrote:
> As suggested by Wolfgang, I am sending this question to the Bioconductor
> mailing list, and I would like to hear other experts' feedback. The message
> is below, with a few modifications:
>
>
> Another question:
> Which do you think is the better way to normalize with vsn? You suggested
> normalizing each array separately, but the 30 arrays together form one
> whole-genome data set. Would normalizing them as a whole data set be
> better?
>
>> I have been using the vsn package of R to normalize NimbleGen's ENCODE tiling
>> arrays (383000 probes), and vsn gave us very good results. Now we are trying
>> data analysis for NimbleGen whole genome tiling arrays, which is a very large
>> dataset (30 arrays, more than 22000000 probes in total). I combined all of
>> these array intensity data together and tried doing within-array vsn
>> normalization for this combined large dataset, since in fact the 30 arrays
>> should be considered a whole array set. Apparently, because of the size of
>> the data set, I cannot run the vsn package of R (memory problem). Therefore,
>> I am wondering if you have a C version of vsn or any other method to deal
>> with such huge datasets.
>>
>> Thank you very much.
>>
>> roger
>
>
>
> which version are you using? Please consider version >= 2.0.35 (from
> www.bioconductor.org), the function "vsn2" is already written in C. Also
> please consult the function parameter "subsample".
>
> I think you should normalize each array separately. I have very
> successfully used vsn for tiling arrays with 6.5 million probes.
>
> Also, for this type of question of general interest, please use the
> bioconductor mailing list! (See link on http://www.bioconductor.org)
> That way you get feedback from many other experts as well, other users
> also benefit, and the discussion remains searchable in the archives.
>
> Best wishes
> Wolfgang