[BioC] unable to set sampleNames after combine (from beadarray package) on ExpressionSetIllumina

Mike Smith grimbough at gmail.com
Thu Sep 11 17:41:48 CEST 2014


Thanks for the easily reproduced bug report, Adai, and for the neat
solution, Martin. This has been fixed in beadarray version 2.15.4.
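
For anyone still on a beadarray older than 2.15.4, the workaround Martin describes further down the thread can be wrapped up roughly as follows. This is only a sketch: the helper name combineWithProtocolData is made up for illustration and is not part of beadarray, and it assumes raw1 and raw2 are the ExpressionSetIllumina objects from Adai's example.

```r
library(Biobase)    # protocolData(), sampleNames()
library(beadarray)  # combine() method for ExpressionSetIllumina

# Hypothetical helper (not part of beadarray): combine two objects and
# rebuild protocolData if the combined object fails validation.
combineWithProtocolData <- function(x, y) {
    out <- combine(x, y)
    # With test = TRUE, validObject() returns a character vector describing
    # the problems instead of raising an error.
    problems <- validObject(out, test = TRUE)
    if (is.character(problems)) {
        # Workaround from the thread: combine the protocolData slots by hand
        protocolData(out) <- combine(protocolData(x), protocolData(y))
    }
    validObject(out)  # raises an error if the object is still invalid
    out
}

raw <- combineWithProtocolData(raw1, raw2)
sampleNames(raw) <- paste0("Sample", 1:5)
```

Running packageVersion("beadarray") shows which version is installed. With 2.15.4 or later the validity check should pass straight after combine(), in which case the helper leaves protocolData untouched.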

Mike

On 11 September 2014 12:17, Adaikalavan Ramasamy <adaikalavan.ramasamy at gmail.com> wrote:

> That's great. Thanks for sending the solution.
>
> On Wed, Sep 10, 2014 at 8:45 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>
> > Hi Adai --
> >
> >
> > On 09/09/2014 03:52 AM, Adaikalavan Ramasamy wrote:
> >
> >> Dear all,
> >>
> >> Here is a possible bug in the combine() function from beadarray. I read
> >> two ExpressionSetIllumina objects with 36 and 12 samples respectively.
> >> The combine() function works brilliantly and without errors or warnings,
> >> but I get an error message when I try to change the sample names.
> >>
> >> ## create fake data to read in ##
> >> tmp <- data.frame( PROBE_ID = paste0("P", 1:10), SYMBOL = LETTERS[1:10],
> >>                    S1.AVG_Signal = rnorm(10, mean=7),
> >>                    S2.AVG_Signal = rnorm(10, mean=8),
> >>                    S3.AVG_Signal = rnorm(10, mean=6) )
> >> write.table(tmp, file="SampleProbeProfile_1.txt", sep="\t", quote=FALSE,
> >>             row.names=FALSE)
> >> rm(tmp)
> >>
> >> tmp <- data.frame( PROBE_ID = paste0("P", 1:10), SYMBOL = LETTERS[1:10],
> >>                    S4.AVG_Signal = rnorm(10, mean=9),
> >>                    S5.AVG_Signal = rnorm(10, mean=6) )
> >> write.table(tmp, file="SampleProbeProfile_2.txt", sep="\t", quote=FALSE,
> >>             row.names=FALSE)
> >> rm(tmp)
> >>
> >>
> >> ## Read in and combine ##
> >> raw1 <- readBeadSummaryData(dataFile="SampleProbeProfile_1.txt",
> >> ProbeID="PROBE_ID", columns=list(exprs="AVG_Signal"), skip=0)
> >> raw2 <- readBeadSummaryData(dataFile="SampleProbeProfile_2.txt",
> >> ProbeID="PROBE_ID", columns=list(exprs="AVG_Signal"), skip=0)
> >>
> >> raw  <- combine(raw1, raw2)  # no warnings or error
> >>
> >> dim(raw1)
> >> # Features  Samples Channels
> >> #       10        3        1
> >>
> >> dim(raw2)
> >> # Features  Samples Channels
> >> #       10        2        1
> >>
> >> dim(raw)
> >> # Features  Samples Channels
> >> #       10        5        1
> >>
> >> raw1, raw2 and raw are all of ExpressionSetIllumina class.
> >>
> >>
> >> And here is the problem:
> >>
> >> sampleNames(raw) <- paste0("Sample", 1:5)
> >> # Error in `sampleNames<-`(`*tmp*`, value = c("Sample1", "Sample2", "Sample3",  :
> >> #  number of new names (5) should equal number of rows in AnnotatedDataFrame (3)
> >>
> >
> > The problem is that beadarray does not 'combine' the 'protocolData' slot
> > and does not check that the resulting object is valid:
> >
> > > validObject(raw)
> > Error in validObject(raw) :
> >   invalid class "ExpressionSetIllumina" object: 1: sample numbers differ between phenoData and protocolData
> > invalid class "ExpressionSetIllumina" object: 2: sampleNames differ between phenoData and protocolData
> >
> > A work-around is to update the protocolData slot yourself, until the
> > package maintainer (cc'd) has a chance to fix the problem.
> >
> > > protocolData(raw) <- combine(protocolData(raw1), protocolData(raw2))
> > > validObject(raw)
> > [1] TRUE
> > > sampleNames(raw) <- 1:5
> >
> > Thanks for the nice reproducible example.
> >
> > Martin
> >
> >>
> >>
> >> Alternatively, I could change the rownames of raw1 and raw2 separately
> >> and then combine, but I am just curious as to why this error message
> >> appears. Thank you.
> >>
> >> Regards, Adai
> >>
> >>         [[alternative HTML version deleted]]
> >>
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at r-project.org
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>
> >>
> >
> > --
> > Computational Biology / Fred Hutchinson Cancer Research Center
> > 1100 Fairview Ave. N.
> > PO Box 19024 Seattle, WA 98109
> >
> > Location: Arnold Building M1 B861
> > Phone: (206) 667-2793
> >
>



-- 
Mike Smith
Research Associate
Statistics & Computational Biology Laboratory
Cambridge University
