[BioC] Error using pdInfoBuilder v1.14.1 for NimbleGen Array

Fri Feb 11 18:40:48 CET 2011

pdInfoBuilder expects both the NDF and the POS files to have the "SEQ_ID" and "PROBE_ID" fields.

It should be okay if not all probes are in the POS file.
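
A minimal sketch for checking both points, assuming the NDF and POS files are tab-delimited with a header row. The helpers missing_columns() and pos_coverage() are hypothetical names made up for illustration; they are not part of pdInfoBuilder.

```r
# Hypothetical helpers, not part of pdInfoBuilder.
# Assumption: both files are tab-delimited text with a header row.

# Return the required column names that are absent from the file header.
missing_columns <- function(path, required = c("SEQ_ID", "PROBE_ID")) {
  # Reading one row is enough to get the header; check.names = FALSE
  # keeps the column names exactly as written in the file.
  header <- names(read.delim(path, nrows = 1, check.names = FALSE,
                             stringsAsFactors = FALSE))
  setdiff(required, header)
}

# Fraction of NDF probes that also appear in the POS file (by PROBE_ID).
pos_coverage <- function(ndf_path, pos_path) {
  ndf <- read.delim(ndf_path, stringsAsFactors = FALSE, check.names = FALSE)
  pos <- read.delim(pos_path, stringsAsFactors = FALSE, check.names = FALSE)
  mean(ndf[["PROBE_ID"]] %in% pos[["PROBE_ID"]])
}

# Usage with the file names from this thread:
# missing_columns("101219_HG_Orion_SS_CGH.ndf")
# missing_columns("NDF_OGHA.pos")
# pos_coverage("101219_HG_Orion_SS_CGH.ndf", "NDF_OGHA.pos")
```

An empty result from missing_columns() means both fields are present. A coverage below 1 is not a problem by itself, but a coverage of 0 would suggest that the PROBE_ID values in the two files do not match at all.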

The message saying that only 2 rows were added to the featureSet table
concerns me. It suggests that something unexpected is present in the
SEQ_ID field of the NDF file (my guess is that the values are actually
absent).
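
One way to look at this is to count the blank and distinct SEQ_ID values in the NDF. This is only a sketch: summarize_seq_id() is a hypothetical helper, and it assumes the NDF has already been read into a data frame (for example with read.delim()).

```r
# Hypothetical helper, not part of pdInfoBuilder.
# Assumption: `ndf` is a data frame holding the NDF contents.
summarize_seq_id <- function(ndf) {
  stopifnot("SEQ_ID" %in% names(ndf))
  seq_id <- as.character(ndf[["SEQ_ID"]])
  # Treat both NA and empty/whitespace-only strings as blank.
  blank <- is.na(seq_id) | trimws(seq_id) == ""
  list(n_rows     = nrow(ndf),
       n_blank    = sum(blank),
       n_distinct = length(unique(seq_id[!blank])))
}

# Usage with the NDF from this thread:
# ndf <- read.delim("101219_HG_Orion_SS_CGH.ndf", stringsAsFactors = FALSE)
# summarize_seq_id(ndf)
```

If n_blank is close to n_rows, or n_distinct is very small for a tiling design, that would be consistent with only 2 rows ending up in featureSet.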

The error when adding records to the pmfeature table (0 rows were
selected for insertion) may be caused by missing entries in the
MISMATCH field (NDF) or by unexpected values in PROBE_CLASS (NDF).
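
A quick way to inspect those two fields is to tabulate them, including missing values. This is a sketch only: tabulate_probe_fields() is a hypothetical helper, it assumes the NDF is already loaded as a data frame, and it makes no claim about which values pdInfoBuilder accepts.

```r
# Hypothetical helper, not part of pdInfoBuilder.
# Assumption: `ndf` is a data frame holding the NDF contents.
tabulate_probe_fields <- function(ndf,
                                  fields = c("MISMATCH", "PROBE_CLASS")) {
  stopifnot(all(fields %in% names(ndf)))
  # One frequency table per field; useNA = "ifany" also counts NA entries.
  lapply(ndf[fields], function(x) table(x, useNA = "ifany"))
}

# Usage with the NDF from this thread:
# ndf <- read.delim("101219_HG_Orion_SS_CGH.ndf", stringsAsFactors = FALSE)
# tabulate_probe_fields(ndf)
```

Counts under NA or under an empty string in MISMATCH, or class labels in PROBE_CLASS that you did not expect, are the things to look for.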

If you prefer, please contact me off-list so we can sort this out.
Once it's solved, I'll post the solution back here.

benilton

2011/2/11 Kate Turner <kturner at oriongenomics.com>:
> Hello,
>
> After my update, I still find I get the same error message:
>
>
> ============================================================================
> =============================
> Building annotation package for Nimblegen Tiling Array
> NDF: 101219_HG_Orion_SS_CGH.ndf
> POS: NDF_OGHA.pos
> XYS: 467246_Slot2_Cycle1_Orion_2011-02-07_532.xys
> ============================================================================
> =============================
> Parsing file: 101219_HG_Orion_SS_CGH.ndf... OK
> Parsing file: NDF_OGHA.pos... OK
> Merging NDF and POS files... OK
> Parsing file: 467246_Slot2_Cycle1_Orion_2011-02-07_532.xys... OK
> Creating package in /Users/kturner/Desktop//pd.101219.hg.orion.ss.cgh
> Inserting 2 rows into table featureSet... OK
> Inserting 0 rows into table pmfeature... Error in sqliteExecStatement(con, statement, bind.data) :
>  bind.data must have non-zero dimensions
>
> My session Info is:
>
>> sessionInfo()
> R version 2.12.1 (2010-12-16)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] pdInfoBuilder_1.14.1 oligo_1.14.0         oligoClasses_1.12.2  affxparser_1.22.1
> [5] RSQLite_0.9-4        DBI_0.2-5            Biostrings_2.18.2    IRanges_1.8.8
> [9] Biobase_2.10.0
>
> loaded via a namespace (and not attached):
> [1] affyio_1.18.0         preprocessCore_1.12.0 splines_2.12.1        tools_2.12.1
>
> And traceback gives:
>
>> traceback()
> 8: stop("bind.data must have non-zero dimensions")
> 7: sqliteExecStatement(con, statement, bind.data)
> 6: sqliteQuickSQL(conn, statement, bind.data, ...)
> 5: dbGetPreparedQuery(conn, sql_template, bind.data = data)
> 4: dbGetPreparedQuery(conn, sql_template, bind.data = data)
> 3: dbInsertDataFrame(conn, "pmfeature", parsedData[["pmFeatures"]],
>       tiledRegionPmFeatureSchema[["col2type"]], !quiet)
> 2: makePdInfoPackage(seed, destDir = "/Users/kturner/Desktop/")
> 1: makePdInfoPackage(seed, destDir = "/Users/kturner/Desktop/")
>
> Has anyone encountered this error previously?
>
> Does anyone know if pdInfoBuilder expects a particular structure for a .pos
> file. For instance, does there need to be 1 row for each feature on the
> array irrespective of duplication and do blanks need to be annotated with
> N/A for instance.
>
> Many Thanks,
>
> Kate
>
>
> On 2/10/11 2:12 PM, "Vincent Carey" <stvjc at channing.harvard.edu> wrote:
>
>> You will probably get better assistance if you upgrade R to 2.12.1 and
>> then update your pdInfoBuilder which is now at 1.14.1 for release.
>>
>> On Thu, Feb 10, 2011 at 2:56 PM, Kate Turner <kturner at oriongenomics.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> I am trying to build an info package for a NimbleGen Tiling array. The
>>> experiment is a dye swap, so I have a pair of xys files for each sample. The
>>> array is custom built.
>>>
>>> I have:
>>> 1.  xys files generated from pair files by Nimblescan.
>>> 2. An ndf file
>>> 3. A pos file.
>>>
>>> The pos file wasn't generated by NimbleGen. I had to generate it myself and
>>> used a pos file I found on line as an example. I saved my file as a tab
>>> delimited txt and changed the .txt to .pos in Aquamacs.
>>>
>>>
>>> When I try and generate the annotation package I get the following error
>>> message:
>>>
>>>
>>> Building annotation package for Nimblegen Tiling Array
>>> NDF: 101219_HG_Orion_SS_CGH.ndf
>>> POS: NDF_OGHA.pos
>>> XYS: 467246_Slot2_Cycle1_Orion_2011-02-07_532.xys
>>> ============================================================================
>>> =============================
>>> Parsing file: 101219_HG_Orion_SS_CGH.ndf... OK
>>> Parsing file: NDF_OGHA.pos... OK
>>> Merging NDF and POS files... OK
>>> Parsing file: 467246_Slot2_Cycle1_Orion_2011-02-07_532.xys... OK
>>> Creating package in
>>> /Users/kturner/Desktop/Project_I//pd.101219.hg.orion.ss.cgh
>>> Inserting 2 rows into table featureSet... OK
>>> Inserting 0 rows into table pmfeature... Error in sqliteExecStatement(con, statement, bind.data) :
>>>  bind.data must have non-zero dimensions
>>>
>>> The above error message suggests that the program is having issues creating
>>> the featureSet, pmfeature and bgfeature. I have generated an annotation
>>> package for a custom built expression array and that was successful.  In the
>>> case of the expression annotation package there were >1000 rows in all
>>> tables associated with the build.
>>>
>>> In the output associated with the expression array annotation package, the
>>> ndf and the xys files were merged. I am wondering if the xys file for the
>>> tiling array should also be merged to the tiling array ndf and pos files?
>>>
>>> The global structure of my ndf and xys files for the tiling array are
>>> similar to those for the expression array. Additionally the structure of the
>>> xys looks similar to that in the post by Alex Rodriquez in Feb. 2010, so I
>>> am assuming (perhaps incorrectly) that it isn't the xys or the ndf that are
>>> the issue, but the pos file.
>>>
>>> In my pos file, I have each feature represented, even if it is duplicated. I
>>> also have  features that do not have a known chromosome position, and these
>>> features have a blank for some of their variables in the file.  I have been
>>> told that NimbleGen pos files don't have to have the same number of features
>>> found on the array itself.
>>>
>>> Does anyone know if pdInfoBuilder expects a particular structure for a .pos
>>> file. For instance, does there need to be 1 row for each feature on the
>>> array irrespective of duplication and do blanks need to be annotated with
>>> N/A for instance.
>>>
>>>
>>> I am new to building  annotation packages and wondered if someone with more
>>> experience could help?
>>>
>>>
>>> My session info is:
>>>
>>> R version 2.11.1 (2010-05-31)
>>> x86_64-apple-darwin9.8.0
>>>
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>> [1] pdInfoBuilder_1.12.0 oligo_1.12.2         oligoClasses_1.10.0  affxparser_1.20.0
>>> [5] RSQLite_0.9-2        DBI_0.2-5            Biostrings_2.16.9    IRanges_1.6.17
>>> [9] Biobase_2.8.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] affyio_1.16.0         preprocessCore_1.10.0 splines_2.11.1
>>>
>>>
>>> and output of traceback is:
>>>
>>> 8: stop("bind.data must have non-zero dimensions")
>>> 7: sqliteExecStatement(con, statement, bind.data)
>>> 6: sqliteQuickSQL(conn, statement, bind.data, ...)
>>> 5: dbGetPreparedQuery(conn, sql_template, bind.data = data)
>>> 4: dbGetPreparedQuery(conn, sql_template, bind.data = data)
>>> 3: dbInsertDataFrame(conn, "pmfeature", parsedData[["pmFeatures"]],
>>>       tiledRegionPmFeatureSchema[["col2type"]], !quiet)
>>> 2: makePdInfoPackage(seed, destDir = "/Users/kturner/Desktop/")
>>> 1: makePdInfoPackage(seed, destDir = "/Users/kturner/Desktop/")
>>>
>>> Many thanks in advance for any help,
>>>
>>> Kate
>>>
>>> Orion Genomics
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>