[BioC] GEOquery - was queryGEO fails on GDS files (GEO Datasets)

Ting-Yuan Liu tliu at fhcrc.org
Thu Jan 12 19:39:52 CET 2006


Hi Peter,

For Question 2: the warning appears because GEOquery is not part of the 
BioC 1.7 release.  It currently lives in the BioC devel (1.8) repository, 
where packages are built with the R devel (2.3) version.  

I think you can ignore the warning message at this stage.  If you are 
really concerned about it, you can install the R devel version on your XP 
machine and run GEOquery there.  We recommend installing BioC devel 
packages on the R devel version, and BioC stable packages on the R stable 
version.  
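
If you want to see the mismatch for yourself, something along these lines 
should work.  It is only a sketch: it compares the R version recorded in a 
package's Built field with the R you are running.  I use the "stats" 
package so that the example runs anywhere; substituting "GEOquery" assumes 
it is installed on your machine.

```r
# Sketch: compare the R version a package was built under with the running R.
# "stats" ships with R, so this runs anywhere; swap in "GEOquery" if installed.
pkg <- "stats"

# The Built field looks like "R 2.3.0; ; <build date>; <platform>".
built <- utils::packageDescription(pkg, fields = "Built")
built_r <- sub("^R ([0-9.]+);.*$", "\\1", built)

running_r <- as.character(getRversion())

# compareVersion() returns 1 when its first argument is the newer version.
if (utils::compareVersion(built_r, running_r) > 0) {
  message("Package '", pkg, "' was built under R ", built_r,
          ", which is newer than the running R ", running_r)
} else {
  message("Package '", pkg, "' was built under R ", built_r,
          "; running R is ", running_r)
}
```

As far as I know, library() only warns when the build version is newer 
than the running one, which is the situation with R 2.1.1 and a package 
built under 2.3.0.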

HTH,
Ting-Yuan
______________________________________
Ting-Yuan Liu
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
Seattle, WA, USA
______________________________________

On Wed, 11 Jan 2006, Peter wrote:

> Sean Davis wrote:
>  >Peter,
>  >
>  >I have recently uploaded a new package to bioconductor called GEOquery.
> 
> I've had a little play - very nice work.  Cheers.  Just a few 
> queries/questions for you...
> 
> I never did work out how to load the package from the source files, but 
> I noticed there is now a Windows binary package on the website...
> 
> http://www.bioconductor.org/packages/bioc/1.8/html/GEOquery.html
> 
> I downloaded the ZIP file and installed it on Windows XP with R 2.1.1 
> and got the following warning:
> 
> package 'GEOquery' successfully unpacked and MD5 sums checked
> updating HTML package descriptions
> Warning message:
> no package 'file15658' was found in: packageDescription(i, fields = 
> "Title", lib.loc = lib)
> 
> Question One
> ------------
> Is the above "no package" warning important?
> 
> -------------------------------------------------------------------
> 
> Question Two
> ------------
> 
>  > library(GEOquery)
> Warning message:
> package 'GEOquery' was built under R version 2.3.0
> 
> Does the version of R matter?  I assume R version 2.3.0 is the 
> development version of R, as 2.2.1 is the latest official release.
> 
> -------------------------------------------------------------------
> 
> Question Three
> --------------
> 
>  > gds37 <- getGEO('GDS37', destdir="c:/temp/geo")
> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/data/gds/soft_gz/GDS37.soft.gz'
> ftp data connection made, file length 132384 bytes
> opened URL
> downloaded 129Kb
> 
> File stored at:
> c:/temp/geo/GDS37.soft.gz
> c:/temp/geo/GDS37.soft.gz
> parsing geodata
> parsing subsets
> ready to return
> 
> Why does it print the file location twice?
> 
> -------------------------------------------------------------------
> 
> Question Four
> -------------
> If I repeat the command getGEO, why does it re-download the file?
> 
>  > gds37 <- getGEO('GDS37', destdir="c:/temp/geo")
> 
> I would personally have written the getGEO code to check in the 
> destination folder for the files GDS37.soft or GDS37.soft.gz and just 
> load the local copy if it existed.
> 
> I know I should use the following instead:
> 
>  > gds37 <- getGEO(filename="c:/temp/geo/gds37.soft.gz")
> 
> 
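
Until something like that exists in the package, the check Peter describes 
can be done by hand.  The helper below is a hypothetical sketch, not part 
of GEOquery: findLocalGDS is a name made up for illustration, and it only 
looks for the two file names mentioned above in the destination folder.

```r
# Hypothetical helper (not part of GEOquery): return the path of a local
# copy of a GEO SOFT file if one exists in destdir, otherwise NULL.
findLocalGDS <- function(geo, destdir) {
  # Look for both the uncompressed and the gzipped file name.
  candidates <- file.path(destdir, paste0(geo, c(".soft", ".soft.gz")))
  hit <- candidates[file.exists(candidates)]
  if (length(hit) > 0) hit[1] else NULL
}

# Intended use, with the two getGEO() call forms shown in this thread:
#   local <- findLocalGDS("GDS37", "c:/temp/geo")
#   gds37 <- if (is.null(local)) {
#     getGEO("GDS37", destdir = "c:/temp/geo")   # download
#   } else {
#     getGEO(filename = local)                   # reuse the local copy
#   }
```

The uncompressed name is listed first, so it wins if both files are 
present; that ordering is an arbitrary choice.
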
> -------------------------------------------------------------------
> 
> Question Five
> -------------
> I like how you have handled converting subset information into phenotype 
> data in GDS2eSet.
> 
> Have you considered also parsing the "description" to extract the 
> "Alternative Sample Name" and the "Sample Source"?
> 
> As far as I can tell, all the current NCBI GDS files use the same format 
> for the description lines:
> 
> "Value for SAMPLENAME: ALTNAME; src: SOURCE"
> 
> On the other hand, this is clearly not a "defined field" and is subject 
> to change.  Maybe automatically parse the lines if and only if it 
> follows that format?
> 
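
The "if and only if" idea could be sketched with a regular expression as 
below.  This is an illustration under assumptions, not GEOquery code: 
parseGDSDescription is a hypothetical name, the pattern is taken from the 
format quoted above, and the example strings are invented.

```r
# Hypothetical helper (not part of GEOquery): split description lines of the
# form "Value for SAMPLENAME: ALTNAME; src: SOURCE" into three columns.
# Lines that do not follow that format yield NA in every column.
parseGDSDescription <- function(desc) {
  pattern <- "^Value for ([^:]+): (.*); src: (.*)$"
  ok <- grepl(pattern, desc)
  pick <- function(group) {
    ifelse(ok, sub(pattern, group, desc), NA_character_)
  }
  data.frame(
    sample   = pick("\\1"),
    alt_name = pick("\\2"),
    source   = pick("\\3"),
    stringsAsFactors = FALSE
  )
}

# Invented example lines, one matching the format and one not.
example <- c("Value for GSM123: sample A; src: liver",
             "free text in some other layout")
parsed <- parseGDSDescription(example)
print(parsed)
```

Returning NA for non-matching lines keeps the result aligned with the 
input, so the columns could be added to the phenotype data without 
guessing at lines in another layout.
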
> -------------------------------------------------------------------
> 
> Thanks again - GEOquery looks like it will be very handy...
> 
> Peter
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>


