[BioC] SomaticSignatures

Dan Tenenbaum dtenenba at fhcrc.org
Thu Mar 27 23:27:49 CET 2014


Steve, thanks for the nice summary; much preferable to my dribs and drabs, it's what I should have done. Of course I have more to add (defeating your purpose of summarizing but adding greatly to my sent email count). I'll try to keep it all in this one email. See below:

----- Original Message -----
> From: "Steve Lianoglou" <lianoglou.steve at gene.com>
> To: "Dan Tenenbaum" <dtenenba at fhcrc.org>
> Cc: "Huma Asif" <humaasif79 at yahoo.com>, "bioconductor at r-project.org list" <bioconductor at r-project.org>
> Sent: Thursday, March 27, 2014 3:10:15 PM
> Subject: Re: [BioC] SomaticSignatures
> 
> So, to wrap this all up for the OP.
> 
> To identify the packages needed for installation of
> SomaticSignatures,
>  you will need to be running R-3.1-alpha on an internet connected
> machine and enter these commands:
> 
> library(tools)
> library(utils)
> library(BiocInstaller)
> pdb <- available.packages(contrib.url(biocinstallRepos()))
> pkgs <- unlist(unname(package_dependencies("SomaticSignatures",
> db=pdb)))

A very important addition: please add recursive=TRUE as an argument to package_dependencies in the line above, so:

pkgs <- unlist(unname(package_dependencies("SomaticSignatures", 
db=pdb, recursive=TRUE)))

The result doesn't include SomaticSignatures itself, so add it:

pkgs <- c("SomaticSignatures", pkgs)
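
Put together, the dependency step might look like the sketch below. The helper name offline_install_set is made up for illustration. The filtering step rests on the assumption that base packages such as methods or stats show up in the recursive list but are not in the repositories, so asking download.packages() for them would only produce warnings.

```r
library(tools)

# Hypothetical helper: every package that must be downloaded to install `pkg`,
# including `pkg` itself. `db` is a package database in the format returned by
# available.packages().
offline_install_set <- function(pkg, db) {
  deps <- package_dependencies(pkg, db = db, recursive = TRUE)
  pkgs <- unique(c(pkg, unlist(unname(deps))))
  # Keep only packages the repositories actually offer; base packages
  # ship with R and do not need to be downloaded.
  intersect(pkgs, db[, "Package"])
}

# Usage (needs internet access and BiocInstaller):
# pdb  <- available.packages(contrib.url(BiocInstaller::biocinstallRepos()))
# pkgs <- offline_install_set("SomaticSignatures", pdb)
```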



> 
> Then to download the packages into the *current working directory*
> that R is executing in, he would then do:

(The OP is a she in this case, I believe)

> 
> download.packages(pkgs, '.', repos=biocinstallRepos(), type="source")
> 
> These copies could then be copied to the cluster and will have to be
> installed individually (ensuring that the cluster already has
> R-3.1-alpha already installed!).
> 
> The OP will have to be careful to install them in the correct order,
> which might require some trial and error, but at least all of the
> packages will be available on the filesystem to do so.
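
Before copying anything to the cluster, it may be worth checking that every package in the list really has a tarball in the download directory. A minimal sketch, assuming the usual name_version.tar.gz file naming; missing_tarballs is a made-up helper name. Base packages such as methods will be reported too if they are still in the list, since they ship with R and have no tarball.

```r
# Hypothetical helper: which of `pkgs` have no source tarball in `dir`?
missing_tarballs <- function(pkgs, dir = ".") {
  files <- list.files(dir, pattern = "\\.tar\\.gz$")
  # "NMF_0.20.5.tar.gz" -> "NMF" (package names cannot contain underscores)
  have <- sub("_.*$", "", files)
  setdiff(pkgs, have)
}

# Usage, after download.packages() has run in the current directory:
# missing_tarballs(pkgs)
```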

I *think* you could do this, assuming that all the .tar.gz files are in the current directory:

install.packages(dir('.', pattern="\\.tar\\.gz$"), repos=NULL, type="source")

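This will attempt to install ALL the package tarballs in the directory. I'm not certain it will
figure out the dependencies and install things in the right order when repos=NULL (the
documentation only promises that for installs from a repository), so you may need to order
the files yourself or run it more than once.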
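
If the order turns out to matter, one alternative is to index the directory so that it acts as a small source repository and let install.packages() resolve dependencies from that index. A minimal sketch, assuming a Unix-like cluster node and that all the tarballs sit in one directory; install_from_dir is a made-up name, not part of any package:

```r
library(tools)

# Hypothetical helper: install `pkg` and its dependencies from a directory
# of source tarballs, treating that directory as a local repository.
install_from_dir <- function(pkg, pkg_dir = ".") {
  pkg_dir <- normalizePath(pkg_dir)
  # Write the PACKAGES index files that make the directory usable as a repository
  write_PACKAGES(pkg_dir, type = "source")
  # contriburl points directly at the directory holding PACKAGES and the tarballs
  install.packages(pkg,
                   contriburl = paste0("file://", pkg_dir),
                   type = "source")
}

# Usage on the cluster, from the directory holding the tarballs:
# install_from_dir("SomaticSignatures")
```

Because only SomaticSignatures is requested by name, whatever it needs is pulled in from the local index, so the tarballs do not have to be listed in order.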


One note that I'm not sure has been mentioned: these steps will install all the R dependencies, but if
there are any *system* dependencies, you need to make sure those are installed on your cluster node.

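I notice that pkgs (above) includes XML and RCurl, so this means installing the libxml2 and libcurl development libraries (they provide xml2-config and curl-config). If you are on Ubuntu this is done with

sudo apt-get install libxml2-dev libcurl4-openssl-dev

(libcurl-dev is only a virtual package on Ubuntu, so a concrete one such as libcurl4-openssl-dev has to be named.) Of course this also assumes internet access, and I honestly don't know how to install Ubuntu packages offline but I'm sure Google can help.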


> 
> OP: let us know how that goes.
> 
> On a side note: that's pretty cool! Perhaps parts of this would make
> handy utility functions in the BiocInstaller package ... I could
> imagine a function to "identify dependent packages" being useful.

The main workhorses are already in tools and utils. I agree this could be useful, but I'm not sure
BiocInstaller is the right place for it. This issue comes up now and then; maybe we need a package aimed at the specific case of "offline" installations.

Dan


> 
> -steve
> 
> 
> On Thu, Mar 27, 2014 at 2:01 PM, Dan Tenenbaum <dtenenba at fhcrc.org>
> wrote:
> >
> >
> > ----- Original Message -----
> >> From: "Dan Tenenbaum" <dtenenba at fhcrc.org>
> >> To: "Steve Lianoglou" <lianoglou.steve at gene.com>
> >> Cc: "Huma Asif" <humaasif79 at yahoo.com>,
> >> "bioconductor at r-project.org list" <bioconductor at r-project.org>
> >> Sent: Thursday, March 27, 2014 1:59:27 PM
> >> Subject: Re: [BioC] SomaticSignatures
> >>
> >>
> >>
> >> ----- Original Message -----
> >> > From: "Dan Tenenbaum" <dtenenba at fhcrc.org>
> >> > To: "Steve Lianoglou" <lianoglou.steve at gene.com>
> >> > Cc: "Huma Asif" <humaasif79 at yahoo.com>,
> >> > "bioconductor at r-project.org
> >> > list" <bioconductor at r-project.org>
> >> > Sent: Thursday, March 27, 2014 1:49:13 PM
> >> > Subject: Re: [BioC] SomaticSignatures
> >> >
> >> >
> >> >
> >> > ----- Original Message -----
> >> > > From: "Steve Lianoglou" <lianoglou.steve at gene.com>
> >> > > To: "Dan Tenenbaum" <dtenenba at fhcrc.org>
> >> > > Cc: "Huma Asif" <humaasif79 at yahoo.com>,
> >> > > "bioconductor at r-project.org
> >> > > list" <bioconductor at r-project.org>
> >> > > Sent: Thursday, March 27, 2014 1:45:18 PM
> >> > > Subject: Re: [BioC] SomaticSignatures
> >> > >
> >> > > On Thu, Mar 27, 2014 at 1:39 PM, Dan Tenenbaum
> >> > > <dtenenba at fhcrc.org>
> >> > > wrote:
> >> > > [snip]
> >> > > >>
> >> > > >> Why is installing it via `biocLite('SomaticSignatures')`
> >> > > >> out
> >> > > >> of
> >> > > >> the
> >> > > >> question here? You obviously have access to an internet
> >> > > >> connection
> >> > > >> --
> >> > > >> you are sending email and downloading the dependent
> >> > > >> packages
> >> > > >> by
> >> > > >> hand
> >> > > >> --
> >> > > >> so, why don't you just install it that way?
> >> > > >
> >> > > > I think the machine in question (a cluster?) has no direct
> >> > > > access
> >> > > > to the internet, so packages
> >> > > > must be downloaded to some other machine and then copied
> >> > > > there.
> >> > >
> >> > > I see.
> >> > >
> >> > > >
> >> > > > Though usually we direct people in the opposite direction,
> >> > > > this
> >> > > > StackOverflow post may help:
> >> > > >
> >> > > > https://stackoverflow.com/questions/19268515/installing-bioconductor-without-internet/19269962#19269962
> >> > > >
> >> > > > If the machine that has internet and the machine that
> >> > > > doesn't
> >> > > > are
> >> > > > similar enough and have the same libraries installed, this
> >> > > > could
> >> > > > work. You could also create your own internal CRAN and BioC
> >> > > > mirrors though this may be overkill, but our mirror page
> >> > > > will
> >> > > > provide more info:
> >> > > >
> >> > > > http://www.bioconductor.org/about/mirrors/mirror-how-to/
> >> > >
> >> > > An alternative would be to "cheat" and follow along with what
> >> > > Vincent did.
> >> > >
> >> > > You could start on your own machine w/ a brand new empty
> >> > > R-3.1-alpha
> >> > > install.
> >> > >
> >> > > Then do:
> >> > >
> >> > > R> source('http://bioconductor.org/biocLite.R')
> >> > > R> biocLite('SomaticSignatures')
> >> > >
> >> > > You will then see all of the packages that were downloaded to
> >> > > satisfy
> >> > > the dependency tree required for the installation (here are
> >> > > just
> >> > > three
> >> > > of them that Vincent required):
> >> > >
> >> > > """
> >> > > trying URL
> >> > > 'http://cran.fhcrc.org/src/contrib/gridBase_0.4-7.tar.gz'
> >> > > Content type 'application/x-gzip' length 153373 bytes (149 Kb)
> >> > > opened URL
> >> > > ==================================================
> >> > > downloaded 149 Kb
> >> > >
> >> > > trying URL
> >> > > 'http://cran.fhcrc.org/src/contrib/NMF_0.20.5.tar.gz'
> >> > > Content type 'application/x-gzip' length 1763782 bytes (1.7
> >> > > Mb)
> >> > > opened URL
> >> > > ==================================================
> >> > > downloaded 1.7 Mb
> >> > >
> >> > > trying URL '
> >> > > http://bioconductor.org/packages/2.14/bioc/src/contrib/pcaMethods_1.53.4.tar.gz
> >> > > '
> >> > > """
> >> > >
> >> > > Now you have the packages required (here he needed gridBase,
> >> > > NMF
> >> > > and
> >> > > pcaMethods) *as well as* the URLs required to download them.
> >> > >
> >> > > Go back and pull out all of the URLs
> >> > >
> >> > > Download the packages.
> >> > >
> >> >
> >> > A trick I learned from Martin yesterday may help with this step
> >> > (eliminating the need to know URLs):
> >> >
> >> > library(BiocInstaller)
> >> > download.packages(c("SomaticSignatures", "gridBase", "NMF",
> >> > "pcaMethods"),
> >> > destdir=".", repos=biocinstallRepos(), type="source")
> >> >
> >> > This will download all the source package tarballs to your
> >> > current
> >> > directory.
> >> >
> >> > As Steve and Vince point out, your actual list of packages that
> >> > you'd
> >> > need to download would be longer.
> >> >
> >> > I think there is another trick to tell you the recursive
> >> > dependencies
> >> > of a package, but I don't know it off the top of my head.
> >>
> >> Here it is:
> >>
> >> library(BiocInstaller)
> > #oops, you need to do this here:
> > library(tools)
> > library(utils)
> >
> > Dan
> >
> >> pdb <- available.packages(contrib.url(biocinstallRepos()))
> >> pkgs <- unlist(unname(package_dependencies("SomaticSignatures",
> >> db=pdb)))
> >>
> >> Then use pkgs as the first argument to download.packages() above.
> >> Dan
> >>
> >>
> >> >
> >> > Dan
> >> >
> >> >
> >> > > Move them over to your cluster.
> >> > >
> >> > > Then install by the command line as you like *into
> >> > > R-3.1-alpha*
> >> > >
> >> > > HTH,
> >> > > -steve
> >> > >
> >> > > --
> >> > > Steve Lianoglou
> >> > > Computational Biologist
> >> > > Genentech
> >> > >
> >> >
> >> > _______________________________________________
> >> > Bioconductor mailing list
> >> > Bioconductor at r-project.org
> >> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> > Search the archives:
> >> > http://news.gmane.org/gmane.science.biology.informatics.conductor
> >> >
> >>
> >>
> >
> 
> 
> 
> --
> Steve Lianoglou
> Computational Biologist
> Genentech
>


