[R] Re: New package: g.data

Duncan Temple Lang duncan at research.bell-labs.com
Wed Dec 5 00:12:31 CET 2001

Hi David.

 The idea of your package is very nice. It is something that I have
been thinking about, along with the more general problem of accessing
data from sources other than R, including different formats and
different applications (databases, Java, Perl, Python, etc.).

 I am sorry I didn't follow up on your original post, because I
might have been able to make your life simpler.  Today I committed
code that lets one extend the way R resolves variables in the search
path. This is now in the R-1.4.0 code base, which will be released in
two weeks' time.

I created a package - RObjectTables - that exploits this change to the
way R looks for data in elements of the search path. This allows one
to define a class of search path element much like what you have done
and have R query it using functions. The package is available
and will be moved into the R source after the upcoming release.

I believe it would let your code avoid the g.data.save() step and let
users use a regular attach() call.  In this sense, it would be nice to
have a single interface that we can extend.
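
To make "data that is resolved only when it is looked up" concrete,
here is a minimal sketch in base R. It is an illustration under stated
assumptions, not the RObjectTables or g.data API: it uses
delayedAssign(), which is available in later versions of R, and the
names demo_db and state are hypothetical, chosen only for this example.

```r
# Sketch only: base R, not the RObjectTables or g.data interface.
# `state` and "demo_db" are hypothetical names used for illustration.
state <- new.env()
state$loads <- 0                      # counts how often the "load" actually runs

# Create an empty search path element with a regular attach() call.
db <- attach(NULL, name = "demo_db")

# Register x1 as a promise: the expression is not evaluated yet.
delayedAssign("x1", {
  state$loads <- state$loads + 1      # record that the data was really loaded
  matrix(1, 10, 10)                   # stand-in for reading an object from disk
}, assign.env = db)

loads_before <- state$loads           # nothing has been loaded at this point

d <- dim(x1)                          # first use forces the promise
d_again <- dim(x1)                    # second use reuses the cached value
where <- utils::find("x1")            # which search path element holds x1

detach("demo_db")                     # clean up the search path
```

The design point is that the lookup itself stays ordinary R code
(dim(x1)); only the search path element decides when the value is
produced.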

If you have any interest in using this, I'd be happy to help.

Thanks for the good work.

David Brahm wrote:
> A new package "g.data" is available on CRAN, to create and maintain databases
> that work more like the S-Plus model.
> Here's the official Description for g.data (v1.2):
>   Create and maintain delayed-data packages (DDP's).  Data stored in
>   a DDP are available on demand, but do not take up memory until requested.
>   You attach a DDP with g.data.attach(), then read from it and assign to it in
>   a manner similar to S-Plus, except that you must run g.data.save() to
>   actually commit to disk.
> Here's a very abbreviated (Unix) example:
>   g.data.attach("/tmp/mydir")             # Open package:mydir in pos=2
>   assign("x1", matrix(1, 1000, 1000), 2)  # Put data there
>   g.data.save()                           # Commit to disk
>   detach(2)                               # Detach package:mydir
>   g.data.attach("/tmp/mydir")             # Re-attach it, no resources used
>   dim(x1)                                 # x1 is loaded only when needed!
>   find("x1")                              # It still lives in package:mydir
> g.data is the end result of my post "Reading and writing to S-like databases",
> sent to R-help on Sep 28, 2001.  Thanks to all who responded, especially
> Dr. Agustin Lobo <alobo at ija.csic.es> and (by reference) Ray Brownrigg
> <Ray.Brownrigg at mcs.vuw.ac.nz>, who suggested using delay(); Martin Maechler
> <maechler at stat.math.ethz.ch>, Thomas Lumley <tlumley at u.washington.edu>, Brian
> D. Ripley <ripley at stats.ox.ac.uk>, and Peter Dalgaard
> <p.dalgaard at biostat.ku.dk>, who helped me with platform independence issues;
> and Kurt Hornik <Kurt.Hornik at ci.tuwien.ac.at>, who cleaned it up for CRAN.
> I have one concern: g.data relies heavily on delay(), whose documentation says:
>   This is an experimental feature and its addition is purely for
>   evaluation purposes.
> Is there any plan to deprecate delay()?
> Feedback is welcome!
> -- 
>                               -- David Brahm (brahm at alum.mit.edu)
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-announce mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-announce-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._


Duncan Temple Lang                duncan at research.bell-labs.com
Bell Labs, Lucent Technologies    office: (908)582-3217
700 Mountain Avenue, Room 2C-259  fax:    (908)582-3340
Murray Hill, NJ  07974-2070       