[R] Q: Suggestions for long-term data/program storage policy?
sosman
sourceforge at metrak.com
Tue Oct 11 11:02:49 CEST 2005
Alexander Ploner wrote:
> Dear list,
>
> we are a statistical/epidemiological department that - after a few
> years of rapid growth - is finally getting around to formulating a
> general data storage and retention policy - mainly to ensure that we
> can reproduce results from published papers/theses more easily in the
> future, but also with the hope that we get more synergy between
> related projects.
>
> We have formulated what we feel is a reasonable draft, requiring
> basically that the raw data, all programs to create derived data
> sets, and the analysis programs are stored and documented in a
> uniform manner, regardless of the analysis software used. The minimum
> data retention we are aiming for is 10 years, and the format for the
> raw data is quite sane (either flat ASCII or real
>
> Given the rapid development cycle of R, this suggests that at the very
> least all non-base packages used in the analysis are stored together
> with each project. I have basically two questions:
>
> 1) Are old R versions (binaries/sources) going to be available on
> CRAN indefinitely?
>
> 2) Is .RData a reasonable file format for long term storage?
>
> I would also be very grateful for any other suggestions, comments or
> links for setting up and implementing such a storage policy
> (R-specific or otherwise).
I am coming at this more from a software development angle, but you might
want to take a look at Subversion for versioning your projects. For
non-geeky types, TortoiseSVN provides a point-and-click interface to it.
Subversion handles binary files efficiently, and you can easily go back
and retrieve earlier versions of your projects.
http://subversion.tigris.org/
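
To give a rough idea of what "going back to an earlier version" looks like,
here is a minimal sketch using the svn and svnadmin command-line tools. It
assumes both are installed; the throwaway repository in a temporary
directory and the file name raw_data.csv are made up purely for
illustration, and a real setup would more likely use a shared repository on
a server.

```shell
#!/bin/sh
# Sketch only: a scratch repository in a temp directory, not a real setup.
if command -v svn >/dev/null 2>&1 && command -v svnadmin >/dev/null 2>&1; then
    WORK=$(mktemp -d)
    REPO_URL="file://$WORK/repo"

    # Create an empty repository and check out a working copy of it.
    svnadmin create "$WORK/repo"
    svn checkout -q "$REPO_URL" "$WORK/wc"

    # Revision 1: add a (hypothetical) raw data file.
    echo "id,age" > "$WORK/wc/raw_data.csv"
    svn add -q "$WORK/wc/raw_data.csv"
    svn commit -q -m "Add raw data" "$WORK/wc"

    # Revision 2: change the file.
    echo "1,42" >> "$WORK/wc/raw_data.csv"
    svn commit -q -m "Append a record" "$WORK/wc"

    # Retrieve the file exactly as it was at revision 1.
    OLD=$(svn cat -r 1 "$REPO_URL/raw_data.csv")
    echo "$OLD"    # prints: id,age
else
    echo "Subversion command-line tools not found" >&2
fi
```

In the same way, "svn log" lists the history of a project and
"svn update -r N" brings a whole working copy back to revision N.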