[R] Statistical analysis of a large database
Vito Ricci
vito_ricci at yahoo.com
Tue Oct 12 10:11:48 CEST 2004
Hi,
for your analysis you can use the package
ROracle: Oracle database interface for R
http://microarrays.unife.it/CRAN/src/contrib/Descriptions/ROracle.html
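As a rough sketch of how the data could be pulled from Oracle into R: the code below assumes the ROracle and DBI packages are installed. The environment variables ORA_USER, ORA_PASSWORD and ORA_DBNAME, the table name "SURVEY", and the two helper functions are hypothetical names made up for illustration. A table of 40,000 rows by 500 columns should usually fit in memory, so the sketch reads it in one go and turns text columns into factors.

```r
# Sketch only: credentials, the table name "SURVEY" and the helper names
# below are placeholders, not details taken from the original question.

# Convert every character column to a factor (the data are categorical).
as_categorical <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], factor)
  df
}

# Read a whole table through any DBI connection and return it with factors.
read_categorical_table <- function(con, table) {
  as_categorical(DBI::dbReadTable(con, table))
}

# Only try to connect when ROracle is installed and a database name is set.
if (requireNamespace("ROracle", quietly = TRUE) &&
    nzchar(Sys.getenv("ORA_DBNAME"))) {
  con <- DBI::dbConnect(ROracle::Oracle(),
                        username = Sys.getenv("ORA_USER"),
                        password = Sys.getenv("ORA_PASSWORD"),
                        dbname   = Sys.getenv("ORA_DBNAME"))
  survey <- read_categorical_table(con, "SURVEY")
  str(survey[, 1:5])      # quick look at the first five fields
  DBI::dbDisconnect(con)
}
```

Since read_categorical_table() only uses DBI calls, the same helper should work with another DBI driver if ROracle is not available.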
see also:
Diego Kuonen, Introduction au data mining avec R :
vers la reconquête du `knowledge discovery in
databases' par les statisticiens. Bulletin of the
Swiss Statistical Society, 40:3-7, 2001.
http://www.statoo.com/en/publications/2001.R.SSS.40/
Diego Kuonen and Reinhard Furrer, Data mining avec R
dans un monde libre. Flash Informatique Spécial Été,
pages 45-50, sep 2001.
http://sawww.epfl.ch/SIC/SA/publications/FI01/fi-sp-1/sp-1-page45.html
R Development Core Team, R Data Import/Export,
version 1.9.0, April 2004, pp. 11-18
http://cran.r-project.org/doc/manuals/R-data.pdf
Brian D. Ripley, Datamining: Large Databases and
Methods, in Proceedings of "useR! 2004 - The R User
Conference", May 2004
http://www.ci.tuwien.ac.at/Conferences/useR-2004/Keynotes/Ripley.pdf
Brian D. Ripley, Using Databases with R, R News,
January 2001, pp. 18-20
http://cran.r-project.org/doc/Rnews/Rnews_2001-1.pdf
B. D. Ripley, R. M. Ripley, Applications of R Clients
and Servers in Proceedings of the Distributed
Statistical Computing 2001 Workshop, 2001, Vienna
University of Technology.
http://www.ci.tuwien.ac.at/Conferences/DSC-2001/Proceedings/Ripley.pdf
Torsten Hothorn, David A. James, Brian D. Ripley, R/S
Interfaces to Databases in Proceedings of the
Distributed Statistical Computing 2001 Workshop,
2001, Vienna University of Technology.
http://www.ci.tuwien.ac.at/Conferences/DSC-2001/Proceedings/HothornJamesRipley.pdf
Luís Torgo, Data Mining with R. Learning by case
studies, May 2003
http://www.liacc.up.pt/~ltorgo/DataMiningWithR/
I hope this is of some help.
Best
Vito
You wrote:
Dear all,
We need to perform a statistical analysis of a large
database (40,000 entries with approximately 500 fields
in each entry) currently handled in Oracle. The data
contains categorical variables only.
At the current stage we suggest classification and
clustering analysis.
We are planning to perform the analysis in R and
would be very grateful for any
recommendations/suggestions/references regarding the
packages/tools appropriate for this task.
Thank you in advance for your attention,
Vicky Landsman
=====
Becoming builders of solutions
"The business of the statistician is to catalyze
the scientific learning process."
George E. P. Box
Visit the portal http://www.modugno.it/
and in particular the section on Palese http://www.modugno.it/archivio/cat_palese.shtml