[R] Getting frustrated with RMySQL
Ted Byers
r.ted.byers at gmail.com
Wed Oct 15 00:03:26 CEST 2008
That is neat Gabor. Thanks, Ted
Gabor Grothendieck wrote:
>
> The gsubfn package can do quasi perl-style interpolation by
> prefacing any function call with fn$.
>
> library(gsubfn)
> x <- 3
> fn$dbGetQuery(con, "select * from myTable where myColumnA = $x and
> MyColumnB = `2*x` ")
>
> See http://gsubfn.googlecode.com
>
>
> On Tue, Oct 14, 2008 at 5:32 PM, Jeffrey Horner
> <jeff.horner at vanderbilt.edu> wrote:
>> Ted Byers wrote on 10/14/2008 02:33 PM:
>>>
>>> Getting the basic stuff to work is trivially simple. I can connect,
>>> and,
>>> for
>>> example, get everything in any given table.
>>>
>>> What I have yet to find is how to deal with parameterized queries or how
>>> to
>>> do a simple insert (but not of a value known at the time the script is
>>> written - I ultimately want to put my script into a scheduled task, so
>>> the
>>> analysis can be repeated on updated data either daily or weekly).
>>> Using "INSERT INTO myTable (a) VALUES (1)" is simple enough, but what if
>>> I
>>> want to insert a sample number (using, e.g. WEEK(sample_date) as a
>>> sample
>>> identifier) along with the rate parameter estimated using fitdistr to
>>> fit
>>> an
>>> exponential distribution to a dataset, along with its sd? If I were
>>> using
>>> Perl or Java, I'd set up the query similar to "INSERT INTO myTable
>>> (a,b,c)
>>> VALUES (?,?,?)", and then use function calls to set each of the query
>>> parameters. I am having an aweful time finding the corresponding
>>> functions
>>> in RMySQL.
>>
>> I've found the best way to parameterize is using R's sprintf function.
>> For
>> instance, the following query not only parameterizes the variable
>> position,
>> but also the table name:
>>
>> fields <- dbGetQuery(con,sprintf("select field,elem_label from %s_meta
>> where field='%s'",inp$pnid,inp$field))
>>
>> Best,
>>
>> Jeff
>>
>>>
>>> And for the data, the simplest, and most efficient, way to get the data
>>> is
>>> to use a statement like:
>>>
>>> SELECT a,b,c FROM myTable GROUP BY g_id, WEEK(sdate);
>>>
>>> The data is in MySQL, and my analysis needs to be applied independantly
>>> to
>>> each group obtained from a query like this. It appears I can't use a
>>> data
>>> frame since none of the samples are of the same size (lets say the
>>> probability of the samples being the same size in indistinguishable from
>>> 0). Is it possible to put the resultset from such a query into a list of
>>> vectors
>>> that I can iterate over, passing each vector to fitdistr in turn? If
>>> so,
>>> how?
>>>
>>> I know I can get this using Perl (by getting each sample individually
>>> and
>>> writing it to a file, then having R read the file, do the analysis and
>>> write
>>> the output to another file, and then have Perl parse the output file to
>>> insert the parameter estimates I need into the appropriate table), but
>>> that
>>> seems inefficient.
>>>
>>> Is it possible to do all I need with R working directly with MySQL? If
>>> so,
>>> can someone fill in the apparent gaps left in the RMySQL documentation?
>>>
>>> Thanks.
>>>
>>> Ted
>>
>>
>> --
>> http://biostat.mc.vanderbilt.edu/JeffreyHorner
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
--
View this message in context: http://www.nabble.com/Getting-frustrated-with-RMySQL-tp19980592p19983099.html
Sent from the R help mailing list archive at Nabble.com.
More information about the R-help
mailing list