[R] What's the best way to tell a function about relevant fields in data frames
Duncan Murdoch
murdoch at stats.uwo.ca
Tue May 12 12:55:54 CEST 2009
On 12/05/2009 6:18 AM, Titus von der Malsburg wrote:
> Hi list,
>
> I have a function that detects saccadic eye movements in a time series
> of eye positions sampled at a rate of 250Hz. This function needs
> three vectors: x-coordinate, y-coordinate, trial-id. This information
> is usually contained in a data frame that also has some other fields.
> The names of the fields are not standardized.
>
>> head(eyemovements)
> time x y trial
> 51 880446504 53.18 375.73 1
> 52 880450686 53.20 375.79 1
> 53 880454885 53.35 376.14 1
> 54 880459060 53.92 376.39 1
> 55 880463239 54.14 376.52 1
> 56 880467426 54.46 376.74 1
>
> There are now several possibilities for the signature of the function:
>
> 1. Passing the columns separately:
>
> detect(eyemovements$x, eyemovements$y, eyemovements$trial)
>
> or:
>
> with(eyemovements,
> detect(x, y, trial))
I'd choose this one, with one modification described below.
>
> 2. Passing the data frame plus the names of the fields:
>
> detect(eyemovements, "x", "y", "trial")
I think this is too inflexible. What if you want to temporarily change
one variable? You don't want to have to create a whole new data frame;
it's better to just substitute in another variable.
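To illustrate the point about substituting a variable, here is a minimal sketch. The detect() stub below is hypothetical: it only returns its inputs and stands in for the real saccade detector. The data are the first three rows from the question.

```r
# First three rows of the example data from the question
eyemovements <- data.frame(
  time  = c(880446504, 880450686, 880454885),
  x     = c(53.18, 53.20, 53.35),
  y     = c(375.73, 375.79, 376.14),
  trial = c(1, 1, 1)
)

# Hypothetical stub standing in for the real detector: returns its inputs
detect <- function(x, y, trial) {
  data.frame(x = x, y = y, trial = trial)
}

# Temporarily use a recentred x; the data frame itself is left untouched
x_centred <- eyemovements$x - mean(eyemovements$x)
res <- with(eyemovements, detect(x_centred, y, trial))
```

With the names-based signature of option 2, the same change would need a modified copy of the data frame, e.g. via transform().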
>
> 3. Passing the data frame plus a formula specifying the relevant
> fields:
>
> detect(eyemovements, ~x+y|trial)
>
> 4. Passing a formula and getting the data from the environment:
>
> with(eyemovements,
> detect(~x+y|trial))
Rather than 3 or 4, I would use the more common idiom

    detect(~x+y|trial, data=eyemovements)

(and the formula might be x+y~trial). But I think the formula interface
is too general for your needs. What would ~x+y+z|trial mean?
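A rough sketch of what a formula-plus-data interface could look like, and why it is more general than needed. The helper detect_formula() is hypothetical and not from any package; it simply pulls variable names out of the formula with all.vars() and has to reject anything that does not name exactly three variables.

```r
eyemovements <- data.frame(
  time  = c(880446504, 880450686, 880454885),
  x     = c(53.18, 53.20, 53.35),
  y     = c(375.73, 375.79, 376.14),
  trial = c(1, 1, 1)
)

# Hypothetical helper: extracts x, y and trial from a formula like ~x+y|trial
detect_formula <- function(formula, data) {
  vars <- all.vars(formula)  # variable names in order of appearance
  if (length(vars) != 3L) {
    stop("expected exactly three variables (x, y, trial), got: ",
         paste(vars, collapse = ", "))
  }
  unknown <- setdiff(vars, names(data))
  if (length(unknown) > 0L) {
    stop("not found in data: ", paste(unknown, collapse = ", "))
  }
  # Placeholder for the real detection: return the extracted columns
  data.frame(x = data[[vars[1]]], y = data[[vars[2]]], trial = data[[vars[3]]])
}

res <- detect_formula(~x + y | trial, data = eyemovements)

# A formula with an extra term has no obvious meaning here, so it is rejected
msg <- tryCatch(
  detect_formula(~x + y + z | trial, data = eyemovements),
  error = function(e) conditionMessage(e)
)
```

The function ends up accepting only one shape of formula, which is the sense in which the interface is too general for this job.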
I'd suggest something like 1 but using the convention plot.default()
uses, where you have x and y arguments, but y can be skipped if x is a
matrix/dataframe/formula/list. It uses the xy.coords() function to do
the extraction.
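A minimal sketch of that convention, assuming a hypothetical signature detect(x, y = NULL, trial). The body only normalises the inputs with xy.coords() and is a placeholder for the real detection. One thing to watch: when x is a data frame, xy.coords() uses its first two columns, so the coordinate columns have to be selected before the call.

```r
eyemovements <- data.frame(
  time  = c(880446504, 880450686, 880454885),
  x     = c(53.18, 53.20, 53.35),
  y     = c(375.73, 375.79, 376.14),
  trial = c(1, 1, 1)
)

# Hypothetical signature: x and y as in plot.default(), trial as its own vector
detect <- function(x, y = NULL, trial) {
  xy <- grDevices::xy.coords(x, y)  # list with numeric components $x and $y
  stopifnot(length(trial) == length(xy$x))
  # Placeholder for the real detection: return the normalised inputs
  data.frame(x = xy$x, y = xy$y, trial = trial)
}

# Separate vectors, as in option 1
a <- with(eyemovements, detect(x, y, trial))

# A two-column data frame in place of x, with y skipped
b <- detect(eyemovements[c("x", "y")], trial = eyemovements$trial)

# A list with components named x and y also works
d <- detect(list(x = eyemovements$x, y = eyemovements$y),
            trial = eyemovements$trial)
```

All three calls go through the same extraction step, so the detection code itself only ever sees plain x and y vectors.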
Duncan Murdoch
>
> I saw instances of all those variants (and others) in the wild.
>
> Is there a canonical way to tell a function which fields in a data
> frame are relevant? What other alternatives are possible? What are
> the pros and cons of the alternatives?
>
> Thanks, Titus