[R] Numbers that look equal, should be equal, but if() doesn't see as equal (repost with code included)
Thomas Lumley
tlumley at u.washington.edu
Wed May 28 16:16:53 CEST 2003
On Wed, 28 May 2003, Paul Lemmens wrote:
> Hi!
>
> Apologies for sending the mail without any code. Apparently somewhere along
> the way the .R attachments got filtered out. I have included the code below
> as clean as possible. My original mail is below the code.
I still think you should not be using == (or !=) to compare these means. You want something like
if ( abs(mean.b-mean.orig)/(epsilon+abs(mean.orig)) < epsilon ){
You are effectively using epsilon=0, but epsilon=10e-10 should be
adequate.
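
As a rough sketch of that test (the helper name nearly.equal and the sample values x and y are made up for illustration, they are not from the original code), the comparison can be wrapped in a small function. Base R's all.equal() does a similar tolerance-based comparison and may be enough on its own.

```r
# Hypothetical helper, not part of the posted code: relative difference
# with epsilon added to the denominator so a reference of 0 is still safe.
nearly.equal <- function(x, ref, epsilon = 10e-10)
{
  abs(x - ref) / (epsilon + abs(ref)) < epsilon
}

# Illustrative values: same sum, different order of addition.
x <- 0.1 + (0.2 + 0.3)
y <- (0.1 + 0.2) + 0.3

y == x                      # exact comparison, FALSE on IEEE 754 doubles
nearly.equal(y, x)          # TRUE: difference is far below epsilon
isTRUE(all.equal(y, x))     # TRUE: base R tolerance comparison
```

In bins.factor() the failing test would then read something like if ( !nearly.equal(mean.b, mean.orig) ).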
-thomas
> Thank you again for your time.
> regards,
> Paul
>
> vincentize <- function(data, bins)
> {
>   if ( length(data) < 2 )
>   {
>     stop("The data is really short. Is that ok?");
>   }
>
>   if ( bins < 2 )
>   {
>     stop("A number of bins smaller than 2 just really isn't useful");
>   }
>
>   if ( bins > length(data) )
>   {
>     stop("This is really unusual, although perhaps possible. If you really know what you're doing, maybe you should disable this check!?.");
>   }
>
>   ret <- c();
>   for ( i in 1:length(data))
>   {
>     rt <- data[i];
>     b <- 0;
>     while ( b < bins )
>     {
>       ret <- c(ret, rt);
>       b <- b+1;
>     }
>   }
>
>   ret;
> }
>
>
> binify <- function(data, bins, n)
> {
>   if ( bins < 2 )
>   {
>     stop("Number of bins is smaller than 2. Nothing to split, exiting.");
>   }
>
>   if ( length(data) < 2 )
>   {
>     stop("The length of the data is really short. Is that ok?");
>   }
>
>   if ( bins * n != length(data) )
>   {
>     stop("Cannot construct bins of equal length.");
>   }
>
>   t(array(data, c(n,bins)));
> }
>
> mean.bins <- function(data)
> {
>   # For the vincentizing procedures in vincentize() and binify(),
>   # it made sense to check the data array/vector/matrix. Here,
>   # we now just need to check that data is a matrix.
>   if ( !is.matrix(data) )
>   {
>     stop("The data is not in matrix form.");
>   }
>
>   means <- c();
>   bins <- dim(data)[1];
>   for (i in 1:bins)
>   {
>     means <- c(means, mean(data[i,]));
>   }
>
>   # return a vector of means.
>   means;
> }
>
> bins.factor <- function(data, bins)
> {
>   if ( !is.data.frame(data) )
>   {
>     stop("data is not a data frame.");
>   }
>
>   source('Ratcliff.r', local=TRUE);
>   subject.bin.means <- c();
>
>   attach(data);
>   l <- levels(Cond);
>   for ( i in 1:length(l) )
>   {
>     cat("Calculating bins for factor level ", l[i], ".\n", sep="");
>     flush.console();
>
>     data <- RT[Cond == l[i]];
>     data <- sort(data);
>
>     n <- length(data);
>     data.vincent <- vincentize(data,bins);
>     data.vincent.bins <- binify(data.vincent, bins, n);
>     bin.means <- mean.bins(data.vincent.bins);
>
>     # FAILING TEST.
>     mean.orig <- mean(data);
>     mean.b <- mean(bin.means);
>     if ( mean.b != mean.orig )
>     {
>       #cat("mean.b\n", str(mean.b), "mean.orig\n", str(mean.orig));
>       flush.console();
>       detach(data);
>       stop("Something went wrong calculating the bins: means do not equal.");
>     }
>     subject.bin.means <- c(subject.bin.means, bin.means);
>   }
>   detach(data);
>
>   if ( !length(subject.bin.means) == bins*length(l) )
>   {
>     stop("Inappropriate number of means calculated.");
>   }
>   else
>   {
>     subject.bin.means
>   }
> }
>
> ---------- Forwarded Message ----------
> Date: dinsdag 27 mei 2003 14:53 +0200
> From: Paul Lemmens <P.Lemmens at nici.kun.nl>
> To: r-help at stat.math.ethz.ch
> Subject: [R] Numbers that look equal, should be equal, but if() doesn't see
> as equal
>
> Hi!
>
> After a lot of testing and debugging I'm falling silent in figuring out
> what goes wrong in the following.
>
> I'm implementing the Vincentizing procedure that Ratcliff (1979) described.
> It's about calculating RT bins for any distribution of RT data. It boils
> down to rank ordering your data, replicating each data point as many times
> as you need bins and then splitting up the resulting distribution in equal
> bins.
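
The three steps described above (rank order, replicate, split into equal bins) can be sketched compactly with base R's sort(), rep() and rowMeans(). This is only an illustration under the assumption that every bin gets exactly length(rt) replicated values. The function name vincent.means is hypothetical and is not the code from Ratcliff.r.

```r
# Hypothetical compact version of the procedure; not the posted code.
vincent.means <- function(rt, bins)
{
  # Rank order the data and replicate each point once per bin.
  replicated <- rep(sort(rt), each = bins)
  # One row per bin, each row holding length(rt) consecutive values.
  m <- matrix(replicated, nrow = bins, byrow = TRUE)
  # Mean RT of each bin.
  rowMeans(m)
}

bin.means <- vincent.means(c(3, 1, 2), bins = 2)
bin.means
# The mean of the bin means should match mean(rt) up to rounding error,
# so compare with a tolerance rather than with ==.
isTRUE(all.equal(mean(bin.means), mean(c(3, 1, 2))))
```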
>
> The code that I've written is attached (and not included because it is
> considerable in length due to many comments). Ratcliff.r contains some
> basic functions and distribution.bins.r contains the problematic function
> bins.factor() (problem area marked with 'FAILING TEST'). The final attached
> file is the mock up distribution I made.
>
> The failing test is the check if the mean of the mean RT's for each bin
> equals the mean of the original distribution. These should/are
> mathematically equivalent. Sometimes, however, the test fails. With the
> attached distribution most notably for 4, 7, 8, 9, and 13 bins. Since the
> means are mathematically equivalent IMHO it should not be an issue of this
> particular distribution. As a matter of fact, I also have tested some
> rnorm() distributions and my function also fails on those (albeit a little
> less often than with foobar.txt).
>
> Problem description: if one calculates the bins or bin means by hand, the
> mean of the bin means is visually the same as the overall mean, even with
> options(digits=20), but *still* the test fails.
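
A small illustration of how two mathematically equal results can differ in floating point (the values a and b are made up for this sketch, they are not taken from foobar.txt): adding the same numbers in a different order can change the last bit, and an exact comparison then fails even though the printed values look alike at default precision.

```r
a <- (0.1 + 0.2) + 0.3   # one order of summation
b <- 0.1 + (0.2 + 0.3)   # same numbers, different order

a == b                        # FALSE on IEEE 754 doubles
a - b                         # small but non-zero difference
format(c(a, b), digits = 17)  # 17 significant digits tell the two apart
```

Looking at the difference a - b directly is usually the quickest way to see whether two values that print the same are really identical.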
>
> IMHO it's not my code and neither the distribution I use to test, but
> still, can you point out an obvious failure of my programming or is it
> indeed something of R that I don't yet grasp?
>
> thank you for your help,
> Paul
>
>
> --
> Paul Lemmens
> NICI, University of Nijmegen ASCII Ribbon Campaign /"\
> Montessorilaan 3 (B.01.03) Against HTML Mail \ /
> NL-6525 HR Nijmegen X
> The Netherlands / \
> Phonenumber +31-24-3612648
> Fax +31-24-3616066
>
>
> ---------- End Forwarded Message ----------
>
Thomas Lumley Asst. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle
^^^^^^^^^^^^^^^^^^^^^^^^
- NOTE NEW EMAIL ADDRESS