[R] left end or right end

Matt Shotwell shotwelm at musc.edu
Thu Jul 1 20:27:59 CEST 2010


Suku, 

Just to clarify, in your table and each of your images, it appears that
the start position of P (start1) is _after_ or at the start position of
Q (start2), and the end position of P (end1) is _before_ or at the end
position of Q (end2). If these positions represent increasing integers,
then start1 >= start2 and end1 <= end2. I will assume this for the
discussion below.
  
You mentioned wanting to know whether the midpoint of P tends to be
greater or less than the midpoint of Q. That seems like a good idea,
since, with P contained in Q, the midpoints _must_ be similar when the
lengths of P and Q are similar. Hence, if P and Q are samples from a
population, then you may be interested in the population mean
difference in midpoints. We can denote this mean M:

M = E(mid(P) - mid(Q))
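
As a rough illustration of the sample estimate of M, here is a small
sketch (in Python rather than R, standard library only). It uses just
the three example rows from your table; the function and variable names
are my own and purely illustrative.

```python
from statistics import mean

# The three example intervals quoted from the original post, as (start, end).
P = [(12, 28), (2, 5), (32, 50)]
Q = [(10, 42), (1, 55), (22, 63)]

def midpoint(interval):
    """Midpoint of a (start, end) interval."""
    start, end = interval
    return (start + end) / 2

def midpoint_differences(p_intervals, q_intervals):
    """Paired differences mid(P) - mid(Q), one per row."""
    return [midpoint(p) - midpoint(q) for p, q in zip(p_intervals, q_intervals)]

diffs = midpoint_differences(P, Q)
m_hat = mean(diffs)  # sample estimate of M = E(mid(P) - mid(Q))
print(diffs, m_hat)
```

A negative estimate means the midpoint of P tends to sit to the left of
the midpoint of Q; three rows are of course far too few to conclude
anything.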

In order to do a classical statistical test, we _need_ a hypothesis
about M, and a rule for rejecting the hypothesis. That's why we use the
term 'hypothesis'. An appropriate hypothesis here might be:

H0: M = 0

or, in words, the mean difference in the P and Q midpoints is zero. A
simple rejection rule for this hypothesis is:

reject H0 when the observed mean difference in P and Q midpoints is
greater than some quantity C, or less than -C.

The trick then is to find the C that gives a chosen type I error
probability, usually 0.05. It's here that I might recommend a bootstrap
procedure.
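
One possible version of such a bootstrap, sketched in Python (standard
library only): center the observed midpoint differences at zero so that
H0 holds, resample them with replacement, and take C as the 0.95
quantile of the absolute resampled means. This is only one of several
reasonable bootstrap schemes, and the function name, the number of
resamples, and the simulated data are all my own assumptions.

```python
import random
from statistics import mean

def bootstrap_critical_value(diffs, alpha=0.05, n_boot=2000, seed=1):
    """Estimate C so that P(|mean difference| > C) is roughly alpha under H0: M = 0."""
    rng = random.Random(seed)
    n = len(diffs)
    observed = mean(diffs)
    # Shift the differences so their mean is exactly zero, i.e. impose H0.
    centered = [d - observed for d in diffs]
    boot_stats = []
    for _ in range(n_boot):
        resample = [centered[rng.randrange(n)] for _ in range(n)]
        boot_stats.append(abs(mean(resample)))
    boot_stats.sort()
    # C is the empirical (1 - alpha) quantile of the absolute bootstrap means.
    index = min(n_boot - 1, int((1 - alpha) * n_boot))
    return boot_stats[index]

# Made-up midpoint differences, for illustration only (not real data).
sim = random.Random(0)
diffs = [sim.gauss(-2.0, 10.0) for _ in range(200)]

C = bootstrap_critical_value(diffs)
observed = mean(diffs)
reject = abs(observed) > C  # the rejection rule described above
print(observed, C, reject)
```

With the real data, diffs would be the 10000 observed values of
mid(P) - mid(Q), and the sign of the observed mean would be interpreted
only if H0 is rejected.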

If, in the end, you reject the hypothesis H0, you can use the sign of
the estimated mean difference in your biological inferences. ...And I'm
still interested to hear what those are. :-) Of course, these are just
my ideas; you really ought to visit a biostatistician for professional
advice.

-Matt



On Thu, 2010-07-01 at 10:24 -0400, ravikumar sukumar wrote:
> There are three possibilities:
> 
> Case1: Left end
> 
> P--------------
> Q--------------------------------------
> 
> Case2: Right end
> 
> P                        --------------
> Q--------------------------------------
> 
> 
> Case3: At mid position
> 
> P        -------------
> Q--------------------------------------
> 
> 
> My question is how my data are distributed across these three cases: are they
> biased towards case 1, case 2, or case 3? I have to take the length of Q
> into account. Example: start2-start1 = 2 and end2-end1 = 3 does not make much
> difference if the length of Q is 150000.
> 
> I do not have a hypothesis; I want to know how my data behave.
> 
> Thanks and regards
> 
> 
> On Thu, Jul 1, 2010 at 4:05 PM, Jonathan Christensen <dzhonatan at gmail.com> wrote:
> 
> > Hi,
> >
> > You need to define what you want more exactly--what are the possible
> > conclusions (hypotheses) you want to reach? Based on what you've said, I can
> > think of several different approaches you might want, but I'm not sure which
> > one of them you're actually after. For example:
> >
> > Hypothesis A: The distance between the left endpoints of P and Q is less
> > than (or equal to) the distance between the right endpoints.
> > Hypothesis B: The distance between the right endpoints is smaller.
> >
> > This is a simple binomial test, as David Winsemius suggested. In your most
> > recent email, though, it sounds like you want to take into account how much
> > smaller one distance is than the other. This is more complicated.
> >
> > Another option occurred to me: maybe you don't care which end P is close
> > to, you just want to know whether it's close to one of the ends, or
> > somewhere in the middle.
> >
> > Without knowing what exactly you are trying to test, it's very hard for us
> > to help you.
> >
> > Jonathan
> >
> >
> > On Thu, Jul 1, 2010 at 7:45 AM, ravikumar sukumar <
> > ravikumarsukumar at gmail.com> wrote:
> >
> >> Sorry for posting to the R list.
> >>
> >> P (start1, end1)    Q (start2, end2)
> >> 12, 28              10, 42
> >> 2, 5                1, 55
> >> 32, 50              22, 63
> >> ..... there are 10000 points of P and Q.
> >> The numbers of points of P and Q are equal (i.e. 10000).
> >>
> >> The interval P always overlaps with Q, i.e. start1<start2 and end1<end2.
> >>
> >> Merely checking whether the points satisfy start1<start2 and end1<end2
> >> will not be meaningful, because the length of P, that is (end1-start1),
> >> and the length of Q, that is (end2-start2), differ.
> >>
> >> Example
> >> Case A:
> >>
> >>
> >> Case B:
> >> start2 - start1 =100
> >> end2-end1 = 2
> >>
> >> In the above two cases, P is falling on the right end of Q in case B. But
> >> it depends on the length(end2-start2). If the length(end2-start2) = 15000
> >> in case B, then it is almost at the middle point.
> >>
> >> Is there any test or function in R to reach a statistically
> >> significant conclusion that the midpoint of P, or P itself, is falling
> >> on the left end or right end of Q?
> >>
> >> sorry once again for posting in this list.
> >>
> >> Regards
> >>
> >>        [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >
> 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com


