[R] Offtopic, HT vs. HH in coin flips

Tue Sep 1 11:39:04 CEST 2009

On 31-Aug-09 19:16:33, Erik Iverson wrote:
> Dear R-help, 
> Could someone please try to explain this paradox to me? What is
> more likely to show up first in a string of coin tosses, "Heads
> then Tails", or "Heads then Heads"?  
> 
>##generate 2500 strings of random coin flips
> ht <- replicate(2500,
>                 paste(sample(c("H", "T"), 100, replace = TRUE),
>                       collapse = ""))
> 
>## find first occurrence of HT
> mean(regexpr("HT", ht))+1    #mean of HT position, 4
> 
>## find first occurrence of HH
> mean(regexpr("HH", ht))+1    #mean of HH position, 6
> 
> FYI, this is not homework, I have not been in school in years.
> I saw a similar problem posed in a blog post on the Revolutions R
> blog, and although I believe the answer, I'm having a hard time
> figuring out why this should be? 
> 
> Thanks,
> Erik Iverson

Be very careful about the statement of the problem!

[1] The probability that "HH" will occur first (i.e. before "HT")
is the same as the probability that "HT" will occur first (i.e.
before "HH").

[2] However, the probability that the first occurrence of "HT"
will be on a given position of the "H" is generally not the same
as the probability that the first occurrence of "HH" will be on
the same position of the first "H".

[1]: At the first occurrence of (either "HH" or "HT"), there is
an initial string S, ending in an "H", followed by either an "H"
(for "HH") or a "T" (for "HT"). Both are equally likely.

So the probability that the first occurrence of (either "HH" or "HT")
is an ""HH" is the same as the probability that it is an "HT".

[2]: (A) the first occurrence of an "HH" is in a sequence of
any collection of "H" and "T" provided there is no "HH" in the
sequence, and the last is "H", followed by "H".
However, "HT" is allowed to occur in the sequence.

But (B) the first occurrence of an "HT" is in a sequence of
(zero or more "T") followed by (1 or more "H") followed by "T".
This is the only pattern in which "HT" does not occur prior to
the final "HT".
Similarly, "HH" is allowed to pccur in the sequence.

The reason that, in general, the probability of "HH" first occuring
at a given position is different from the probability if "HT" first
occurring at that position lies in the differences between the
number of possible sequences satisfying (A), and the number of
possible sequences satisfying (B).

The first few cases ("HH" or "HT" first occurring at (k+1), so
that the position of the first "H" in "HH" or "HT" is at k) are,
with their probabilities:

k=1:       HH     HT
          1/4    1/4

K=2:      THH     HHT
                  THT
          1/8     2/8

k=3:     TTHH     HHHT
         HTHH     THHT
                  TTHT
         2/16     3/16

k=4:    TTTHH     HHHHT
        THTHH     THHHT
        HTTHH     TTHHT
                  TTTHT
         3/32     4/32

The "HT" case is simple:
  P.HT[k] = Prob(1st "HT" at (k+1)) = k/(2^(k+1))
Exercise for the reader: Sum(P.HT) = 1

The "HH" case is more interesting. Experimental scribblings on
parer threw up an hypothesis, which I decided to explore in R.
Thanks to Gerrit Eichner for suggestion the use of expand.grid()!

  ## Function to count sequences giving 1st HH on throw k+1
  countHH <- function(k){
    M <- as.matrix(expand.grid(rep(list(0:1),k)))
    ix <- (M[,k]==1) ## k must be an H (then k+1 will be H)
    for(i in (1:(k-1))){ ix<-ix&( !((M[,i]==1)&(M[,i+1]==1)) ) }
    sum(ix)
    ## list(Count=sum(ix),Which=M[ix,])
  }

Now, ignoring the case k=1:

  HHcounts <- NULL
  for(i in (2:12)){ HHcounts<-c(HHcounts,countHH(i)) }
  rbind((3:13),HHcounts)

  #         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
  #            3    4    5    6    7    8    9   10   11    12    13
  #HHcounts    1    2    3    5    8   13   21   34   55    89   144

Lo and Behold, we have a Fibonnaci sequence! Another exercise for
the reader ...

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 01-Sep-09                                       Time: 10:38:58
------------------------------ XFMail ------------------------------