[R] Offtopic, HT vs. HH in coin flips

Mon Aug 31 22:35:11 CEST 2009

Part of my issue was that I was not answering my original question.  "What is more likely to show up first, HT or HH?" The answer to that turns out to be "neither", or "identical chances". 

ht <- replicate(2500,
                paste(sample(c("H", "T"), 100, replace = TRUE),
                      collapse = ""))

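## regexpr() gives the start of the first match; adding 1 gives the
## toss on which the two-character pattern is completed
hts <- regexpr("HT", ht) + 1
hhs <- regexpr("HH", ht) + 1

## which is first?
table(hts < hhs)  # about 50/50

summary(hts)      # mean of about 4
summary(hhs)      # mean of about 6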

So, "What is more likely to show up first, HH or HT?" is of course a different question from "Are the expected positions of the first HT and the first HH the same?"  I suppose that's where the confusion set in.  It seems that if HH appears later in the string on average (i.e., after 6 tosses instead of 4), then the probability of it coming first should be lower than for HT, but that intuition is wrong: once the first H appears, the very next toss decides which pattern comes first, and it is H or T with equal probability.
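
The means of 4 and 6 can also be checked without simulation.  What follows is only a sketch, not something from the thread: each pattern is treated as a small Markov chain with two transient states ("no progress" and "last toss was H"), and the expected number of tosses t solves (I - Q) t = 1, where Q holds the transition probabilities among those states.  The names expected_wait, Q_ht and Q_hh are made up for illustration, and a fair coin is assumed.

```r
## Expected number of tosses until a pattern is completed, starting
## from the first transient state ("no progress").
## Q: transition probabilities among the transient states only.
expected_wait <- function(Q) {
  n <- nrow(Q)
  solve(diag(n) - Q, rep(1, n))[1]
}

## States: 1 = no progress, 2 = last toss was H.

## Waiting for HT: from state 2, another H stays in state 2 and
## a T completes the pattern (leaves the transient states).
Q_ht <- matrix(c(0.5, 0.5,
                 0.0, 0.5), nrow = 2, byrow = TRUE)

## Waiting for HH: from state 2, a T falls back to state 1 and
## an H completes the pattern.
Q_hh <- matrix(c(0.5, 0.5,
                 0.5, 0.0), nrow = 2, byrow = TRUE)

wait_ht <- expected_wait(Q_ht)
wait_hh <- expected_wait(Q_hh)
c(HT = wait_ht, HH = wait_hh)
```

The two chains differ only in what happens after an H: a wrong toss while waiting for HT (another H) keeps the progress, whereas a wrong toss while waiting for HH (a T) throws it away.  That would explain why HH takes longer on average even though the race between the two is an even bet.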

A quick graphic that helps show this (you must run the above code first):

library(lattice)

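## hts and hhs each have one entry per string, so each label must be
## repeated length(hts) times to line up with c(hts, hhs)
ht.df <- data.frame(count = c(hts, hhs),
                    type = gl(2, length(hts), labels = c("HT", "HH")))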

barchart(prop.table(xtabs(~ count + type, data = ht.df)),
         stack = FALSE, horizontal = FALSE,
         box.ratio = .8, auto.key = TRUE)
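
For comparison with the simulated chart, the exact distributions can be written down, again only as a sketch assuming a fair coin (the names n, fib, p_ht and p_hh are mine).  The first HT is completed on toss n with probability (n - 1) / 2^n, since the first n tosses must be some T's (possibly none), then at least one H, then a T.  The first HH is completed on toss n with probability F(n - 1) / 2^n, where F is the Fibonacci sequence starting 1, 1; it counts the ways to fill the first n - 2 tosses with no HH and no trailing H.

```r
n <- 2:200                      # toss on which the pattern is completed

## HT: tosses 1..n must look like T...T H...H T, with at least one H
p_ht <- (n - 1) / 2^n

## HH: Fibonacci numbers F(1), ..., F(199), with F(1) = F(2) = 1
fib <- numeric(length(n))
fib[1:2] <- 1
for (i in 3:length(n)) fib[i] <- fib[i - 1] + fib[i - 2]
p_hh <- fib / 2^n

## both sets of probabilities sum to (numerically) 1, and the means
## agree with the simulated means of about 4 and 6
c(HT = sum(n * p_ht), HH = sum(n * p_hh))

## probabilities for tosses 2 through 9
round(rbind(HT = p_ht, HH = p_hh)[, 1:8], 3)
```

Cutting the sums off at 200 tosses is an arbitrary choice; the probability left beyond that point is negligible for both patterns.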

Thanks to all those who replied.  Someone also sent me the following link off list, which clears up the confusion as well:

http://www.mit.edu/~emin/writings/coinGame.html

Best, 
Erik 

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Erik Iverson
Sent: Monday, August 31, 2009 2:17 PM
To: r-help at r-project.org
Subject: [R] Offtopic, HT vs. HH in coin flips

Dear R-help, 

Could someone please try to explain this paradox to me? What is more likely to show up first in a string of coin tosses, "Heads then Tails", or "Heads then Heads"?  

##generate 2500 strings of random coin flips
ht <- replicate(2500,
                paste(sample(c("H", "T"), 100, replace = TRUE),
                      collapse = ""))

## find first occurrence of HT
mean(regexpr("HT", ht))+1    #mean of HT position, 4

## find first occurrence of HH
mean(regexpr("HH", ht))+1    #mean of HH position, 6

FYI, this is not homework; I have not been in school in years.  I saw a similar problem posed in a blog post on the Revolutions R blog, and although I believe the answer, I'm having a hard time figuring out why it should be so.

Thanks,
Erik Iverson
