[R] histograms

D.A.Wooff at durham.ac.uk
Tue Jun 8 12:14:44 CEST 1999

> >>>>> "PD" == Peter Dalgaard BSA <p.dalgaard at biostat.ku.dk> writes:
>     PD> "Venables, Bill (CMIS, Cleveland)" <Bill.Venables at cmis.CSIRO.AU>
>     PD> writes:
>     >> The fact that every elementary book on statistics does it this way
>     >> does not make it correct.  To be helpful, a histogram really has to
>     >> be a non-parametric density estimator, period.
>     >> 
>     >> Enough already of polemics.
>     PD> Not quite! There is a reason for doing it the other way, namely
>     PD> that the concept of a histogram generally comes before the concept
>     PD> of a probability density, pedagogically. It is very easy to explain
>     PD> that you chop up the axis into bins and count the number of data
>     PD> points that fall in each of them. I bet that half of the MDs that I
>     PD> teach never quite understand the density (hell, the author of the
>     PD> textbook I use managed to plot three identical gaussian curves with
>     PD> identical y axis but different x axes... and he's a
>     PD> statistician). So for the basic uses of the histogram, one would be
>     PD> replacing a perfectly intuitive simple unit with a substantially
>     PD> more complex one.
> I agree 100% with Peter.  
> Being a mathematician I agree with Bill that for us, a histogram is a
> (very suboptimal) density estimate;  but average statistics software users
> *do* learn histograms differently.

I hope there are many of us who agree 100% with Bill. Bad practice,
as enshrined in the default behaviour of histogram, should be
discouraged.  We should aim to introduce density-based histograms from
the outset, and the default behaviour of histograms in many packages
acts against this principle. The current default behaviour conveys a
misleading and arguably useless summary, and I don't go with the
argument that we should persist with it because it is simple to
understand where the numbers come from.
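
To make the distinction concrete, here is a minimal sketch in plain
Python (not R, and not the code of any particular package). The data
values, the break points and the helper names bin_counts and densities
are all made up for illustration. It bins a small sample with unequal
bin widths and compares raw counts with density heights, taking the
density as count / (n * bin width) so that the bar areas sum to one.

```python
# Hypothetical data and break points, chosen only for illustration.
data = [0.2, 0.5, 0.9, 1.1, 1.4, 1.8, 2.5, 3.7, 5.2, 8.9]
breaks = [0.0, 1.0, 2.0, 10.0]  # deliberately unequal bin widths


def bin_counts(values, edges):
    """Count values in right-closed bins (edges[i], edges[i+1]].

    The first bin also includes its left edge.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            lo, hi = edges[i], edges[i + 1]
            if (lo < v <= hi) or (i == 0 and v == lo):
                counts[i] += 1
                break
    return counts


def densities(counts, edges):
    """Convert counts to density heights: count / (n * bin width)."""
    n = sum(counts)
    return [c / (n * (edges[i + 1] - edges[i])) for i, c in enumerate(counts)]


counts = bin_counts(data, breaks)
dens = densities(counts, breaks)

# Total area of the density bars (height * width summed over bins).
area = sum(d * (breaks[i + 1] - breaks[i]) for i, d in enumerate(dens))

print("counts:   ", counts)
print("densities:", dens)
print("total area under density bars:", area)
```

With these made-up numbers the widest bin holds the largest count but
has the lowest density, which is the sense in which count heights can
mislead when bin widths differ. With equal bin widths the two versions
differ only in the scale of the vertical axis.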



  David Wooff, Director, Statistics and Mathematics Consultancy Unit,
  Department of Mathematical Sciences, University of Durham.
  Science Laboratories, South Road, Durham, DH1 3LE, UK.
  Tel. 0191 374 4531, Fax 0191 374 7388.

