[R] histogram scott
Duncan Murdoch
murdoch at stats.uwo.ca
Fri Feb 5 19:29:15 CET 2010
On 05/02/2010 12:21 PM, maram salem wrote:
> Dear all,
> I want to use the histogram as a density estimator, with the binwidths calculated using Scott's formula, which is
> binwidth = 3.49*ST.dev.*n^(-1/3)
> for the following data (30 data points)
> 12-9-3-6-1-23-21-7-18-16-15-4-19-22-20-2-3-18-8-10-1-7-5-4-11-12-3-9-19-7
> So first, I've tried this manually: I substituted into the above formula and got
> st.dev. = 7.02745
> and thus binwidth = 7.89313
>
> But when I used hist with breaks = "scott", that is
> h <- hist(x, breaks = "scott")
> I got the breaks in the histogram object = 0 10 20 30
> that is, the binwidth used is equal to 10, not 7.89313.
> I don't know why.
> Shouldn't they be exactly the same?
No, R prefers to put breaks on round numbers. It uses Scott's rule (or
whichever rule you name) to work out approximately how many bins there
should be, then picks nice round numbers that come close. If you want
the bins at particular exact locations, then you need to give the
breaks explicitly. As the documentation for the "breaks" argument says,
"In the last three cases the number is a suggestion only."
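
A minimal sketch of both steps, using the 30 data points from the question. It assumes that hist() passes the suggested bin count to pretty() in roughly this way; the names n_suggested, round_breaks, my_breaks and res are only illustrative. The first part shows where the round breaks 0 10 20 30 come from, and the second builds breaks with the exact Scott width by hand.

```r
x <- c(12, 9, 3, 6, 1, 23, 21, 7, 18, 16, 15, 4, 19, 22, 20,
       2, 3, 18, 8, 10, 1, 7, 5, 4, 11, 12, 3, 9, 19, 7)

# Scott's rule expressed as a suggested number of bins; this is the
# helper that breaks = "scott" refers to.
n_suggested <- nclass.scott(x)

# pretty() turns that suggestion into round break points covering the
# data, which is why the bin width ends up "nice" rather than exact.
round_breaks <- pretty(range(x), n = n_suggested)

# To get the exact Scott width, compute it and supply explicit breaks.
h <- 3.49 * sd(x) * length(x)^(-1/3)    # formula from the question
k <- ceiling(diff(range(x)) / h)        # bins needed to cover the data
my_breaks <- min(x) + h * (0:k)         # equally spaced, width h
res <- hist(x, breaks = my_breaks, plot = FALSE)

round_breaks
diff(res$breaks)                        # every bin has width h
```

Starting the explicit breaks at min(x) is an arbitrary choice; any starting point works as long as the breaks cover the whole range of the data.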
Duncan Murdoch