[R] hist function: freq=FALSE for standardised histograms
Marco Geraci
marcodoc75 at yahoo.com
Wed Apr 5 22:16:15 CEST 2006
Hi,
how did you evaluate the total area?
Here is a simple example
###
set.seed(100)
x <- rnorm(100)
x.h <- hist(x, plot=FALSE)  # freq only affects the plot; $density is returned either way
> x.h
$breaks
[1] -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0
$counts
[1] 3 4 9 14 22 20 13 7 5 2 1
$intensities
[1] 0.05999999 0.08000000 0.18000000 0.28000000 0.44000000 0.40000000
[7] 0.26000000 0.14000000 0.10000000 0.04000000 0.02000000
$density
[1] 0.05999999 0.08000000 0.18000000 0.28000000 0.44000000 0.40000000
[7] 0.26000000 0.14000000 0.10000000 0.04000000 0.02000000
$mids
[1] -2.25 -1.75 -1.25 -0.75 -0.25 0.25 0.75 1.25 1.75 2.25 2.75
$xname
[1] "x"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
> sum(diff(x.h$breaks)*x.h$density)
[1] 1
# You can also verify that bin area times n (here 100) recovers the counts
> diff(x.h$breaks)*x.h$density*100
[1] 2.999999 4.000000 9.000000 14.000000 22.000000 20.000000 13.000000
[8] 7.000000 5.000000 2.000000 1.000000
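The same two checks should carry over to unequal bin widths, as long as each height is weighted by its own width. Below is a minimal sketch on the same simulated x. The break points in `brks` are hypothetical, chosen only for illustration and padded so that they cover every value of x.

```r
set.seed(100)
x <- rnorm(100)

# Hypothetical unequal breaks (an assumption for this sketch),
# padded at both ends so that every observation falls inside a bin
brks <- c(min(x) - 1, -1, -0.5, 0, 0.5, 1, max(x) + 1)

x.u <- hist(x, breaks = brks, plot = FALSE)
widths <- diff(x.u$breaks)

# Total area: each bar height times its own width
area <- sum(widths * x.u$density)

# Bin area times n should recover the counts
recovered <- widths * x.u$density * length(x)

print(x.u$equidist)
print(area)
print(all.equal(recovered, as.numeric(x.u$counts)))
```

Summing the heights alone would not give 1 here; only the width-weighted sum does.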
HTH
Marco
--- Alex Davies <alex at davz.net> wrote:
> Dear All,
>
> I am an undergraduate using R for the first time. It seems like an
> excellent program and one that I look forward to using a lot over the
> next few years, but I have hit a very basic problem that I can't solve.
>
> I want to produce a standardised histogram, i.e. one where the area
> under the graph is equal to 1. I look at the manual for the histogram
> function and find this:
>
>     freq: logical; if 'TRUE', the histogram graphic is a representation
>           of frequencies, the 'counts' component of the result; if
>           'FALSE', probability densities, component 'density', are
>           plotted (so that the histogram has a total area of one).
>           Defaults to 'TRUE' _iff_ 'breaks' are equidistant (and
>           'probability' is not specified).
>
> I therefore expect that the following command:
>
> > h <- hist(StockReturns, freq=FALSE)
>
> where StockReturns has the following data in it:
>
> > sourcedata$StockReturns
> [1] -0.006983 0.111565 0.053782 0.027966 0.068956 0.165424 -0.022133
> [8] -0.001910 0.052174 0.072589 -0.023002 0.000521 -0.015688 0.148459
> [15] 0.054111 0.141044 0.096686 -0.012256 -0.030397 0.039365 0.021407
> [22] -0.175750 0.053901 -0.095730 0.129717 0.333333 0.061563 0.085052
> [29] 0.072295 -0.008500 0.100000 0.020000 -0.199763 0.081856 0.013636
> [36] 0.007812 0.038647 -0.026945 0.037965 -0.079889 0.056234 -0.083333
> [43] -0.012792 0.131711 0.015996 0.008149 0.104568 0.004046 -0.027750
> [50] 0.050802 0.045714 0.092327 -0.017857 0.022574 0.083333 0.051366
> [57] 0.004215 0.083228 0.046803 0.021335 0.023797 0.094891 0.036541
> [64] 0.016423 -0.126365 0.034219 0.098330 0.079292 -0.009901 0.021559
> [71] -0.039414 0.114286 0.101856 -0.010452 0.111111 0.097274 0.104843
> [78] 0.144439 0.021868 0.106667 0.081250 0.002097 0.073302 0.087889
> [85] -0.145165 0.014592 0.035000 0.131711 -0.126937 0.133989
>
> would result in a graph that has an area equal to 1.000. However, it
> does not: it produces frequency densities, not standardized frequency
> densities. Can someone point me in the right direction here? I know I
> am being fantastically thick but can't find out how to do such a
> simple operation!
>
> My complete set of commands looks like this:
>
> > sourcedata <- read.table("c:/data.dat",header=T)
> > attach(sourcedata)
> > h <- hist(StockReturns, col='red', labels=TRUE,
> +           ylab="Frequency Density", probability=TRUE)
>
> Where c:\data.dat is a file with the numbers above in it, one per
> line, and the first line containing the string "StockReturns".
>
> Many thanks,
>
> Alex Davies
>
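For the StockReturns values quoted above, the area check can be run directly. This is only a sketch: the vector is typed in from the quoted output rather than read from c:/data.dat, and the reading that the bar heights were taken for probabilities is a guess at the confusion, not something confirmed in the thread. Because the bins are much narrower than 1, the heights can be well above 1 and need not sum to 1, while heights times widths still should.

```r
# Values copied from the quoted message (90 observations)
StockReturns <- c(
  -0.006983,  0.111565,  0.053782,  0.027966,  0.068956,  0.165424, -0.022133,
  -0.001910,  0.052174,  0.072589, -0.023002,  0.000521, -0.015688,  0.148459,
   0.054111,  0.141044,  0.096686, -0.012256, -0.030397,  0.039365,  0.021407,
  -0.175750,  0.053901, -0.095730,  0.129717,  0.333333,  0.061563,  0.085052,
   0.072295, -0.008500,  0.100000,  0.020000, -0.199763,  0.081856,  0.013636,
   0.007812,  0.038647, -0.026945,  0.037965, -0.079889,  0.056234, -0.083333,
  -0.012792,  0.131711,  0.015996,  0.008149,  0.104568,  0.004046, -0.027750,
   0.050802,  0.045714,  0.092327, -0.017857,  0.022574,  0.083333,  0.051366,
   0.004215,  0.083228,  0.046803,  0.021335,  0.023797,  0.094891,  0.036541,
   0.016423, -0.126365,  0.034219,  0.098330,  0.079292, -0.009901,  0.021559,
  -0.039414,  0.114286,  0.101856, -0.010452,  0.111111,  0.097274,  0.104843,
   0.144439,  0.021868,  0.106667,  0.081250,  0.002097,  0.073302,  0.087889,
  -0.145165,  0.014592,  0.035000,  0.131711, -0.126937,  0.133989
)

# plot = FALSE returns the histogram object without drawing it;
# the density component is computed regardless of freq/probability
h <- hist(StockReturns, plot = FALSE)
widths <- diff(h$breaks)

heights_sum <- sum(h$density)        # sum of bar heights, not an area
area <- sum(widths * h$density)      # total area under the histogram

print(widths)
print(heights_sum)
print(area)
```

If `area` prints as 1 while `heights_sum` does not, the histogram is already standardised and only the y-axis scale looks unfamiliar.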