[R] Density Estimation

Adelchi Azzalini aa at tango.stat.unipd.it
Thu Jun 8 21:30:31 CEST 2006


On Thu, Jun 08, 2006 at 08:31:26PM +0200, Pedro Ramirez wrote:
> >In mathematical terms the optimal bandwidth for density estimation
> >decreases at rate n^{-1/5}, while the one for distribution function
> >decreases at rate n^{-1/3}, if n is the sample size. In practical terms,
> >one must choose an appreciably smaller bandwidth in the second case
> >than in the first one.
> 
> Thanks a lot for your remark! I was not aware of the fact that the
> optimal bandwidths for density and distribution do not decrease
> at the same rate.
> 
> >Besides the computational aspect, there is a statistical one:
> >the optimal choice of bandwidth for estimating the density function
> >is not optimal (and possibly not even just sensible) for estimating
> >the distribution function, and the stated problem is equivalent to
> >estimation of the distribution function.
> 
> The given interval "0<x<3" was only an example, in fact I would
> like to estimate the probability for intervals such as
> 
> "0<=x<1" , "1<=x<2" , "2<=x<3" , "3<=x<4" , ....
> 
> and compare it with the estimates of a corresponding histogram.
> In this case the stated problem is not anymore equivalent to the
> estimation of the distribution function. What do you think, can

why not? the probabilities you are interested in are of the form

F(1)-F(0), F(2)-F(1), and so on

where F(.) is the cumulative distribution function (and it must
be continuous, since its derivative exists).
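
As a rough sketch of the point above (not code from the original thread): with a Gaussian kernel, the kernel estimate of F(.) has a closed form, so the interval probabilities are plain differences and no numerical integration is needed. The helper name kcdf and the choice h <- bw.nrd0(x) are assumptions made only for illustration; bw.nrd0 is the default rule of density(), which targets the density and is therefore likely too large for this purpose.

```r
x <- c(2, 1, 3, 2, 3, 0, 4, 5, 10, 11, 12, 11, 10)

# Gaussian-kernel estimate of the distribution function at points t.
# kcdf is a hypothetical helper; h is the kernel standard deviation.
kcdf <- function(t, x, h) {
  sapply(t, function(ti) mean(pnorm((ti - x) / h)))
}

# Assumption: bandwidth taken from the density() default rule,
# used here only to have a number to work with.
h <- bw.nrd0(x)

breaks <- 0:4
Fhat <- kcdf(breaks, x, h)

# Estimated probabilities F(1)-F(0), F(2)-F(1), F(3)-F(2), F(4)-F(3)
p_kernel <- diff(Fhat)

# Histogram-type counterpart: sample proportions in [0,1), [1,2), ...
p_hist <- as.vector(table(cut(x, breaks, right = FALSE))) / length(x)

print(cbind(lower = breaks[-length(breaks)], p_kernel, p_hist))
```

Rerunning with a smaller h shows how sensitive the interval estimates are to the bandwidth.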

> I go ahead in this case with the optimal bandwidth for the
> density? Thanks a lot for your help!

no
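
To give a feel for the two rates quoted above, here is a small sketch comparing n^{-1/5} with n^{-1/3} for a few sample sizes. It compares the rates only: the multiplying constants depend on the kernel and on the unknown density and are deliberately left out, so the numbers are not bandwidths to be used in practice.

```r
# Sample sizes chosen arbitrarily for illustration (13 is the size of x above)
n <- c(13, 100, 1000, 10000)

rate_density <- n^(-1/5)   # rate of the optimal bandwidth for the density
rate_cdf     <- n^(-1/3)   # rate of the optimal bandwidth for the distribution function

# Ratio of the two rates; algebraically this is n^(-2/15)
ratio <- rate_cdf / rate_density

print(data.frame(n, rate_density, rate_cdf, ratio))
```

The ratio is below 1 for every n > 1 and keeps shrinking as n grows, which is the sense in which the second bandwidth must be appreciably smaller.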

best wishes,

Adelchi

> Best wishes
> Pedro
> 
> >best wishes,
> >
> >Adelchi
> >
> >
> >PR>
> >PR> >
> >PR> >--
> >PR> >Gregory (Greg) L. Snow Ph.D.
> >PR> >Statistical Data Center
> >PR> >Intermountain Healthcare
> >PR> >greg.snow at intermountainmail.org
> >PR> >(801) 408-8111
> >PR> >
> >PR> >
> >PR> >-----Original Message-----
> >PR> >From: r-help-bounces at stat.math.ethz.ch
> >PR> >[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Pedro
> >PR> >Ramirez Sent: Wednesday, June 07, 2006 11:00 AM
> >PR> >To: r-help at stat.math.ethz.ch
> >PR> >Subject: [R] Density Estimation
> >PR> >
> >PR> >Dear R-list,
> >PR> >
> >PR> >I have made a simple kernel density estimation by
> >PR> >
> >PR> >x <- c(2,1,3,2,3,0,4,5,10,11,12,11,10)
> >PR> >kde <- density(x,n=100)
> >PR> >
> >PR> >Now I would like to know the estimated probability that a new
> >PR> >observation falls into the interval 0<x<3.
> >PR> >
> >PR> >How can I integrate over the corresponding interval?
> >PR> >In several R-packages for kernel density estimation I did not
> >PR> >find a corresponding function. I could apply Simpson's Rule for
> >PR> >integrating, but perhaps somebody knows a better solution.
> >PR> >
> >PR> >Thanks a lot for help!
> >PR> >
> >PR> >Pedro
> >PR> >
> >PR> >______________________________________________
> >PR> >R-help at stat.math.ethz.ch mailing list
> >PR> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PR> >PLEASE do read the posting guide!
> >PR> >http://www.R-project.org/posting-guide.html
> >PR> >
> 

-- 
Adelchi Azzalini  <azzalini at stat.unipd.it>
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/


