[R] Ellipse that Contains 95% of the Observed Data

Jim Lemon jim at bitwrit.com.au
Mon Mar 29 11:20:36 CEST 2010


On 03/29/2010 07:17 PM, Barry Rowlingson wrote:
> ...
>   I think the problem as posed doesn't produce a unique ellipse. You
> could start with a circle of radius 0 centered on mean(x),mean(y) and
> then increase the radius until it has 95% of the points in it. As long
> as your points are in continuous space and with no coincident points
> then you could do a simple bisection search on the radius.
>
>   Similarly you could start with an ellipse of any eccentricity
> centered at the same point with fixed angle and do the same. And the
> ellipse doesn't even need to be centered at the mean point - it could
> be waaay over to the left and eventually as it gets bigger it will
> gobble up 95% of the points.
>
>   Obviously with bivariate normally-distributed points we tend to show
> the ellipse that is numerically derived from the mean and correlation
> of the two normals, but that's not the only ellipse that takes 95% of
> the points.
>
>   So ummm I'm not sure what you should do. What is the question you are
> trying to answer?
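
For what it's worth, the circle version Barry describes can be sketched
in a few lines. This is only an illustration on simulated data: the
names x, y, d, target and radius are mine, and it assumes no tied
distances, as he notes.

```r
# Simulated data, purely for illustration (not from the original post)
set.seed(1)
x <- rnorm(200)
y <- 0.6 * x + rnorm(200, sd = 0.8)

# Distance of every point from (mean(x), mean(y))
d <- sqrt((x - mean(x))^2 + (y - mean(y))^2)

# Number of points the circle has to contain for 95% coverage
target <- ceiling(0.95 * length(d))

# Bisection on the radius: 'hi' always covers at least 'target' points,
# 'lo' always covers fewer
lo <- 0
hi <- max(d)
while (hi - lo > 1e-8) {
  mid <- (lo + hi) / 2
  if (sum(d <= mid) >= target) hi <- mid else lo <- mid
}
radius <- hi
```

The loop ends up at the target-th smallest distance, so sort(d)[target]
would give the same radius without any search. The same idea carries
over to an ellipse of fixed shape and angle by rescaling the
coordinates first.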

So why not begin with a problem that does have a unique solution and 
see what that solution looks like?

1) If we assume that the distribution has a barycenter, then that can be 
calculated.

2) Calculate the distances of all points from the barycenter, flagging 
the 5% most distant.

3) Divide the area covered by the points into an arbitrary number of 
equal sectors radiating from the barycenter, say 10.

4) Within each sector, find the most distant "inner" point (one not 
flagged in step 2) and the least distant "outer" point (one that was 
flagged). Place a dot on the midline of the sector, at the radius of 
the "inner" point plus 5% of the difference between the two radii.

5) Join the dots.

6) Look at it.
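
A rough sketch of the steps above, in case it helps. The function name
sector_outline and its arguments are made up for this example, and it
rests on several assumptions of mine: the barycenter is taken to be the
mean of x and y, "equal sectors" means equal angles around it, a sector
with no flagged points uses its most distant inner point as is, and a
sector with no inner points gets no dot.

```r
sector_outline <- function(x, y, n_sectors = 10, keep = 0.95) {
  # Step 1: barycenter (assumed here to be the coordinate means)
  cx <- mean(x)
  cy <- mean(y)

  # Step 2: distances from the barycenter; flag the most distant 5%
  d <- sqrt((x - cx)^2 + (y - cy)^2)
  outer <- d > quantile(d, keep)

  # Step 3: assign each point to one of n_sectors equal angular sectors
  theta <- atan2(y - cy, x - cx)
  width <- 2 * pi / n_sectors
  sector <- pmin(floor((theta + pi) / width) + 1, n_sectors)

  # Step 4: one radius per sector, 5% of the way from the most distant
  # inner point to the least distant outer point
  r <- rep(NA_real_, n_sectors)
  for (s in seq_len(n_sectors)) {
    inner_d <- d[sector == s & !outer]
    outer_d <- d[sector == s & outer]
    if (length(inner_d) == 0) next        # no inner points: leave NA
    r_in <- max(inner_d)
    r[s] <- if (length(outer_d) > 0) {
      r_in + 0.05 * (min(outer_d) - r_in)
    } else {
      r_in                                # no outer points: assumption
    }
  }

  # Dots sit on the midline of each sector
  mid <- -pi + (seq_len(n_sectors) - 0.5) * width
  data.frame(sector = seq_len(n_sectors), angle = mid, radius = r,
             x = cx + r * cos(mid), y = cy + r * sin(mid))
}

# Simulated data, purely for illustration
set.seed(42)
x <- rnorm(200)
y <- 0.5 * x + rnorm(200)
dots <- sector_outline(x, y)

# Steps 5 and 6: join the dots and look at it
if (interactive()) {
  ok <- !is.na(dots$radius)
  plot(x, y, asp = 1)
  polygon(dots$x[ok], dots$y[ok], border = "red")
}
```

With only 10 sectors the outline is a coarse polygon, so some unflagged
points can still fall outside it between the dots. More sectors give a
closer fit, but sparse sectors become more likely.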

Jim


