[R] fitting a lognormal distribution using cumulative probabilities

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Feb 22 18:35:29 CET 2008


On Sat, 23 Feb 2008, ahimsa campos-arceiz wrote:

> Dear all,
>
> I'm trying to estimate the parameters of a lognormal distribution fitted
> from some data.
>
> The tricky thing is that my data represent the time at which I recorded
> certain events. However, in many cases I don't really know when the event
> happened. I only know the time at which I recorded it as having already happened.

So this is a rather extreme form of censoring.

> Therefore I want to fit the lognormal from the cumulative distribution
> function (cdf) rather than from the probability density function (pdf).
>
> My understanding is that methods based on Maximum Likelihood (e.g. fitdistr
> {MASS}) are based on the pdf. Nonlinear least-squares methods seem to be
> based on the cdf... however I was unable to use nls {stats} for lognormal.

Not so: ML fitting can be done for censored data.  However, I don't think 
you have a valid description here: it seems you never recorded a time at 
which the event had not happened, and the most likely fit is a probability 
mass at zero (since this is a perfect explanation for your data).
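
A minimal sketch of why this happens, assuming every recorded time in treat.1 
(from the post) is treated as left-censored, so that each observation 
contributes log F(t) to the log-likelihood.  The helper loglik_left() and the 
grid of meanlog values are illustrative choices, not part of any package:

```r
# Recording times from the original post (treat.1), each treated as
# left-censored: the event happened at some unknown time before t.
treat.1 <- c(21.67, 21.67, 43.38, 35.50, 32.08, 32.08, 21.67, 21.67, 41.33,
             41.33, 41.33, 32.08, 21.67, 22.48, 23.25, 30.00, 26.00, 19.37,
             26.00, 32.08, 21.67, 26.00, 26.00, 43.38, 26.00, 21.67, 22.48,
             35.50, 38.30, 32.08)

# Log-likelihood when all observations are left-censored: sum of log F(t_i).
# (Hypothetical helper written for this illustration.)
loglik_left <- function(meanlog, sdlog, t) {
  sum(plnorm(t, meanlog = meanlog, sdlog = sdlog, log.p = TRUE))
}

# Evaluate on a decreasing grid of meanlog values with sdlog held fixed.
mus <- c(3, 2, 1, 0, -2, -5)
ll  <- sapply(mus, loglik_left, sdlog = 1, t = treat.1)
print(data.frame(meanlog = mus, loglik = ll))
```

The log-likelihood keeps rising towards its upper bound of zero as meanlog 
decreases, i.e. as the fitted distribution piles up near zero, so there is no 
finite maximum for an optimizer to find.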

To make any progress with censoring, you need to see both positive and 
negative events.  If you told us that none of these events happened before 
t=15, it would be possible to fit the model (although you would need far 
more data to get a good fit).

Generally code to handle censoring is in survival analysis: e.g. survreg() 
in package survival.  In the terminology of the latter, all your 
observations are left-censored.
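
As a hedged illustration of what such a fit could look like when both 
positive and negative records exist, the sketch below uses simulated data 
(not the data from the post): each unit is inspected once, and we only learn 
whether the event had already happened.  The sample size, inspection-time 
range and true parameters are arbitrary assumptions made for the example.

```r
library(survival)

set.seed(1)
n <- 200

# Hypothetical, simulated data (NOT from the original post).
true_time <- rlnorm(n, meanlog = 3.3, sdlog = 0.4)  # unobserved event times
inspect   <- runif(n, min = 10, max = 60)           # one inspection per unit
happened  <- true_time <= inspect                   # all that is recorded

# With type = "interval2", (NA, t) is left-censored (event before t) and
# (t, NA) is right-censored (event not yet happened at t).
lower <- ifelse(happened, NA_real_, inspect)
upper <- ifelse(happened, inspect, NA_real_)

fit <- survreg(Surv(lower, upper, type = "interval2") ~ 1,
               dist = "lognormal")

# For dist = "lognormal" the intercept estimates meanlog and the scale
# estimates sdlog.
est <- c(meanlog = unname(coef(fit)), sdlog = fit$scale)
print(est)
```

With only left-censored records, as in the posted data, the same call has 
nothing to hold the distribution away from zero; it is the right-censored 
("not yet happened") records that make the parameters estimable.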

> I found a website that explains how to fit univariate distribution functions
> based on cumulative probabilities, including a lognormal example, in Matlab:
> http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/cdffitdemo.html
>
> and other programs like TableCurve 2D seem to do this too.

Maybe, but that is a different problem.  If you have an ECDF, the jumps 
give you the data, so you can just use fitdistr().  (And you will see 
observed and fitted CDFs compared in MASS, the book.)
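
A short sketch of that different problem, assuming purely for illustration 
that the treat.1 values were exact event times rather than censored ones.  
The comparison table cmp is an illustrative construction, not taken from the 
book:

```r
library(MASS)

# treat.1 from the original post, here treated (by assumption) as exact
# event times.
treat.1 <- c(21.67, 21.67, 43.38, 35.50, 32.08, 32.08, 21.67, 21.67, 41.33,
             41.33, 41.33, 32.08, 21.67, 22.48, 23.25, 30.00, 26.00, 19.37,
             26.00, 32.08, 21.67, 26.00, 26.00, 43.38, 26.00, 21.67, 22.48,
             35.50, 38.30, 32.08)

# Maximum-likelihood fit of a lognormal to the observations.
fit <- fitdistr(treat.1, "lognormal")
print(fit)

# Compare the empirical CDF with the fitted lognormal CDF at the jump points.
Fn  <- ecdf(treat.1)
tt  <- sort(unique(treat.1))
cmp <- data.frame(time     = tt,
                  observed = Fn(tt),
                  fitted   = plnorm(tt,
                                    meanlog = fit$estimate["meanlog"],
                                    sdlog   = fit$estimate["sdlog"]))
print(cmp)
```

The same comparison can be drawn with plot(Fn) followed by curve(plnorm(x, 
...), add = TRUE) if a picture is preferred to a table.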


> There must be a straightforward method in R which I have overlooked. Any
> suggestions on how I can estimate these parameters in R, or helpful
> references, would be very much appreciated.
>
> (not sure if it helps but) here is an example of my type of data:
>
> treat.1 <- c(21.67, 21.67, 43.38, 35.50, 32.08, 32.08, 21.67, 21.67, 41.33,
>        41.33, 41.33, 32.08, 21.67, 22.48, 23.25, 30.00, 26.00, 19.37, 26.00,
>        32.08, 21.67, 26.00, 26.00, 43.38, 26.00, 21.67, 22.48, 35.50, 38.30,
>        32.08)
>
> treat.2 <- c(35.92, 12.08, 12.08, 30.00, 33.73, 35.92, 12.08, 30.00, 56.00,
>        30.00, 35.92, 33.73, 12.08, 26.00, 54.00, 12.08, 12.08, 35.92, 35.92,
>        12.08, 33.73, 35.92, 63.20, 30.00, 26.00, 33.73, 23.50, 30.00, 35.92,
>        30.00)
>
> Thank you very much!
>
> Ahimsa
>
>
> --
> ahimsa campos-arceiz
> www.camposarceiz.com

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


