[R] quantile regression problem

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Sun Dec 11 00:10:18 CET 2005

On 10-Dec-05 zuzmun at natur.cuni.cz wrote:
> Dear List members,
> I would like to ask for advice on quantile regression in R.
> I am trying to perform an analysis of a relationship between
> species abundance and its habitat requirements -
> the habitat requirements are, however, codes - 0,1,2,3... where 0<1<2<3
> and the scale is linear - so I would be happy to treat them as
> continuous

As well as Roger Koenker's comments, you may also wish to consider
the following.

(By the way, despite what you say above, you have "codes" at
values 0, 0.5, 1, 1.5, 2, 3 -- is there anything special about
the 0.5 and 1.5, or are they on the same footing as 0, 1, 2, 3?
Also, I am curious as to why "habitat requirement" is named
"absdeviation" in the data file. What does "habitat requirement"

> The analysis of the data somehow does not work, I am trying to
> perform linear quantile regression using rq function and I cannot
> figure out whether there is a way to analyse the data using quantile
> regression (I would really like to do this since the shape is an
> envelope) or whether it is not possible.

As Roger noted, the distribution of data is very variable over
the values of "absdeviation":

absdeviation:       0      0.5    1      1.5    2      3 
Number of data:   673     15    493      3     19     20 
Total data: 1223
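
To make the imbalance concrete, here is a small sketch in R that works only from the counts in the table above, not from the data file itself. The variable names are my own choice for illustration.

```r
# Group sizes as reported in the table above (not the raw data file).
n <- c("0" = 673, "0.5" = 15, "1" = 493, "1.5" = 3, "2" = 19, "3" = 20)

total <- sum(n)                      # total number of observations
share <- round(100 * n / total, 1)   # percentage of the data in each group

print(rbind(count = n, percent = share))

# Share of all observations that fall in the two dominant groups.
dominant <- sum(n[c("0", "1")]) / total
cat("H=0 and H=1 together:", round(100 * dominant, 1), "% of", total, "\n")
```

With the raw data read into a data frame, say d (a hypothetical name), table(d$absdeviation) should give the same counts.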

Therefore you chiefly have information about the cases "0" and "1".

I have looked at the data the opposite way round from you: For each
value of "absdeviation" ("H" for "habitat" in the following),
consider the values of "abundance" (A).

For H=0 and H=1, the values of A are quite well approximated by
a negative exponential distribution, though the fit is better
for H=1 than for H=0 -- in a more careful examination, I would try
to emulate for the continuous variable A a distribution inspired
by the logarithmic distribution p(n) = -(t^n)/(n*log(1-t)), n=1,2,3...
which is a classic distribution for the probability that a species
will be represented by n individuals in a sample of a large number
of species whose different abundances are variable
(Fisher, Corbet and Williams, and much later work).
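
For reference, a minimal sketch of the logarithmic (log-series) distribution in R, written with the sign that makes it a proper probability function and with n starting at 1. The function name dlogseries and the value t = 0.9 are my own choices for illustration; nothing here is fitted to the data.

```r
# Log-series probability mass function, n = 1, 2, 3, ..., 0 < t < 1.
# The leading minus sign makes p(n) positive, because log(1 - t) < 0.
dlogseries <- function(n, t) {
  stopifnot(t > 0, t < 1, all(n >= 1))
  -(t^n) / (n * log(1 - t))
}

t <- 0.9                       # illustrative value only, not fitted to the data
n <- 1:2000                    # long enough that the truncated tail is negligible
p <- dlogseries(n, t)

total_prob <- sum(p)           # should be very close to 1
mean_n     <- sum(n * p)       # numerical mean, to compare with the closed form
mean_exact <- -t / ((1 - t) * log(1 - t))

cat("sum of p(n):", total_prob, "\n")
cat("mean (numeric):", mean_n, " mean (closed form):", mean_exact, "\n")
```

This is the discrete distribution only; adapting it to the continuous variable A is the part that would need more careful thought.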

The mean A for H=0 is m0 = 0.09389265 (n0=673), and
the mean A for H=1 is m1 = 0.08407791 (n1=493).

with respective standard deviations

  s0 = 0.1262238
  s1 = 0.08952975

on the basis of which

  (m0-m1)/(sqrt((s0^2)/n0 + (s1^2)/n1)) = 1.553156

which is not particularly large. While the histograms of A for
H=0 and for H=1 do somewhat indicate a tendency for higher values
of A to occur when H=0 than when H=1, there are only a few of these.
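
The arithmetic above can be reproduced from the quoted summary statistics alone, as in the sketch below. The two-sided p-value from the normal approximation and the sd/mean ratios are additions of mine: for an exponential distribution the standard deviation equals the mean, so a ratio near 1 is a crude check on the exponential shape.

```r
# Summary statistics quoted in the text (A = abundance, H = absdeviation).
m0 <- 0.09389265; s0 <- 0.1262238;  n0 <- 673   # H = 0
m1 <- 0.08407791; s1 <- 0.08952975; n1 <- 493   # H = 1

# Standardised difference of means, allowing unequal variances.
se <- sqrt(s0^2 / n0 + s1^2 / n1)
z  <- (m0 - m1) / se

# Two-sided p-value from the normal approximation (an assumption of mine;
# it should be reasonable for group sizes this large).
p_two_sided <- 2 * pnorm(-abs(z))

# For an exponential distribution sd equals mean, so sd/mean near 1
# is a crude indicator of an exponential shape.
cv0 <- s0 / m0
cv1 <- s1 / m1

cat("z =", z, " two-sided p =", p_two_sided, "\n")
cat("sd/mean: H=0", cv0, " H=1", cv1, "\n")
```

The ratio is closer to 1 for H=1 than for H=0, which is in line with the exponential fit being better for H=1.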

So on a first look, I am induced to conclude that there is
little evidence in the two dominant data groups (H=0 and H=1)
to indicate that these two groups differ. I doubt that the
information for the H=0.5, H=1.5, H=2 and H=3 would have
more than a slight effect on this (though I have not looked
in detail).

The corresponding means, however, are

  m0.5 = 0.1273273    (n = 15)
  m1.5 = 0.03003003   (n =  3)
  m2   = 0.02908183   (n = 19)
  m3   = 0.03830066   (n = 20)

which at first sight do suggest that, while m0.5 is similar
to m0 and m1 above, m1.5 and m2 and m3 are distinctly smaller.
However, for m1.5 this is based on a very small sample, and
in any case the distribution of the raw values of A is so skew
that the larger values of A occurring for H=0 and H=1 are unlikely
to occur in such small samples.
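
A rough sketch of the small-sample point, assuming (as a working approximation only) that A is exponential with the same mean in every group. The common mean m = 0.09 and the threshold a = 0.5 are values I have picked for illustration; the function name is likewise my own.

```r
# Probability that at least one of n independent exponential draws
# with mean m exceeds the threshold a.
p_any_above <- function(n, m, a) {
  1 - pexp(a, rate = 1 / m)^n
}

m <- 0.09      # rough common mean, close to m0 and m1 in the text
a <- 0.5       # hypothetical "large" abundance, chosen only for illustration

n <- c(3, 15, 19, 20, 493, 673)   # group sizes from the table
prob <- p_any_above(n, m, a)

print(data.frame(n = n, prob_any_above = round(prob, 3)))
```

Under these assumptions a group of 20 or fewer would seldom contain a value that large, while a group of several hundred almost always would, even though the underlying distribution is the same.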

Therefore, preliminary conclusion: I cannot see strong evidence
of a relationship between "absdeviation" and abundance.

Hoping this is useful,
Best wishes,

> I tested that if I replace the categories with continuous
> data of the same range it works perfectly. In the form I have
> them (and I cannot change it) I am getting errors - mainly
> about non-positive fis.
> Could somebody please let me know whether there was a way to
> analyse the  data?
> The data are enclosed and the question is
> Is there a relationship between abundance and absdeviation?
> I am interested in the upperlimit so I wanted to analyze the upper 5%.
> Thanks a lot for your help
> All the best
> Zuzana Munzbergova
> www.natur.cuni.cz/~zuzmun

E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 10-Dec-05                                       Time: 23:10:15
------------------------------ XFMail ------------------------------
