[R] adding variable

Tue Nov 18 22:53:42 CET 2003

On Tue, 18 Nov 2003, Martin Wegmann wrote:

> OK, I will try to explain it more clearly. 
> 
> I am not looking for step() add1() drop1() or similar commands. Nothing to do 
> with variable selection.
> 
> I have two data frames, one with environmental variables and another one with 
> animal data (let's say absence/presence of 10 species).
> 
> First I look at which env. variables explain the presence of species 1:
> 
> glm(species1~env.var1+env.var2+.....) -> glm.spec1
> 
> step(glm.spec1) -> glm.spec1.step
> 
> I get certain env. variables which have the biggest explanatory power.
> 
> Now I would like to treat the absence/presence data of the other species like 
> env. variables which could influence the presence of species1.
> I include the env. variables retained in glm.spec1.step (I call them env.varX+...):
> 
> glm(species1~env.varX+......+species2) -> glm.species1.sp2
> 
> 
> glm(species1~env.varX+......+species3) -> glm.species1.sp3
> 
> 
> and this procedure shall be done for all remaining species.
> 
> I am looking for a method to automatically add each of species2 up to species10 
> in turn and run glm(). 

That is what add1 does.
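
A rough sketch of what that could look like, not run on your data. The data
frame and its values below are simulated stand-ins, and the column names
(env.var1, species2, ...) are only assumed from your description. The base
model keeps the env. terms fixed and add1() tries each candidate in scope
singly:

```r
# Simulated stand-in data; every name and value here is hypothetical.
set.seed(1)
n <- 200
dat <- data.frame(env.var1 = rnorm(n), env.var2 = rnorm(n),
                  species1 = rbinom(n, 1, 0.5),
                  species2 = rbinom(n, 1, 0.5),
                  species3 = rbinom(n, 1, 0.5))

# Base model: only the env. variables kept after step()
base <- glm(species1 ~ env.var1 + env.var2, family = binomial, data = dat)

# Each term listed in scope is added to the base model one at a time
res <- add1(base, scope = ~ . + species2 + species3, test = "Chisq")
print(res)
```

Each row of the result is the base model plus one species term, so the rows
can be compared by deviance or AIC without fitting anything by hand.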

> The first part with the env. variables shall be kept as it is, but the last 
> variable (speciesX) shall change each time. I am looking for something 
> like a placeholder: for each run, the command picks a different species from 
> the species data frame and adds it in place of the placeholder.
> 
> I hope I explained it better. thanks Martin
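
If the fitted objects themselves are wanted (one per added species), a loop
over update() is one possible way to get the "placeholder" behaviour. This is
only a sketch on simulated data; the names dat, others and fits are made up
for the illustration:

```r
# Simulated stand-in data; every name and value here is hypothetical.
set.seed(2)
n <- 200
dat <- data.frame(env.var1 = rnorm(n), env.var2 = rnorm(n),
                  species1 = rbinom(n, 1, 0.5),
                  species2 = rbinom(n, 1, 0.5),
                  species3 = rbinom(n, 1, 0.5))

base <- glm(species1 ~ env.var1 + env.var2, family = binomial, data = dat)

# Names of the species columns to try, one per fit
others <- c("species2", "species3")

# Refit the base model once per species, keeping the env. terms unchanged
fits <- lapply(others, function(sp) {
  update(base, as.formula(paste(". ~ . +", sp)))
})
names(fits) <- paste("glm.species1", others, sep = ".")

# Compare the fits, e.g. by AIC
print(sapply(fits, AIC))
```

The list element names play the role of glm.species1.sp2, glm.species1.sp3
and so on, so an individual fit can be pulled out with
fits[["glm.species1.species2"]].
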
> 
> On Tuesday 18 November 2003 22:14, Prof Brian Ripley wrote:
> > Are you looking for something like add1 then?
> >
> > We do need a much clearer explanation of what you are trying to do to be
> > able to help you: and not with y used in two separate senses!
> >
> > On Tue, 18 Nov 2003, Martin Wegmann wrote:
> > > On Tuesday 18 November 2003 19:32, Prof Brian Ripley wrote:
> > > > On Tue, 18 Nov 2003, Martin Wegmann wrote:
> > > > > I have count data of animals (here y, y1, y2...) and env. variables
> > > > > (x, x1, x2 ,....).
> > > > >
> > > > > I used a glm
> > > > >
> > > > > glm(y~x1+x2+x3....)
> > > > >
> > > > > glm(y1~x1+x2+x3....)
> > > > >
> > > > > and now I would like to add the count data of other species to
> > > > > investigate if they might have a bigger impact than the env.
> > > > > variables:
> > > > >
> > > > > #x? are the selected var from the first glm run
> > > > >
> > > > > glm(y~x?+x?+y1)
> > > > >
> > > > > glm(y~x?+x?+x?+y2)
> > > > >
> > > > > ....
> > > > >
> > > > > I wonder if there is a more elegant method to do this than adding
> > > > > (and removing) each y by hand.
> > > >
> > > > Do you mean each x?  In either case, see ?update.
> > >
> > > update looks good, but with update I still have to add each y
> > > manually.
> > >
> > > I thought something like doing
> > >
> > > glm(y~x+x1+x2+....+y§)
> > >
> > > where y§ means: take y1 out of df.y, run glm and name the result, then
> > > take y2 out of df.y, run glm .....
> > >
> > > until every y in df.y has been included in the model once.
> > > Each time only one y§ is to be included.
> > >
> > > The included x's have to be kept. I only want to see whether one of the
> > > species variables has more explanatory power than the env. variables.
> > >
> > > Perhaps this helps to understand what I am looking for:
> > > I think bash scripts are not possible in R, but it would look like this
> > > bash script for GRASS:
> > >
> > > for variable in y1 y2 y3  ....; do
> > >
> > > glm(y~x+x1+x2....+$variable)->glm.$variable
> > > ; done
> > >
> > > #where $variable refers to the name of read in y's.
> > >
> > >
> > > Martin
> 
> 
> 

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595