[R] Logistic regression problem: propensity score matching

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Jun 4 08:45:18 CEST 2003


1) Why are you using multinom when this is not a multinomial logistic 
regression?  You could just use a binomial glm.

2) The second argument to predict() is `newdata'.  `sample' is an R 
function, so what did you mean to have there?  I think the predictions 
should be a named vector if `sample' is a data frame.

3) There are many more examples of such things (and more explanation) in 
Venables & Ripley's MASS (the book).
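
As a rough illustration of points 1) and 2), here is a minimal sketch using simulated data. The data frame `dat`, the response name `insample` (chosen to avoid clashing with the R function `sample`) and the identifiers are all hypothetical stand-ins, not the poster's actual data.

```r
## Simulated stand-in for the real data; every name and value here is hypothetical.
set.seed(1)
n <- 200
dat <- data.frame(
  insample = rbinom(n, 1, 0.4),                       # 1 = in the action-area sample
  AGE      = sample(18:60, n, replace = TRUE),
  GENDER   = factor(sample(c("F", "M"), n, replace = TRUE)),
  row.names = paste0("ID", seq_len(n))                # unique identifier as row names
)

## A binomial glm in place of multinom() for a two-level response
fit <- glm(insample ~ AGE + GENDER, family = binomial, data = dat)

## newdata must be a data frame; type = "response" gives fitted probabilities
p <- predict(fit, newdata = dat, type = "response")

## p is a numeric vector named by the row names of newdata,
## so the identifier can be carried along for export
out <- data.frame(id = names(p), prob = unname(p), stringsAsFactors = FALSE)
head(out)
```

Passing the data via `data =` rather than attach() keeps the row names tied to the fitted values.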

On Wed, 4 Jun 2003, Paul Bivand wrote:

> I am doing one part of an evaluation of a mandatory welfare-to-work 
> programme in the UK.
> As with all evaluations, the problem is to determine what would have 
> happened if the initiative had not taken place.
> In our case, we have a number of pilot areas and no possibility of 
> random assignment.
> Therefore we have been given control areas.
> My problem is to select for survey individuals in the control areas who 
> match as closely as possible the randomly selected sample of action area 
> participants.
> As I understand the methodology, the procedure is to run a logistic 
> regression to determine the odds of a case being in the sample, across 
> both action and control areas, and then choose for control sample the 
> control area individual whose odds of being in the sample are closest to 
> an actual sample member.
> 
> So far, I have been following the multinomial logistic regression example in 
> Fox's Companion to Applied Regression.
> Firstly, I would like to know whether predict() is producing odds ratios 
> (or probabilities) for being in the sample, which is what I am aiming 
> for.

You asked for `probs', so you got probabilities.
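
Once probabilities are available, the matching step described in the question could be sketched roughly as follows. The probability values and identifiers are made up for illustration, and matching on the probability scale with replacement is an assumption; other variants (for example matching on the logit scale, or without replacement) are equally possible.

```r
## Hypothetical fitted probabilities, named by identifier
p_action  <- c(A1 = 0.62, A2 = 0.35, A3 = 0.80)
p_control <- c(C1 = 0.30, C2 = 0.58, C3 = 0.77, C4 = 0.45)

## For each action-area member, take the control whose probability is closest
## (with replacement: the same control may be picked more than once)
nearest <- sapply(p_action, function(p) {
  names(p_control)[which.min(abs(p_control - p))]
})

matches <- data.frame(action  = names(p_action),
                      control = unname(nearest),
                      stringsAsFactors = FALSE)
matches
```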

> Secondly, how do I get rownames (my unique identifier) into the 
> output from predict()? My input may be faulty somehow, with the wrong 
> rownames being picked up. I need to export the results back to the 
> database to sort them and match in names, addresses and phone numbers 
> for my selected samples.
> 
> My code is as follows:
> londonpsm <- sqlFetch(channel, "London_NW_london_pilots_elig", 
> rownames=ORCID)
> attach(londonpsm)
> mod.multinom <- multinom(sample ~ AGE + DISABLED + GENDER + ETHCODE + 
> NDYPTOT + NDLTUTOT + LOPTYPE)
> lonoutput <- predict(mod.multinom, sample, type='probs')
> london2 <- data.frame(lonoutput)
> 
> The logistic regression seems to work, although summary() says that it is 
> not a matrix.

What is `it'?

> The output looks like odds ratios, but I would like to know whether this 
> is so.

No.
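
For reference, a probability p corresponds to odds p/(1 - p), and an odds ratio is the ratio of two such odds. A small sketch with illustrative values (not taken from the poster's output):

```r
p <- c(0.2, 0.5, 0.8)      # illustrative probabilities
odds <- p / (1 - p)        # corresponding odds
log_odds <- qlogis(p)      # logit, i.e. log(p / (1 - p))

## Odds ratio comparing the third case with the first
odds_ratio <- odds[3] / odds[1]
```

Probabilities always lie between 0 and 1, whereas odds and odds ratios can take any positive value.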

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



