[R] Study Missing Data: differentiate between percentages within the valid answers and within the different missing answers

Mon Mar 3 10:02:17 CET 2008

On 3/3/08 8:21 PM, Ericka Lundström wrote:
 > I'm trying to migrate from SPSS to R, though I have some problems with
 > getting R to distinguish between the different kinds of missing values.
...
 > Is there a smart way in R to differentiate between missing and valid,
 > and at the same time treat the categories within both missing and
 > valid as answers (like SPSS did above)?

The Hmisc package has some support for special missing values, for 
instance when reading in SAS datasets using sas.get. I don't believe 
spss.get offers the same facility, though.

You can define special missing values for a variable manually, which 
might seem a bit involved, but this could easily be automated. For your 
example, try:

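# Flag the observations that hold one of the special missing codes
special <- dataFrame$TWO %in% c("?","X")
# Record which code each flagged observation had, and its position
attr(dataFrame$TWO, "special.miss") <-
     list(codes=as.character(dataFrame$TWO[special]),
     obs=which(special))
class(dataFrame$TWO) <- c("factor", "special.miss")
# Finally set the flagged observations to NA
is.na(dataFrame$TWO) <- special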
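
As a rough sketch of how the steps above might be automated, they can be 
wrapped in a small function. The name mark_special_miss and the example 
vector are made up for illustration (I don't have your data), and the 
function only repeats the manual steps, so treat it as a starting point 
rather than a tested Hmisc facility:

```r
# Hypothetical helper (not part of Hmisc): tag the given codes in a factor
# as special missing values, repeating the manual steps shown above.
mark_special_miss <- function(x, codes) {
  special <- x %in% codes
  attr(x, "special.miss") <-
    list(codes = as.character(x[special]), obs = which(special))
  class(x) <- c("factor", "special.miss")
  is.na(x) <- special
  x
}

# Made-up example data: 3 valid answers, 2 "?" and 2 "X"
TWO <- factor(c("No", "yes", "?", "X", "No", "?", "X"))
res <- mark_special_miss(TWO, c("?", "X"))
attr(res, "special.miss")
```

To treat several variables, something like 
dataFrame[vars] <- lapply(dataFrame[vars], mark_special_miss, codes = c("?","X")) 
should work, where vars is a character vector of column names you supply.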

# Then describe (from Hmisc) gives the new percentages

library(Hmisc)
describe(dataFrame$TWO)
dataFrame$TWO
       n missing       ?       X  unique
       3       4       2       2       2

No (2, 67%), yes (1, 33%)
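
If you also want percentages within the missing categories, here is a 
minimal base R sketch. It does not use Hmisc; the data are made up to 
match the counts shown above, and the object names are only for 
illustration:

```r
# Made-up data consistent with the counts above: 3 valid, 2 "?", 2 "X"
TWO <- factor(c("No", "yes", "?", "X", "No", "?", "X"))

# Remember the special codes before turning them into NA
special <- TWO %in% c("?", "X")
codes <- as.character(TWO[special])
is.na(TWO) <- special

# Percentages within the valid answers (No 67, yes 33)
valid_pct <- round(100 * prop.table(table(droplevels(TWO))))

# Percentages within the missing answers (? 50, X 50)
missing_pct <- round(100 * prop.table(table(codes)))

valid_pct
missing_pct
```

The two tables use different denominators: the first divides by the 
number of valid answers, the second by the number of missing ones.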

HTH,
James
-- 
James Reilly
Department of Statistics, University of Auckland
Private Bag 92019, Auckland, New Zealand