[R] Study Missing Data, differentiate between a percent within the valid answers and within the different missing answers
James Reilly
reilly at stat.auckland.ac.nz
Mon Mar 3 10:02:17 CET 2008
On 3/3/08 8:21 PM, Ericka Lundström wrote:
> I'm trying to migrate from SPSS to R, though I have some problems with
> getting R to distinguish between the different kinds of missing values.
...
> Is there a smart way in R to differentiate between missing and valid
> and at the same time treat both the categories within missing and
> valid as answers (like SPSS did above)?
The Hmisc package has some support for special missing values, for
instance when reading in SAS datasets using sas.get. I don't believe
spss.get offers the same facility, though.
You can define special missing values for a variable manually, which
might seem a bit involved, but this could easily be automated. For your
example, try:
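library(Hmisc)  # provides describe() and support for the special.miss class

# Flag the cases that hold one of the special missing codes
special <- dataFrame$TWO %in% c("?", "X")
# Record which code each flagged case had, and where it occurred
attr(dataFrame$TWO, "special.miss") <-
  list(codes = as.character(dataFrame$TWO[special]),
       obs   = which(special))
class(dataFrame$TWO) <- c("factor", "special.miss")
# Set the flagged cases to NA
is.na(dataFrame$TWO) <- special

# Then describe gives new percentages
describe(dataFrame$TWO)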
dataFrame$TWO
      n missing       ?       X  unique
      3       4       2       2       2

No (2, 67%), yes (1, 33%)
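
As for automating this, here is a minimal sketch of one way it might be wrapped in a function. The name mark_special_missing and the example vector are my own inventions for illustration (neither comes from Hmisc or from your data), and the function assumes the variable is a factor. It uses base R only, so the counts at the end are computed by hand rather than with describe.

```r
# Hypothetical helper (not part of Hmisc): flags the given codes as
# special missing values on a factor, following the manual steps above.
mark_special_missing <- function(x, codes) {
  stopifnot(is.factor(x))
  special <- x %in% codes                  # TRUE where a special code occurs
  attr(x, "special.miss") <- list(
    codes = as.character(x[special]),      # original code of each flagged case
    obs   = which(special)                 # positions of the flagged cases
  )
  class(x) <- c("factor", "special.miss")
  is.na(x) <- special                      # flagged cases become NA
  x
}

# Made-up data for illustration only
two <- factor(c("No", "yes", "?", "X", "No", "X", "?"))
two <- mark_special_missing(two, c("?", "X"))

sum(!is.na(two))                           # number of valid answers
table(attr(two, "special.miss")$codes)     # counts per special missing code
prop.table(table(droplevels(two)))         # proportions among valid answers only
```

To treat several variables the same way, something like dataFrame[vars] <- lapply(dataFrame[vars], mark_special_missing, codes = c("?", "X")) should work, where vars is a character vector of column names you would supply yourself.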
HTH,
James
--
James Reilly
Department of Statistics, University of Auckland
Private Bag 92019, Auckland, New Zealand