[R] better way of recoding factors in data frame?
mohinder_datta at yahoo.com
Thu Apr 9 15:48:57 CEST 2009
Hi all,
I apologize in advance for the length of this post, but I wanted to make sure I was clear.
I am trying to merge two data frames that share a number of rows (though some rows are unique to each data frame). Each row represents a subject in a study. The problem is that sex is coded differently in the two, including the way missing values are represented.
Here is an example of the merged data frame:
> myFrame2
   SubjCode SubjSex          Sex
1      sub1       M         <NA>
2      sub2       F         <NA>
3      sub3       M         Male
4      sub4       M         <NA>
5      sub5       F         <NA>
6      sub6       F       Female
7      sub7                 <NA>
8      sub8                 <NA>
9      sub9         Not Recorded
10    sub10         Not Recorded
I then apply the following:
> myFrame2$SubjSex <- factor(myFrame2$SubjSex, levels = c('M','F'))
> myFrame2$SubjSex <- factor(myFrame2$SubjSex, labels = c('Male','Female'))
> myFrame2 <- transform(myFrame2, newSex = ifelse(is.na(SubjSex), Sex, SubjSex))
...and get this:
> myFrame2
   SubjCode SubjSex          Sex newSex
1      sub1    Male         <NA>      1
2      sub2  Female         <NA>      2
3      sub3    Male         Male      1
4      sub4    Male         <NA>      1
5      sub5  Female         <NA>      2
6      sub6  Female       Female      2
7      sub7    <NA>         <NA>     NA
8      sub8    <NA>         <NA>     NA
9      sub9    <NA> Not Recorded      3
10    sub10    <NA> Not Recorded      3
I need that last column to have just 1 (Male), 2 (Female) or 0 (Missing), and the only way I've come up with seems very kludgy:
> myFrame2$newSex[is.na(myFrame2$newSex)] <- 0
> myFrame2$newSex <- ifelse(myFrame2$newSex == 3, 0, myFrame2$newSex)
That gives me the right values for "newSex", but I'd like to positively select for the values I want to keep, rather than negatively selecting the ones to change. I tried this:
> myFrame2$newSex <- ifelse(myFrame2$newSex ==1 || myFrame2$newSex == 2, myFrame2$newSex, 0)
But I just get 1 for every row in newSex. Does anyone know of a way to do this by positively selecting the values 1 and 2?
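One possible way to select positively is sketched below, under stated assumptions. The likely reason for the all-1 result is that `||` looks at a single value rather than comparing element by element, so the test collapses to one TRUE and the first result is recycled into every row (recent versions of R raise an error instead). The vectorized `%in%` treats NA as "not in the set", so it may be a closer fit. The vector `codes` copies the newSex values printed above, and the names `fixed`, `fromShort` and `label` are hypothetical. The second half rebuilds the data frame as character columns (an assumption: blank SubjSex entries are taken to be empty strings) and recodes through the labels instead of the factor codes.

```r
# newSex as printed above, before the two clean-up lines (copied by hand)
codes <- c(1, 2, 1, 1, 2, 2, NA, NA, 3, 3)

# Positive selection: keep 1 and 2, send everything else (NA, 3) to 0.
# %in% is vectorized and returns FALSE for NA.
fixed <- ifelse(codes %in% c(1, 2), codes, 0)

# Alternative sketch: recode from the labels, avoiding factor codes entirely.
# Assumed reconstruction of the merged data frame as character columns.
myFrame2 <- data.frame(
  SubjCode = paste0("sub", 1:10),
  SubjSex  = c("M", "F", "M", "M", "F", "F", "", "", "", ""),
  Sex      = c(NA, NA, "Male", NA, NA, "Female", NA, NA,
               "Not Recorded", "Not Recorded"),
  stringsAsFactors = FALSE
)

# Translate the short codes to long labels; anything else becomes NA
fromShort <- c("Male", "Female")[match(myFrame2$SubjSex, c("M", "F"))]

# Prefer SubjSex, fall back to Sex where SubjSex gave no label
label <- ifelse(is.na(fromShort), myFrame2$Sex, fromShort)

# Position in c("Male", "Female") gives 1 or 2; no match (incl. NA) gives 0
myFrame2$newSex <- match(label, c("Male", "Female"), nomatch = 0L)
```

Going through the labels rather than the integer codes also sidesteps a possible mismatch: Sex and SubjSex may number their levels differently when they are separate factors, so mixing their codes in one `ifelse()` could mislabel a row where only Sex is filled in.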
Thanks,
Mohinder