[R] replace NA's with row means for specific columns
Zahra
captiva24 at yahoo.com
Mon Nov 2 20:49:01 CET 2015
Hi there,
I am looking for some help replacing missing values in R with the row mean. This is survey data, and I am trying to impute the missing values in each set of questions separately, using the mean of the scores for the other questions within that set.
I have a dataset that looks like this:
ID  A1  A2  A3  B1  B2  B3  C1  C2  C3  C4
b    4   5  NA   2  NA   4   5   1   3  NA
c    4   5   1  NA   3   4   5   1   3   2
d   NA   5   1   1  NA   4   5   1   3   2
e    4   5   4   5  NA   4   5   1   3   2
I want to replace any NAs in columns A1:A3 with the row mean for those columns only. So for ID=b, I want the NA in A3[ID=b] to become (4+5)/2 = 4.5, which is the average of the values in A1 and A2 for that row.
Same thing for columns B1:B3: I want the NA in B2[ID=b] to become (2+4)/2 = 3, which is the mean of B1 and B3 in row ID=b. And the same in C1:C4: I want C4[ID=b] to become (5+1+3)/3 = 3, which is the mean of C1:C3.
Then I want to go to row ID=c and do the same thing and so on.
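To make the goal concrete, here is a minimal sketch of one way the replacement described above could be done in base R. It is only an illustration under some assumptions: the data frame name df, the helper impute_row_mean, and the list sets are hypothetical names, not anything from my real data. A row with every value in a set missing would end up as NaN rather than being filled.

```r
# Example data from the post, read into a data frame (df is a hypothetical name)
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
ID A1 A2 A3 B1 B2 B3 C1 C2 C3 C4
b 4 5 NA 2 NA 4 5 1 3 NA
c 4 5 1 NA 3 4 5 1 3 2
d NA 5 1 1 NA 4 5 1 3 2
e 4 5 4 5 NA 4 5 1 3 2
")

# Hypothetical helper: replace each NA in the given columns with the mean
# of the non-missing values in the same row, restricted to those columns.
impute_row_mean <- function(data, cols) {
  block <- as.matrix(data[cols])               # numeric matrix of one question set
  means <- rowMeans(block, na.rm = TRUE)       # per-row mean, ignoring NAs
  idx <- which(is.na(block), arr.ind = TRUE)   # (row, col) positions of the NAs
  block[idx] <- means[idx[, "row"]]            # fill each NA with its row's mean
  data[cols] <- block
  data
}

# Question sets, each imputed separately (assumed grouping taken from the column names)
sets <- list(
  A = c("A1", "A2", "A3"),
  B = c("B1", "B2", "B3"),
  C = c("C1", "C2", "C3", "C4")
)
for (cols in sets) {
  df <- impute_row_mean(df, cols)
}

print(df)
```

With the example rows, this should turn A3 for ID=b into 4.5, B2 for ID=b into 3, and C4 for ID=b into 3, and it handles the remaining rows in the same pass.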
Can anybody help me do this? I have tried using rowMeans and subsetting but can't figure out the right code to do it.
Thanks so much.
Zahra