[R] Reply: Fw: Re: Subscripting problem with is.na()
G.Maubach at weinwolf.de
Mon Jun 27 10:45:12 CEST 2016
Hi David,
Hi Bert,
many thanks for the valuable discussion on NA in R (please see the extract
below). I follow your argument for leaving NA values as they are most of the
time. On special occasions, however, I want to replace NA with another
value. To preserve this newly acquired knowledge, I wrote the following
function:
-- cut --
t_replace_na <- function(dataset, variable, value) {
  if(inherits(dataset[[variable]], "factor") == TRUE) {
    dataset[variable] <- as.character(dataset[variable])
    print(class(dataset[variable]))
    dataset[, variable][is.na(dataset[, variable])] <- value
    dataset[variable] <- as.factor(dataset[variable])
    print(class(dataset[variable]))
  } else {
    dataset[, variable][is.na(dataset[, variable])] <- value
  }
  return(dataset)
}
ds_test <- data.frame(a=c(1,NA,2), b = rep(NA,3), c = c("A","b",NA))
print(sapply(ds_test, class))
t_replace_na(ds_test, "a", value = -1)
t_replace_na(ds_test, "b", value = -2)
t_replace_na(ds_test, "c", value = -3)
-- cut --
Unfortunately, the if statement does not work as intended because the class
reported inside the function is wrong. To find out what is going on, I tried
this:
-- cut --
test_class <- function(dataset, variable) {
  if(inherits(dataset[, variable], "factor") == TRUE) {
    return(c(class(dataset[variable]), TRUE))
  } else {
    return(c(class(dataset[variable]), FALSE))
  }
}
ds_test <- data.frame(a=c(1,NA,2), b = rep(NA,3), c = c("A","b",NA))
print(sapply(ds_test, class))
# -- Test a --
class(ds_test[, "a"])
if(inherits(ds_test[, "a"], "factor")) {
  print(c(class(ds_test[, "a"]), "TRUE"))
} else {
  print(c(class(ds_test[, "a"]), "FALSE"))
}
test_class(ds_test, "a")
warning("'a' should be numeric NOT data.frame!")
# -- Test b --
if(inherits(ds_test[, "b"], "factor")) {
  print(c(class(ds_test[, "b"]), "TRUE"))
} else {
  print(c(class(ds_test[, "b"]), "FALSE"))
}
class(ds_test[, "b"])
test_class(ds_test, "b")
warning("'b' should be logical NOT data.frame!")
# -- Test c --
if(inherits(ds_test[, "c"], "factor")) {
  print(c(class(ds_test[, "c"]), "TRUE"))
} else {
  print(c(class(ds_test[, "c"]), "FALSE"))
}
class(ds_test[, "c"])
test_class(ds_test, "c")
warning("'c' should be factor NOT data.frame.
In addition data.frame != factor")
-- cut --
Why does class() report different results for the same column depending on
whether I call it inside or outside my own function?
Kind regards
Georg
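
A note on the question above: the following is a minimal sketch, assuming the
difference comes from the subsetting operator rather than from being inside a
function. test_class() reports class(dataset[variable]), whereas the code
outside the function calls class(ds_test[, "a"]). The data frame ds_demo is
made up for illustration, and stringsAsFactors = TRUE is set explicitly
because R 4.0.0 and later no longer convert strings to factors by default.

```r
# Hypothetical demo data mirroring ds_test from the post.
# stringsAsFactors = TRUE reproduces the pre-4.0.0 default, so "c" is a factor.
ds_demo <- data.frame(a = c(1, NA, 2), b = rep(NA, 3), c = c("A", "b", NA),
                      stringsAsFactors = TRUE)

# List-style single brackets keep the data.frame wrapper around the column.
cls_single <- class(ds_demo["c"])

# Matrix-style indexing of a single column drops to the column vector
# (drop = TRUE is the default when exactly one column is selected).
cls_matrix <- class(ds_demo[, "c"])

# Double brackets always extract the column itself.
cls_double <- class(ds_demo[["c"]])

print(c(single = cls_single, matrix = cls_matrix, double = cls_double))

# as.character() on a one-column data.frame does not convert element-wise:
# it returns a single deparsed string for the whole column.
chr_len <- length(as.character(ds_demo["c"]))
print(chr_len)
```

If this reading is right, using dataset[[variable]] consistently for both the
class check and the conversion should give the same answer inside and outside
a function.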
--------------------------------
> Sent: Thursday, 23 June 2016 at 21:14
> From: "David L Carlson" <dcarlson at tamu.edu>
> To: "Bert Gunter" <bgunter.4567 at gmail.com>
> Cc: "R Help" <r-help at r-project.org>
> Subject: Re: [R] Subscripting problem with is.na()
>
> Good point. I did not think about factors. Also, your example raises
> another issue since column c is logical, but gets silently converted to
> numeric. This would seem to get the job done assuming the conversion is
> intended for numeric columns only:
>
> > test <- data.frame(a=c(1,NA,2), b = c("A","b",NA), c= rep(NA,3))
> > sapply(test, class)
>         a         b         c
> "numeric"  "factor" "logical"
> > num <- sapply(test, is.numeric)
> > test[, num][is.na(test[, num])] <- 0
> > test
>   a    b  c
> 1 1    A NA
> 2 0    b NA
> 3 2 <NA> NA
>
> David C
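
Combining David's point about factors with the t_replace_na() function above,
a factor-aware replacement could be sketched along these lines. The name
replace_na_sketch and the demo data are hypothetical, and rebuilding the
factor with factor() is only one possible choice: it re-sorts the levels and
drops unused ones, which may or may not be wanted.

```r
# Sketch only: replace NA in one column, handling factor columns separately.
replace_na_sketch <- function(dataset, variable, value) {
  column <- dataset[[variable]]          # extract the column itself
  if (is.factor(column)) {
    # Go through character so the replacement becomes a regular level.
    column <- as.character(column)
    column[is.na(column)] <- as.character(value)
    column <- factor(column)
  } else {
    # Note: a logical column is silently coerced to the type of `value`.
    column[is.na(column)] <- value
  }
  dataset[[variable]] <- column
  dataset
}

# Hypothetical demo data; stringsAsFactors = TRUE makes "c" a factor
# regardless of the R version.
ds_demo <- data.frame(a = c(1, NA, 2), b = rep(NA, 3), c = c("A", "b", NA),
                      stringsAsFactors = TRUE)

out_a <- replace_na_sketch(ds_demo, "a", value = -1)
out_b <- replace_na_sketch(ds_demo, "b", value = -2)
out_c <- replace_na_sketch(ds_demo, "c", value = -3)
print(sapply(out_c, class))
```

The function returns a modified copy, so ds_demo itself still contains its NA
values after these calls.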