[R] paste with apply, spaces and NA

Sarah Goslee sarah.goslee at gmail.com
Fri May 8 00:00:03 CEST 2009


Hello everyone,

I've run into a problem using paste() inside apply() that I can't
seem to solve.
Briefly, if I use paste() to collapse the rows of a data frame, AND
the data frame contains strings with spaces, AND there are NA values
in subsequent columns, then paste() introduces extra spaces into the
result. This only happens with that particular combination of data
values and commands. I have a workaround (replacing NA with "NA"),
but this seems odd.

Thanks for any thoughts,
Sarah


R --vanilla
# R version 2.9.0 (2009-04-17)
# Fedora Core 10

> test1 <- data.frame(A = rep(1, 5), B = rep("a", 5), C = rep("a b", 5), D = rep(2, 5), stringsAsFactors=FALSE)
>
> # has an NA value in a column before the column containing strings with spaces
> test2 <- test1
> test2$B[4] <- NA
>
> # has an NA value in a column after the column containing strings with spaces
> test3 <- test1
> test3$D[4] <- NA

> str(test1)
'data.frame':	5 obs. of  4 variables:
 $ A: num  1 1 1 1 1
 $ B: chr  "a" "a" "a" "a" ...
 $ C: chr  "a b" "a b" "a b" "a b" ...
 $ D: num  2 2 2 2 2
> str(test2)
'data.frame':	5 obs. of  4 variables:
 $ A: num  1 1 1 1 1
 $ B: chr  "a" "a" "a" NA ...
 $ C: chr  "a b" "a b" "a b" "a b" ...
 $ D: num  2 2 2 2 2
> str(test3)
'data.frame':	5 obs. of  4 variables:
 $ A: num  1 1 1 1 1
 $ B: chr  "a" "a" "a" "a" ...
 $ C: chr  "a b" "a b" "a b" "a b" ...
 $ D: num  2 2 2 NA 2

> # works as expected
> apply(test1, 1, paste, collapse=",")
[1] "1,a,a b,2" "1,a,a b,2" "1,a,a b,2" "1,a,a b,2" "1,a,a b,2"

> # works as expected
> # does NOT add spaces to the column with the NA value
> apply(test2, 1, paste, collapse=",")
[1] "1,a,a b,2"  "1,a,a b,2"  "1,a,a b,2"  "1,NA,a b,2" "1,a,a b,2"

> # introduces spaces in the column with the NA value
> # only if that column is after a column that contains strings with spaces
> apply(test3, 1, paste, collapse=",")
[1] "1,a,a b, 2" "1,a,a b, 2" "1,a,a b, 2" "1,a,a b,NA" "1,a,a b, 2"

> # pasting the columns together manually works as expected
> paste(test3$A, test3$B, test3$C, test3$D, sep=",")
[1] "1,a,a b,2"  "1,a,a b,2"  "1,a,a b,2"  "1,a,a b,NA" "1,a,a b,2"

> # pasting a single row works as expected
> paste(test3[3,], collapse=",")
[1] "1,a,a b,2"
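
One likely explanation for the difference between the apply() result and
the manual paste() calls above, offered as a sketch rather than a confirmed
diagnosis of this session: apply() first converts the data frame with
as.matrix(), and for a data frame with mixed column types as.matrix() runs
format() on the numeric columns. format() pads every value to a common
width, and "NA" is two characters wide, so the 2s in column D become " 2".
On this reading the padding depends on D being a numeric column that
contains an NA, not on the spaces in column C. The variable names
d_in_matrix, d_formatted and a_in_matrix are introduced here only for
illustration.

```r
# Rebuild test3 from the post so this block runs on its own.
test3 <- data.frame(A = rep(1, 5), B = rep("a", 5), C = rep("a b", 5),
                    D = rep(2, 5), stringsAsFactors = FALSE)
test3$D[4] <- NA

# apply() coerces a data frame with as.matrix() before looping over rows.
m <- as.matrix(test3)
d_in_matrix <- unname(m[, "D"])
print(d_in_matrix)
# [1] " 2" " 2" " 2" NA   " 2"

# For a mixed-type data frame, as.matrix() runs format() on numeric columns.
# format() pads to a common width, and "NA" is two characters wide.
d_formatted <- format(test3$D)
print(d_formatted)
# [1] " 2" " 2" " 2" "NA" " 2"

# Column A has no NA, so its common width is 1 and nothing is padded.
a_in_matrix <- unname(m[, "A"])
print(a_in_matrix)
# [1] "1" "1" "1" "1" "1"
```

The calls paste(test3$A, ...) and paste(test3[3,], ...) never go through
as.matrix(), which would be consistent with them returning unpadded values.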

## workaround
> test3[is.na(test3)] <- "NA"
> apply(test3, 1, paste, sep="", collapse=",")
[1] "1,a,a b,2"  "1,a,a b,2"  "1,a,a b,2"  "1,a,a b,NA" "1,a,a b,2"
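
Another possible workaround, as a minimal sketch: skip apply() altogether
and hand the columns to paste() directly with do.call(), which is the
programmatic form of the manual paste(test3$A, test3$B, ...) call shown
earlier. This assumes no column is named "sep" or "collapse", since those
names would be taken as arguments to paste(). The name rows is introduced
here only for illustration.

```r
# Rebuild test3 from the post (with the real NA, not the string "NA").
test3 <- data.frame(A = rep(1, 5), B = rep("a", 5), C = rep("a b", 5),
                    D = rep(2, 5), stringsAsFactors = FALSE)
test3$D[4] <- NA

# Pass each column to paste() as a separate argument, so no matrix
# conversion (and no format() padding) takes place.
rows <- do.call(paste, c(test3, sep = ","))
print(rows)
# [1] "1,a,a b,2"  "1,a,a b,2"  "1,a,a b,2"  "1,a,a b,NA" "1,a,a b,2"
```

Unlike the replacement of NA with "NA", this leaves the data frame itself
unchanged.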



-- 
Sarah Goslee
http://www.functionaldiversity.org



