[R] MApply and SubStr

David Winsemius dwinsemius at comcast.net
Wed Feb 11 02:03:24 CET 2015


On Feb 10, 2015, at 3:58 PM, Brian Trautman wrote:

> Hi!
> 
> I'm trying to write a custom function that applies SubStr to a string, and
> then depending on the arguments, converts the output to a number.
> 
> The substring part of my code works fine, but it's not converting the way I
> want to --
> 
> options('stringsAsFactors'=FALSE)
> require(data.table)
> 
> substr_typeswitch <- function(x, start, stop, typeto='chr')
> {
>  tmpvar <- substr(x=x, start=start, stop=stop)
>  tmpvar <- switch(typeto, num=as.numeric(tmpvar), tmpvar)
>  return(tmpvar)
> }
>  startpos <- c(01, 03)
>  endpos <-   c(02, 04)
>  typelist <- c('chr', 'num')
> 
>  startdata <- as.data.table(c('aa01', 'bb02'))
> 
>  enddata_want <- as.data.table(mapply(substr_typeswitch, startdata,
> startpos, endpos, typelist))
> 
> If I examine enddata_want --
> 
>> str(enddata_want)
> Classes ‘data.table’ and 'data.frame': 2 obs. of  2 variables:
> $ V1: chr  "aa" "bb"
> $ NA: chr  "1" "2"
> - attr(*, ".internal.selfref")=<externalptr>
> 
> "1" and "2" are being stored as character, and not as number.

It appears from your code that you might be expecting a vector in a dataframe object to have a character mode in the first position and a numeric mode in the second position. That wouldn't be a reasonable expectation, since all elements of a single vector share one mode. But maybe you were hoping the chr and num types would be applied to columns. I was surprised to get something different from as.data.table:

> str(enddata_want)
Classes ‘data.table’ and 'data.frame':	2 obs. of  2 variables:
 $ V1: Factor w/ 2 levels "aa","bb": 1 2
 $ NA: Factor w/ 2 levels "1","2": 1 2
 - attr(*, ".internal.selfref")=<externalptr> 

The mapply operation made a matrix, which forces all values to be the same mode:

> str( mapply(substr_typeswitch, startdata,
+  startpos, endpos, typelist) )
 chr [1:2, 1:2] "aa" "bb" "1" "2"
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:2] "V1" NA
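
The coercion has nothing to do with substr or switch; it happens whenever character and numeric vectors are bound into one matrix. A minimal sketch in base R, with made-up values that mirror the ones above:

```r
# A matrix can hold only one mode, so binding a character column
# to a numeric column coerces the numbers to character.
chr_col <- c("aa", "bb")
num_col <- c(1, 2)

m <- cbind(chr_col, num_col)

typeof(m)        # the whole matrix is character
m[, "num_col"]   # the numbers are now the strings "1" and "2"
```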

You might have gotten something less homogeneous if you had added SIMPLIFY=FALSE, which makes mapply return a list instead of collapsing the results into a matrix:

> str( mapply(substr_typeswitch, startdata,
+  startpos, endpos, typelist, SIMPLIFY=FALSE) )
List of 2
 $ V1: chr [1:2] "aa" "bb"
 $ NA: num [1:2] 1 2
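
Once the result is a list, each element can become a column with its own type. The following is only a sketch of one way to do that in base R: the column names "code" and "value" are hypothetical, and MoreArgs is my substitution for passing the data as a vectorized argument, so that the whole input vector is handed to every call.

```r
substr_typeswitch <- function(x, start, stop, typeto = "chr") {
  tmpvar <- substr(x, start, stop)
  # Convert only when typeto is "num"; otherwise return the substring as is.
  switch(typeto, num = as.numeric(tmpvar), tmpvar)
}

startdata <- c("aa01", "bb02")
startpos  <- c(1, 3)
endpos    <- c(2, 4)
typelist  <- c("chr", "num")

# One call per (start, stop, typeto) triple; x stays fixed via MoreArgs.
# SIMPLIFY = FALSE keeps the results as a list so the modes are not merged.
pieces <- mapply(substr_typeswitch,
                 start = startpos, stop = endpos, typeto = typelist,
                 MoreArgs = list(x = startdata),
                 SIMPLIFY = FALSE)

names(pieces) <- c("code", "value")   # hypothetical column names

# Each list element becomes a column and keeps its own mode.
enddata <- as.data.frame(pieces, stringsAsFactors = FALSE)
str(enddata)
```

The same named list can presumably be handed to as.data.table instead of as.data.frame if a data.table is what you want in the end.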




> 
> Can anyone help me understand what I'm doing wrong?
> 
> Thank you!
> 

David Winsemius
Alameda, CA, USA


