[R] how to split row elements [1] and [2] of a string variable A via strsplit and sapply

jim holtman jholtman at gmail.com
Thu Sep 10 20:05:01 CEST 2015


Try this:


> x <- read.table(text = "A          B
+  1:29439275 0.46773514
+  5:85928892 0.81283052
+  10:128341232 0.09332543
+  1:106024283:ID 0.36307805
+  3:62707519 0.42657952
+  2:80464120 0.89125094", header = TRUE, as.is = TRUE)
>
> temp <- strsplit(x$A, ":")
> x$C <- sapply(temp, '[[', 1)
> x$D <- sapply(temp, '[[', 2)
>
> x
               A          B  C         D
1     1:29439275 0.46773514  1  29439275
2     5:85928892 0.81283052  5  85928892
3   10:128341232 0.09332543 10 128341232
4 1:106024283:ID 0.36307805  1 106024283
5     3:62707519 0.42657952  3  62707519
6     2:80464120 0.89125094  2  80464120
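
The transcript above uses '[[' with index 2, which assumes every value of A
contains at least one ":". Here is a minimal sketch of a more forgiving
variant, assuming some rows might lack the separator: single-bracket "["
returns NA for a missing piece, whereas '[[' would stop with a "subscript out
of bounds" error. The vector A below is made-up illustration data, not the
poster's file, and the as.numeric() conversion is an optional extra.

```r
# Hypothetical input: the last value has no ":" separator at all
A <- c("1:29439275", "1:106024283:ID", "7")

# fixed = TRUE treats ":" as a literal string rather than a regular expression
parts <- strsplit(A, ":", fixed = TRUE)

# "[" yields NA when the requested piece does not exist
C <- as.numeric(sapply(parts, "[", 1))
D <- as.numeric(sapply(parts, "[", 2))

print(data.frame(A, C, D))
```

A third piece such as "ID" is simply never selected, so it does not need
special handling.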




Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Thu, Sep 10, 2015 at 1:46 PM, aldi <aldi at wustl.edu> wrote:

> Hi,
> I have a data.frame x1 in which a variable A needs to be split into
> element 1 and element 2, where the separator is ":". Sometimes there are
> three elements in A, but I do not need the third element.
>
> R does not have a SCAN function as in SAS (C=scan(A,1,":");
> D=scan(A,2,":");), so
> I am using a combination of strsplit and sapply. If I do not use the
> index [i], then R captures the full vector. Instead I need to capture the
> first and the second element row by row and create two new
> variables C and D from them.
> Right now, inside the loop over i, C is captured correctly, but D is
> missing because the variable AA does not contain it. Any suggestions?
> Thank you in advance, Aldi
>
> A          B
> 1:29439275 0.46773514
> 5:85928892 0.81283052
> 10:128341232 0.09332543
> 1:106024283:ID 0.36307805
> 3:62707519 0.42657952
> 2:80464120 0.89125094
>
> x1<-read.table(file='./test.txt',head=T,sep='\t')
> x1$A <- as.character(x1$A)
>
> for(i in 1:length(x1$A)){
>
> x1$AA[i] <- as.numeric(unlist(strsplit(x1$A[i],':')))
>
> x1$C[i] <- sapply(x1$AA[i],function(x)x[1])
> x1$D[i] <- sapply(x1$AA[i],function(x)x[2])
> }
>
> x1
>
>
>
>  > x1
>                 A          B AA  C  D
> 1     1:29439275 0.46773514  1  1 NA
> 2     5:85928892 0.81283052  5  5 NA
> 3   10:128341232 0.09332543 10 10 NA
> 4 1:106024283:ID 0.36307805  1  1 NA
> 5     3:62707519 0.42657952  3  3 NA
> 6     2:80464120 0.89125094  2  2 NA
>
>
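
A likely reason for the NA column in the quoted output, sketched under the
assumption that a plain numeric vector behaves like the data.frame column
here: assigning a length-2 result to the single slot AA[i] keeps only the
first piece (R warns that the number of items to replace is not a multiple of
the replacement length), so there is no second element left to extract. The
names A, AA, pieces and D below are illustrative only.

```r
# Illustrative data, not the poster's file
A <- c("1:29439275", "5:85928892")

# Reproduce the loop's behaviour: one slot per row, so only piece 1 survives
AA <- numeric(length(A))
for (i in seq_along(A)) {
  # suppressWarnings() hides the "not a multiple of replacement length" warning
  suppressWarnings(AA[i] <- as.numeric(unlist(strsplit(A[i], ":"))))
}
print(AA)        # only the first pieces were stored
print(AA[1][2])  # NA: a length-1 value has no second element

# Keeping the strsplit() result as a list preserves every piece per row
pieces <- strsplit(A, ":")
D <- vapply(pieces, function(p) p[2], character(1))
print(D)
```

Splitting the whole column once and indexing into the resulting list also
avoids the explicit loop.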
