[R] Adding complex new columns to data frame depending on existing column

arun smartpink111 at yahoo.com
Mon Feb 4 03:22:37 CET 2013



Hi,

Maybe this helps:
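df1 <- read.table(text="
V1   V2     V3     V4    V5 V6
chr1 18884  C      CAAAA 2  0
chr1 135419 TATACA T     2  0
chr1 332045 T      TTG   0  2
chr1 453838 T      TAC   2  0
chr1 567652 T      TG    1  0
chr1 602541 TTTA   T     2  0
", header=TRUE, sep="", stringsAsFactors=FALSE)

# V3 longer than 1 character: newCol1 = V2 - 1, newCol2 = V2 + nchar(V3) + 1.
# V3 of length 1 and V4 longer than 1: newCol1 = V2, newCol2 = V2 + 2.
# Any other row gets NA in both new columns.
df1$newCol1 <- ifelse(nchar(df1$V3) > 1, df1$V2 - 1,
               ifelse(nchar(df1$V3) == 1 & nchar(df1$V4) > 1, df1$V2, NA))
df1$newCol2 <- ifelse(nchar(df1$V3) > 1, df1$V2 + nchar(df1$V3) + 1,
               ifelse(nchar(df1$V3) == 1 & nchar(df1$V4) > 1, df1$V2 + 2, NA))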


df1
#    V1     V2     V3    V4 V5 V6 newCol1 newCol2
#1 chr1  18884      C CAAAA  2  0   18884   18886
#2 chr1 135419 TATACA     T  2  0  135418  135426
#3 chr1 332045      T   TTG  0  2  332045  332047
#4 chr1 453838      T   TAC  2  0  453838  453840
#5 chr1 567652      T    TG  1  0  567652  567654
#6 chr1 602541   TTTA     T  2  0  602540  602546
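
On the two open questions (getting the string length as a number, and whether apply is needed): nchar() is vectorised, so a single call returns the length of every string in a column and no apply() loop is required. The following is only a sketch of an equivalent formulation that uses logical indexing instead of nested ifelse(); the names len3, len4, multi and single are illustrative and do not come from the original post.

```r
# Same example data as above, rebuilt so this block runs on its own.
df1 <- data.frame(
  V1 = "chr1",
  V2 = c(18884, 135419, 332045, 453838, 567652, 602541),
  V3 = c("C", "TATACA", "T", "T", "T", "TTTA"),
  V4 = c("CAAAA", "T", "TTG", "TAC", "TG", "T"),
  V5 = c(2, 2, 0, 2, 1, 2),
  V6 = c(0, 0, 2, 0, 0, 0),
  stringsAsFactors = FALSE
)

# nchar() is vectorised: one call gives the length of every string.
# (len3, len4, multi, single are illustrative names.)
len3 <- nchar(df1$V3)
len4 <- nchar(df1$V4)

# Logical masks for the two cases described in the question.
multi  <- len3 > 1               # V3 longer than one character
single <- len3 == 1 & len4 > 1   # V3 one character, V4 longer

# Start with NA so rows matching neither case stay missing.
df1$newCol1 <- NA_real_
df1$newCol2 <- NA_real_

# Case 1: V3 longer than one character.
df1$newCol1[multi] <- df1$V2[multi] - 1
df1$newCol2[multi] <- df1$V2[multi] + len3[multi] + 1

# Case 2: V3 of length 1 and V4 longer than one character.
df1$newCol1[single] <- df1$V2[single]
df1$newCol2[single] <- df1$V2[single] + 2
```

Computing the lengths once and reusing them avoids repeating nchar() in every condition, which may be easier to read if more cases are added later.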
A.K.
----- Original Message -----
From: Tom Oates <toates19 at gmail.com>
To: r-help at r-project.org
Cc: 
Sent: Sunday, February 3, 2013 12:20 PM
Subject: [R] Adding complex new columns to data frame depending on existing column

Hello

I have a data frame as below
V1   V2     V3     V4    V5 V6
chr1 18884  C      CAAAA 2  0
chr1 135419 TATACA T     2  0
chr1 332045 T      TTG   0  2
chr1 453838 T      TAC   2  0
chr1 567652 T      TG    1  0
chr1 602541 TTTA   T     2  0

on which I want to perform complex rearrangement such that:

if V3 is a string of length >1 (e.g. line 2) then I generate 2 new columns where
first new column = V2-1 & second new column = V2+(length of string in V3)+1

therefore, for line 2 the output would look like:
chr1 135419 TATACA T 2 0 135418 135426

if the length of the string in V3 = 1 and V4 is a string of length >1 (e.g. line 1) then
first new column = V2 & second new column = V2+2

the output for line 1 would be:
chr1 18884 C CAAAA 2 0 18884 18886

I am not sure:
a) how to get R to replace the string in V3 with the number representing
its length
b) whether apply would be the best function to use here
Thanks
