[R] For help in R coding
Gabor Grothendieck
ggrothendieck at gmail.com
Sat Jul 2 01:11:32 CEST 2011
On Fri, Jul 1, 2011 at 12:47 PM, Bansal, Vikas <vikas.bansal at kcl.ac.uk> wrote:
> Dear all,
>
> I am doing a project on variant calling using R. I am working on a pileup file. There are 10 columns in my data frame, and I want to count the number of A, C, G and T in each row of column 9. An example of column 9 is given below:
>
> .a,g,,
> .t,t,,
> .,c,c,
> .,a,,,
> .,t,t,t
> .c,,g,^!.
> .g,ggg.^!,
> .$,,,,,.,
> a,g,,t,
> ,,,,,.,^!.
> ,$,,,,.,.
>
> This is a bit confusing for me, as these characters are all in one column. How can we scan each row and print the number of A, C, G and T it contains?
> Most of the rows have . and , and other symbols, but we will ignore them. I just want to run a loop with a counter that counts the number of A, C, G and T in each row and gives output something like this:
>
>
> A C G T
> 1 0 1 0
> 0 0 0 2
> 0 2 0 0
> 1 0 0 0
> 0 0 0 3
>
> This output is for the first 5 rows of the example given above.
>
Read the lines into L. Then, for each of a, c, g and t, delete every
character other than that letter and count the characters that remain:
Lines <- ".a,g,,
.t,t,,
.,c,c,
.,a,,,
.,t,t,t
.c,,g,^!.
.g,ggg.^!,
.$,,,,,.,
a,g,,t,
,,,,,.,^!.
,$,,,,.,."
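# Split the text into one string per row.
L <- readLines(textConnection(Lines))
# For each base, delete every other character; the length of what is left is the count.
data.frame(a = nchar(gsub("[^a]", "", L)),
           c = nchar(gsub("[^c]", "", L)),
           g = nchar(gsub("[^g]", "", L)),
           t = nchar(gsub("[^t]", "", L))
)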
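
The same idea can be wrapped in a small function and applied directly to a
data frame column. This is only a sketch under some assumptions: the names
count_bases, pileup and V9 are made up for illustration, and upper- and
lowercase letters are counted together (pileup read-base strings may contain
both), which the example above does not do. Every a, c, g or t in the string
is counted, whatever its role in the pileup encoding.

```r
# Hypothetical helper (not part of the original reply): count each base in
# every element of a character vector, ignoring case.
count_bases <- function(x, bases = c("A", "C", "G", "T")) {
  x <- toupper(as.character(x))  # assumption: "a" and "A" count as the same base
  counts <- lapply(bases, function(b) {
    # Delete everything that is not base b; the remaining length is the count.
    nchar(gsub(paste0("[^", b, "]"), "", x))
  })
  names(counts) <- bases
  as.data.frame(counts)
}

# Stand-in for the poster's data frame; V9 plays the role of column 9.
pileup <- data.frame(V9 = c(".a,g,,", ".t,t,,", ".,c,c,", ".,a,,,", ".,t,t,t"),
                     stringsAsFactors = FALSE)

res <- count_bases(pileup$V9)
print(res)
```

For the first five example rows this gives the counts asked for in the
question (1 0 1 0, 0 0 0 2, 0 2 0 0, 1 0 0 0, 0 0 0 3). With real data the
column would be passed by position instead, e.g. count_bases(DF[[9]]) for a
data frame DF.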
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com