[BioC] what is the best way to get scores for matches from matchPWM()?
    Lucas Carey 
    lucas.carey at gmail.com
       
    Wed Jan 20 17:40:24 CET 2010
    
    
  
Hi All,
I'm wondering what the best way is to get the score for every match
returned by matchPWM() in Biostrings.
Right now, to score all matches to a PWM across a genome, I do this:
library(Biostrings)  # matchPWM(), reverseComplement(), PWMscoreStartingAt()
library(seqinr)      # s2c(), comp(), c2s()

# Find PWM hits for the forward and reverse-complement PWM on every chromosome
mmf <- sapply(1:Nchr, function(chr) {
  matchPWM(pwm, genome[[chr]], min.score=cutoff)
})
mmr <- sapply(1:Nchr, function(chr) {
  matchPWM(reverseComplement(pwm), genome[[chr]], min.score=cutoff)
})
mmm <- c(mmf, mmr)

# Extract the matched sequences, reverse-complementing the reverse-strand hits
Sequences <- c(
  rapply(mmf, as.character, how='unlist'),
  sapply(rapply(mmr, as.character, how='unlist'),
         function(x) { c2s(rev(comp(s2c(x)))) })
)

# Convert to a DNAStringSet in order to score each sequence. This is quite slow.
lcl_set <- DNAStringSet(as.character(Sequences))
Scores  <- sapply(lcl_set, PWMscoreStartingAt, pwm=pwm)
This is incredibly inefficient. What is the best way to do this?
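One direction that might avoid the round trip through character strings is to score each hit in place on the chromosome. The sketch below assumes that PWMscoreStartingAt() accepts a vector of start positions in its starting.at argument. The toy pwm, subject, and cutoff values and the helper name score_hits are made up for illustration. Reverse-strand hits are scored with the reverse-complemented PWM, which should give the same value as scoring the reverse-complemented sequence with the original PWM.

```r
library(Biostrings)

# Hypothetical toy inputs: a 4 x 3 PWM whose best match is "ACG".
pwm <- matrix(c(1, 0, 0, 0,
                0, 1, 0, 0,
                0, 0, 1, 0),
              nrow = 4,
              dimnames = list(c("A", "C", "G", "T"), NULL))
subject <- DNAString("ACGTTACGCGTAA")
cutoff <- 3

# Hypothetical helper: find hits for one PWM on one sequence and score them
# at their start positions, without extracting the matched substrings.
score_hits <- function(pwm, subject, cutoff, strand) {
  hits <- matchPWM(pwm, subject, min.score = cutoff)
  data.frame(
    start  = start(hits),
    strand = rep(strand, length(hits)),
    score  = PWMscoreStartingAt(pwm, subject, starting.at = start(hits))
  )
}

fwd <- score_hits(pwm, subject, cutoff, "+")
rvs <- score_hits(reverseComplement(pwm), subject, cutoff, "-")
all_hits <- rbind(fwd, rvs)
```

On a real genome the same helper would presumably be called once per chromosome and strand, for example inside lapply(seq_len(Nchr), ...) with genome[[chr]] as the subject.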
thanks
-Lucas
    
    