[R] Mismatch distribution
Boris Steipe
bor|@@@te|pe @end|ng |rom utoronto@c@
Tue Jan 22 03:52:09 CET 2019
Myriam -
This is the right list in principle: all the packages you use are CRAN packages, not Bioconductor.
However, I am at a loss as to how you wrote your code: both pegas and seqinr have "read.<something>()" functions, but neither has read.dna(); similarly, both pegas and seqinr have "dist.<something>()" functions, but neither has dist.gene(). Did you just extrapolate those function names and parameters from other function calls?
In any case: please start from a minimal, reproducible example that comes close to what you are trying to achieve, then post again. Here are the three URLs we usually recommend to get things started. Use a small number of small example files, don't nest your expressions until you are sure they produce what you think they do, and take it step by step.
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
http://adv-r.had.co.nz/Reproducibility.html
https://cran.r-project.org/web/packages/reprex/index.html (read the vignette)
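
To illustrate the step-by-step approach, here is a minimal sketch of what such an example might look like. It is not a fix of the posted script. It assumes that read.dna() and dist.gene() were meant to be the functions of those names in the ape package (which pegas attaches), and it swaps in ape's dist.dna() with model = "N" to count differing sites. The two small alignments are made up and stand in for example files, and "mean for each difference" is read here as the mean count per difference class across files, which is only one possible interpretation.

```r
library(ape)  # ASSUMPTION: read.dna() and dist.gene() in the quoted script are ape functions

# Two tiny, made-up alignments standing in for two small FASTA files.
# With real files, each element would come from read.dna(f, format = "fasta").
aln1 <- rbind(s1 = c("a", "c", "g", "t", "a", "c"),
              s2 = c("a", "c", "g", "t", "a", "t"),
              s3 = c("a", "t", "g", "t", "a", "t"))
aln2 <- rbind(s1 = c("g", "g", "c", "a", "t", "t"),
              s2 = c("g", "g", "c", "a", "t", "t"),
              s3 = c("g", "a", "c", "a", "a", "t"))
alignments <- list(file1 = as.DNAbin(aln1), file2 = as.DNAbin(aln2))

# Step 1: one alignment, one distance object; inspect it before going on.
d1 <- dist.dna(alignments$file1, model = "N")  # "N" = number of differing sites
print(d1)

# Step 2: the same for every alignment, kept as a list of plain numeric vectors.
per_file <- lapply(alignments, function(x) as.numeric(dist.dna(x, model = "N")))
str(per_file)

# Step 3: pool all pairwise differences and plot the mismatch distribution.
pooled <- unlist(per_file, use.names = FALSE)
max_d  <- max(pooled)
hist(pooled, breaks = seq(-0.5, max_d + 0.5, by = 1), freq = FALSE,
     xlab = "Pairwise differences", main = "Mismatch distribution (pooled)")
lines(density(pooled), col = "blue", lwd = 2)

# Step 4: count each difference class per file, then average the counts across files.
counts <- vapply(per_file,
                 function(d) tabulate(d + 1, nbins = max_d + 1),
                 integer(max_d + 1))
rownames(counts) <- 0:max_d
mean_counts <- rowMeans(counts)
barplot(mean_counts, xlab = "Pairwise differences",
        ylab = "Mean count per file", main = "Mismatch distribution (mean)")
```

Each intermediate object (d1, per_file, pooled, counts) can be printed and checked on its own before the next step is added, which is the point of not nesting the calls.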
Cheers,
B
-
> On 2019-01-21, at 21:08, Bert Gunter <bgunter.4567 using gmail.com> wrote:
>
> "Do not work" does not work (in providing sufficient info). See the Posting
> guide linked below for how to post an intelligible question.
>
> HOWEVER, I suspect you would do better posting on the Bioconductor list
> where they are much more likely to know what "fasta" files look like and
> might even have software already developed to do what you want. You could
> well be trying to reinvent wheels.
>
> Cheers,
> Bert
>
>
> On Mon, Jan 21, 2019 at 5:35 PM Myriam Croze <myriam.croze07 using gmail.com>
> wrote:
>
>> Hello!
>>
>> I need your help. I am trying to calculate the pairwise differences between
>> sequences from several fasta files.
>> For each of my DNA alignments (fasta files), I would like to calculate the
>> pairwise differences and then:
>> - 1. Combine the data from all the files into one data set and one histogram
>> (mismatch distribution)
>> - 2. Calculate the mean for each difference across all the files and again make
>> a mismatch distribution plot
>>
>> Here is the script that I wrote:
>>
>> library("pegas")
>> library("seqinr")
>> library("ggplot2")
>>
>> Files <- list.files(pattern="fas")
>> nb_files <- length(Files)
>>
>> for (i in 1:nb_files) {
>>   Dist <- as.numeric(dist.gene(read.dna(Files[i], "fasta"),
>>                                method = "pairwise",
>>                                pairwise.deletion = FALSE,
>>                                variance = FALSE))
>>
>>   Data <- merge(Data, Dist, by=c("x"), all=T)
>> }
>>
>> hist(Data, prob=TRUE)
>> lines(density(Data), col="blue", lwd=2)
>>
>> However, the script does not work and I do not know what to change to make
>> it work.
>> Thanks in advance for your help.
>>
>> Myriam
>>
>> --
>> Myriam Croze, PhD
>> Post-doctorante
>> Division of EcoScience,
>> Ewha Womans University
>> Seoul, South Korea
>>
>> Email: myriam.croze07 using gmail.com
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>