[BioC] rtracklayer gff import
Hans-Rudolf Hotz
hrh at fmi.ch
Fri Apr 15 09:29:49 CEST 2011
On 04/15/2011 12:14 AM, Cook, Malcolm wrote:
>
>> rtracklayer currently considers GFF3 files to be right-open.
>> The GFF3 spec
>> states that start is always <= end, and that zero-width
>> intervals have start
>> == end.
>
> yes but 1 width intervals also have start = end
>
>> To me, this suggests that they are right-open.
>> Otherwise, you need
>> some other way to distinguish zero vs. one width intervals,
>> which is crazy.
>
> yes - it is crazy
It might be 'crazy', but it has always been like this:
GFF (and its extensions such as GTF and GFF3) is "end inclusive" (or right
closed), see:
http://www.sanger.ac.uk/resources/software/gff/spec.html
http://genome.ucsc.edu/FAQ/FAQformat.html#format3
http://genome.ucsc.edu/FAQ/FAQformat.html#format4
and
http://www.sequenceontology.org/gff3.shtml
and the latest GFF3 definition explains very well how to treat
zero-length features:
"For zero-length features, such as insertion sites, start equals end
and the implied site is to the right of the indicated base in the
direction of the landmark."
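A minimal sketch of that rule in Python (not rtracklayer code): under the
end-inclusive convention a feature covers end - start + 1 bases, and a record
with start == end is zero-length only when its type says so. The function name
and the ZERO_LENGTH_TYPES set are assumptions made for illustration; the spec
does not enumerate such types.

```python
# Hypothetical set of feature types to treat as zero-length sites.
# This list is an assumption for illustration, not part of the GFF3 spec.
ZERO_LENGTH_TYPES = {"insertion_site"}


def feature_width(start, end, feature_type):
    """Width of a GFF feature given 1-based, end-inclusive coordinates."""
    if start > end:
        raise ValueError("GFF requires start <= end")
    # start == end is ambiguous on its own: only the feature type tells us
    # whether this is a one-base feature or a zero-length site.
    if start == end and feature_type in ZERO_LENGTH_TYPES:
        return 0
    return end - start + 1


print(feature_width(100, 200, "exon"))            # 101
print(feature_width(100, 100, "SNP"))             # 1
print(feature_width(100, 100, "insertion_site"))  # 0
```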
Yes, as a consequence, you have to pay attention to the value of the
third column (the feature type) to figure out whether a feature with
start == end could be zero-length. But in practice, this has always been
obvious to me. Also, I hardly ever work with GFF/GTF/GFF3 files that mix
different kinds of features; I usually split by the third column and
then treat each feature type according to its meaning.
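Splitting by the third column could look roughly like the following Python
sketch. The function name split_by_type and the two example records are made
up for illustration; only the tab-separated column layout (seqid, source,
type, start, end, score, strand, ...) comes from the GFF format itself.

```python
from collections import defaultdict


def split_by_type(gff_lines):
    """Group GFF records by feature type (the third column)."""
    groups = defaultdict(list)
    for line in gff_lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/directive lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # We read columns 1-7 (seqid through strand); skip shorter lines.
        if len(fields) < 7:
            continue
        seqid, feature_type = fields[0], fields[2]
        start, end, strand = int(fields[3]), int(fields[4]), fields[6]
        groups[feature_type].append((seqid, start, end, strand))
    return dict(groups)


# Made-up example records, not taken from any real annotation.
example = [
    "##gff-version 3",
    "chr1\texample\texon\t100\t200\t.\t+\t.\tID=exon1",
    "chr1\texample\tinsertion_site\t150\t150\t.\t+\t.\tID=ins1",
]
by_type = split_by_type(example)
print(sorted(by_type))  # ['exon', 'insertion_site']
```

Each group can then be handled with the interval rule that fits its type.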
My two cents....
Regards, Hans