[R] subselecting on Data frame
PQuery
pierre.khoueiry at embl.de
Sat Aug 3 17:42:38 CEST 2013
Dear all,
I have a data frame of features (example pasted below) from which I would
like to count the following:
how many triplets of features (one feature per row) have the same "Scaff"
and the same "Cat", a Score > 0.6, and lie within a distance of at most 10000
(distance defined as Start of row[i+1] - End of row[i]).
I have been trying to do this with selectors and combn in R, but it is
becoming complicated.
Is there an intuitive, elegant way to achieve this?
Many thanks,
Best,
Scaff Start End Score Cat
scaff_234 767099 767299 0.93 cat1
scaff_234 790221 790421 0.924 cat1
scaff_234 1341263 1341463 0.845 cat2
scaff_234 1543343 1543543 0.715 cat2
scaff_234 1551844 1552044 0.967 cat1
scaff_234 1560829 1561029 0.825 cat2
scaff_234 1580868 1581068 0.929 cat3
scaff_234 1589612 1589812 0.744 cat3
scaff_234 1597306 1597885 0.864 cat2
scaff_234 1598617 1599091 0.908 cat2
scaff_234 1613500 1613700 0.705 cat2
scaff_234 1614297 1614643 0.748 cat1
scaff_234 1623852 1624052 0.799 cat2
scaff_234 1669873 1670073 0.691 cat2
scaff_234 1670210 1670515 0.904 cat1
scaff_234 1822690 1822890 0.918 cat2
scaff_234 1824905 1825105 0.854 cat2
scaff_234 1826092 1826292 0.95 cat2
scaff_234 1855240 1855457 0.962 cat2
scaff_234 1872803 1873106 0.97 cat2
scaff_234 1894767 1894967 0.945 cat1
scaff_234 1903338 1903538 0.854 cat3
scaff_234 1920157 1920509 0.739 cat1
scaff_234 1944032 1944232 0.871 cat2
scaff_234 1976753 1976953 0.847 cat2
scaff_234 1992677 1992877 0.694 cat2
scaff_234 2007772 2007972 0.916 cat2
scaff_234 2009638 2010167 0.945 cat2
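
The criteria above can be read in more than one way, so the following is only a minimal base R sketch under stated assumptions, not a definitive answer. It assumes that a "triplet" means three consecutive rows within one Scaff/Cat group (after dropping rows with Score <= 0.6 and sorting by Start), and that both successive gaps, Start of row[i+1] - End of row[i], must be at most 10000. The function name count_triplets and its arguments min_score and max_gap are hypothetical names chosen for illustration; the data are a contiguous subset of the rows listed above.

```r
# Subset of the rows from the post (Start 1580868 through 1855240)
df <- read.table(text = "
Scaff Start End Score Cat
scaff_234 1580868 1581068 0.929 cat3
scaff_234 1589612 1589812 0.744 cat3
scaff_234 1597306 1597885 0.864 cat2
scaff_234 1598617 1599091 0.908 cat2
scaff_234 1613500 1613700 0.705 cat2
scaff_234 1614297 1614643 0.748 cat1
scaff_234 1623852 1624052 0.799 cat2
scaff_234 1669873 1670073 0.691 cat2
scaff_234 1670210 1670515 0.904 cat1
scaff_234 1822690 1822890 0.918 cat2
scaff_234 1824905 1825105 0.854 cat2
scaff_234 1826092 1826292 0.95 cat2
scaff_234 1855240 1855457 0.962 cat2
", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: counts runs of three consecutive rows per Scaff/Cat
# group in which both successive gaps are <= max_gap (assumed interpretation).
count_triplets <- function(d, min_score = 0.6, max_gap = 10000) {
  d <- d[d$Score > min_score, ]              # keep rows with Score > min_score
  d <- d[order(d$Scaff, d$Cat, d$Start), ]   # sort within each group by Start
  groups <- split(d, list(d$Scaff, d$Cat), drop = TRUE)
  per_group <- vapply(groups, function(g) {
    n <- nrow(g)
    if (n < 3) return(0L)                    # a triplet needs three rows
    # near[i] is TRUE when Start of row i+1 minus End of row i is <= max_gap
    near <- (g$Start[-1] - g$End[-n]) <= max_gap
    # a triplet starts at row i when gaps i and i+1 are both small enough
    sum(near[-1] & near[-length(near)])
  }, integer(1))
  sum(per_group)
}

count_triplets(df)  # 1 for these rows under the assumptions above
```

With this subset, the only triplet is the cat2 run starting at 1822690, 1824905 and 1826092 (gaps of 2015 and 987). If the triplets are instead meant to be any three rows of a group that fit inside one window, rather than consecutive rows, the counting step would need to change.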
--
View this message in context: http://r.789695.n4.nabble.com/subselecting-on-Data-frame-tp4672992.html
Sent from the R help mailing list archive at Nabble.com.