[R] millions of comparisons, speed wanted
Adrian DUSA
adi at roda.ro
Thu Dec 15 21:04:01 CET 2005
Dear Andy,
On Thursday 15 December 2005 20:57, Liaw, Andy wrote:
> Just some untested idea:
> If the data are all 0/1, you could use dist(input, method="manhattan"), and
> then check which entry equals 1. This should be much faster than creating
> all pairs of rows and check position-by-position.
Thanks for the idea; I played with it a little. At the start, yes, the data
are all 0/1, but during the minimization iterations "x" values appear as well.
For example, comparing:
0 1 0 1 1
0 0 0 1 1
should return
0 "x" 0 1 1
whereas
0 "x" 0 1 1
0 0 0 1 1
shouldn't even be compared (they have a different number of figures).
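To make the rule concrete, here is a minimal sketch of the pairwise comparison. The helper name merge_rows is hypothetical, and I am assuming that two rows are comparable only when their "x" entries sit in the same positions (which implies the same number of figures); the real procedure may use a looser test.

```r
# Hypothetical helper: merge two rows that differ in exactly one position,
# marking that position with "x". Returns NULL when no merge applies.
merge_rows <- function(a, b) {
  a <- as.character(a)
  b <- as.character(b)
  # Assumption: rows are comparable only if "x" occupies the same positions.
  if (!identical(a == "x", b == "x")) return(NULL)
  differs <- which(a != b)
  # Only rows that differ in exactly one position can be merged.
  if (length(differs) != 1) return(NULL)
  a[differs] <- "x"
  a
}

# First example from above: the rows differ only in position 2.
merged <- merge_rows(c(0, 1, 0, 1, 1), c(0, 0, 0, 1, 1))

# Second example: one row already contains "x", so nothing is compared.
skipped <- merge_rows(c("0", "x", "0", "1", "1"), c("0", "0", "0", "1", "1"))
```

Applying this to every pair of rows is still the slow part; the sketch only pins down what a single comparison should return.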
Replacing "x" with NA before calling dist doesn't help either, since with
NA 0 0 1 1
0 0 0 1 1
dist returns 0.
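A small sketch of both behaviours, using made-up matrices (m and with_na are illustrative names, not my real data). On pure 0/1 rows the Manhattan distance counts the differing positions, so a distance of 1 flags the mergeable pairs. With NA, dist drops the affected column before summing, so the NA position never counts as a difference.

```r
# Pure 0/1 data: Manhattan distance equals the number of differing positions.
m <- rbind(c(0, 1, 0, 1, 1),
           c(0, 0, 0, 1, 1),
           c(1, 0, 1, 1, 1))
d <- as.matrix(dist(m, method = "manhattan"))

# Pairs of rows (i < j) that differ in exactly one position.
pairs <- which(d == 1 & upper.tri(d), arr.ind = TRUE)

# With NA standing in for "x": the column holding the NA is excluded,
# the remaining columns are identical, and the distance comes out as 0.
with_na <- rbind(c(NA, 0, 0, 1, 1),
                 c(0, 0, 0, 1, 1))
d_na <- as.numeric(dist(with_na, method = "manhattan"))
```

So with NA the two rows look identical to dist, when in fact they should not be compared at all.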
I even looked at whether I could tweak the dist code, but it calls a C routine,
so I gave up.
Nice idea anyhow, maybe I'll find a way to use it further.
Best,
Adrian
--
Adrian DUSA
Romanian Social Data Archive
1, Schitu Magureanu Bd
050025 Bucharest sector 5
Romania
Tel./Fax: +40 21 3126618 \
+40 21 3120210 / int.101