[R] memory problem [cluster]
Martin Maechler
maechler at stat.math.ethz.ch
Tue Dec 5 10:04:42 CET 2006
>>>>> "Roger" == Roger Bivand <Roger.Bivand at nhh.no>
>>>>> on Sat, 2 Dec 2006 22:11:12 +0100 (CET) writes:
Roger> On Sat, 2 Dec 2006, Dylan Beaudette wrote:
>> Hi Stephano,
>> Looks like you used my example verbatim
>> (http://casoilresource.lawr.ucdavis.edu/drupal/node/221)
>> :)
Roger> From exchanges on R-sig-geo, I believe the original questioner is feeding
Roger> NAs to clara, and the error message in clara() is overrunning the buffer
Roger> in sprintf(), so the memory problem isn't correctly identified. Using
Roger> scripts out of context without checking whether the input data frame
Roger> satisfies the conditions of the functions being used is asking for trouble.
Roger> The error message:
Roger> > traceback()
Roger> 2: stop(ngettext(length(i), sprintf("Observation %d has", i[1]),
Roger> sprintf("Observations %s have", paste(i, collapse = ","))),
Roger> " *only* NAs --> omit for clustering")
Roger> 1: clara(morph, k = 5, stand = F)
Roger> is coming from lines:
Roger> stop(ngettext(length(i), sprintf("Observation %d has",
Roger> i[1]), sprintf("Observations %s have", paste(i,
Roger> collapse = ","))), " *only* NAs --> omit for clustering")
Roger> in clara(). I have suggested dropping those rows from the data frame in a
Roger> reply on R-sig-geo, but maybe clara() could be patched to count the # of
Roger> completely missing rows, and if # is more than a modest number, not print
Roger> the obs. numbers, just the total?
Yes, thanks Roger, for the hint; I have now done that
(will be in cluster_1.11.4):
> data(xclara)
> xclara[sample(nrow(xclara), 50),] <- NA
> clara(xclara, k = 3)
Error in clara(xclara, k = 3) : 50 observations (6,95,106,191,258,294,295,321,432,601,662,702 ...)
have *only* NAs --> na.omit() them for clustering!
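For those who hit this error, here is a minimal sketch of dropping the completely missing rows before calling clara(). The names all_na and xc and the seed are only illustrative. Note that it removes just the rows where *every* column is NA, which is less drastic than na.omit(), since na.omit() would also drop rows that are only partially missing:

```r
library(cluster)   # provides clara() and the xclara data set

data(xclara)
set.seed(1)                                  # arbitrary seed, for reproducibility only
xclara[sample(nrow(xclara), 50), ] <- NA     # blank out 50 whole rows, as above

# TRUE for rows in which every column is NA
all_na <- rowSums(is.na(xclara)) == ncol(xclara)
sum(all_na)                                  # how many rows will be dropped

# keep only rows with at least one observed value
xc <- xclara[!all_na, , drop = FALSE]

cl <- clara(xc, k = 3)                       # no "only NAs" error any more
```

Checking sum(all_na) first also tells you whether the missing rows are a handful of accidents or a sign that the data import went wrong.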
Lessons to be learned (I have learned it earlier; but not
scrutinized all my code to see if it's obeyed :-):
- Inside stop(..) be careful not to produce another error;
particularly not a memory-related one, since this will give
user-error messages that are not at all helpful.
- All non-beginner R users should be trained to routinely say
'traceback()' after they've seen an error.
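As an illustration of the first point, here is one way to keep such a message bounded no matter how many observations are affected. This is only a sketch under my own assumptions, not the code that is in cluster: all_na_message() and its max_show argument are hypothetical names.

```r
# Hypothetical helper (not the actual cluster code): build the message for
# stop(), listing at most `max_show` observation numbers.
all_na_message <- function(i, max_show = 12L) {
  n <- length(i)
  shown <- paste(head(i, max_show), collapse = ",")
  if (n > max_show) shown <- paste(shown, "...")   # signal that the list is cut
  paste0(n,
         ngettext(n, " observation (", " observations ("),
         shown,
         ngettext(n, ") has", ") have"),
         " *only* NAs --> na.omit() them for clustering!")
}

# The message length no longer grows with the number of offending rows
msg <- all_na_message(seq_len(5000))
nchar(msg)

# Used inside stop(); tryCatch() is here only to capture the message text
res <- tryCatch(stop(all_na_message(c(6, 95, 106))),
                error = function(e) conditionMessage(e))
res
```

The design choice is simply to build the string from a truncated index vector before it ever reaches stop(), so that the error path itself cannot fail on large input. For the second point: at an interactive prompt, calling traceback() right after the error prints the call stack that led to it.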
Regards,
Martin Maechler, ETH Zurich