Thu Sep 23 16:22:18 CEST 2004
This is both a statistical and an R question...
What would be the best way or test to detect an outlier among a series of 10 to 30 values? For instance, given the following dataset:

10, 11, 12, 15, 20, 22, 25, 30, 500

I'd like a way to identify the last value as an outlier (in one direction only).

One way would be to calculate abs(mean - median) and, if it is large (how large?), delete the extreme value and repeat. But is it valid to do so with so few data points? Is (trimmed mean - mean) more efficient? If so, what would be the maximum tolerable value to use as a threshold? (I guess it will depend on the experiment...) Tests for skewness would probably require a larger dataset?
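
To make the question concrete, here is a minimal sketch in R of the quantities mentioned above, computed on the example data. The variable names and the trim fraction of 0.2 are arbitrary choices for illustration only (with 9 values, trim = 0.2 drops one value from each end); no threshold is implied.

```r
# Example data from the question
x <- c(10, 11, 12, 15, 20, 22, 25, 30, 500)

# Gap between mean and median on the full data
gap_full <- abs(mean(x) - median(x))

# Gap between mean and trimmed mean
# trim = 0.2 is an assumed, arbitrary fraction, not a recommendation
gap_trim <- abs(mean(x) - mean(x, trim = 0.2))

# Delete the single largest value, then recompute the mean/median gap
x_drop <- x[-which.max(x)]
gap_drop <- abs(mean(x_drop) - median(x_drop))

print(c(full = gap_full, trimmed = gap_trim, after_drop = gap_drop))
```

On this dataset both gaps are around 50 with the full data, and the mean/median gap falls below 1 once the value 500 is removed. What remains unclear to me is how large a gap should be before a value is flagged.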
