[R] how to remove outliers
Bert Gunter
gunter.berton at gene.com
Tue Jul 15 12:38:18 CEST 2014
No! Do not do this.
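First, the syntax is wrong. Second, testing for exact equality will fail
in general because of floating point arithmetic: the printed 14478.4 is
most likely a rounded display of the stored value. Use an inequality
with sufficient fuzz instead, e.g.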
time <- time[time$TimeDiff < 14478,]
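
To make the floating point issue concrete, here is a minimal sketch. The
data frame `d`, its values, and the tolerance `tol` are made up for
illustration; they are not the original poster's data, and a suitable
tolerance depends on the scale of the real measurements.

```r
# Hypothetical data, for illustration only (not the poster's data).
d <- data.frame(id = 1:4, TimeDiff = c(12.5, 30.1, 14478.4, 7.9))

# Simulate a stored value that prints as 14478.4 at the default
# 7 significant digits but is not exactly equal to 14478.4.
d$TimeDiff[3] <- 14478.4 + 1e-9

# Exact comparison does not match the stored value, so no row is dropped.
exact <- d[d$TimeDiff != 14478.4, ]

# Comparison with a tolerance ("fuzz") drops the intended row.
tol <- 1e-6  # assumed tolerance
fuzzy <- d[abs(d$TimeDiff - 14478.4) > tol, ]

# A plain inequality just below the outlier also drops it.
cut <- d[d$TimeDiff < 14478, ]

nrow(exact)
nrow(fuzzy)
nrow(cut)
```

Note that if TimeDiff contains NA, logical subsetting like this returns
all-NA rows for those entries, so they may need separate handling.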
Moral: Caveat Emptor. Free advice may be worth exactly that.
Cheers,
Bert
Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374
"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll
On Mon, Jul 14, 2014 at 9:43 PM, Hasan Diwan <hasan.diwan at gmail.com> wrote:
> Marta,
> To remove a row from your data frame, use:
>
> value <- 14478.4
> time <- time[-time[$TimeDiff] == value,]
>
> I hope that helps... If not, do push back. -- H
>
>
> On 14 July 2014 09:17, Marta valdes lopez <martavaldes85 at gmail.com> wrote:
>
>> Hi!
>>
>> I did this test and got this outlier. I would like to remove the
>> whole row from my database; does anyone know how I can remove it?
>>
>> chisq.out.test(time$TimeDiff)
>> chi-squared test for outlier
>> data: time$TimeDiff
>> X-squared = 73260.07, p-value < 2.2e-16
>> alternative hypothesis: highest value 14478.4 is an outlier
>>
>> Thank you!!
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>