[R] Handling missing data

Vassilis Golfinopoulos vassilis.golfinopoulos at gmail.com
Mon Sep 21 22:34:46 CEST 2009


No, this is only part of my dataset. In any case, this is unlikely to be 
the cause of the problem: when there are too few data, impute.knn falls 
back to mean imputation (and issues a warning).
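
One way to rule out the neighbor-count issue raised below is to cap k at the number of other rows before calling impute.knn. This is only a minimal sketch: safe_k is a hypothetical helper, not part of the impute package, and the matrix is the 3 x 3 excerpt quoted in this thread. The impute.knn call is left commented out because it needs the Bioconductor package and was not run here.

```r
# Hypothetical helper (not part of the impute package): cap k so that it
# never exceeds the number of other rows (genes) available as neighbors.
safe_k <- function(x, k = 10) {
  stopifnot(is.matrix(x), nrow(x) >= 2)
  min(k, nrow(x) - 1)
}

# The 3 x 3 excerpt quoted in this thread.
dataset <- matrix(
  c(-0.02412174,  0.1986057,  NA,
    -0.37109748, -0.4418542,  0.04967051,
    -0.14589269, -0.1590310, -0.35483226),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("AKT1", "AURKA", "BRAF"),
                  c("T1053B", "T1102A", "T1129A"))
)

# Three rows leave at most two neighbors per gene.
k_used <- safe_k(dataset)

# Not run: requires the Bioconductor package 'impute'.
# imputed.dataset <- impute::impute.knn(dataset, k = k_used)$data
```

On the full dataset the cap should have no effect as long as there are more than ten rows, so the default k = 10 would be kept.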

----- Original Message ----- 
From: "Martin Morgan" <mtmorgan at fhcrc.org>
To: "Vassilis Golfinopoulos" <vassilis.golfinopoulos at gmail.com>
Cc: "Greg Snow" <Greg.Snow at imail.org>; <r-help at r-project.org>; "premmad" 
<mtechprem at gmail.com>
Sent: Monday, September 21, 2009 7:20 PM
Subject: Re: [R] Handling missing data


> Vassilis Golfinopoulos wrote:
>> Consider this sample dataset (displayed [1:3, 1:3]):
>>
>>           T1053B     T1102A      T1129A
>> AKT1  -0.02412174  0.1986057          NA
>> AURKA -0.37109748 -0.4418542  0.04967051
>> BRAF  -0.14589269 -0.1590310 -0.35483226
>>
>>> is.na(dataset[1, 3])
>> TRUE
>>
>> library(impute)
>> library(GeneMeta)
>>
>> imputed.dataset <- impute.knn(as.matrix(dataset))
>
> impute.knn has a second parameter k with default value 10, the number of
> nearest neighbors to use, in gene space, for imputation. For the example
> above, there are not 10 nearest neighbors, and unfortunately impute.knn
> does not check for this. Is this the case with your real data?
>
> This might address your problem with impute.knn; a GeneMeta example
> would help us make progress on that front.
>
> Martin
>
>> CRASH!
>>
>>
>> 2009/9/21 Greg Snow <Greg.Snow at imail.org>:
>>> Help us to help you, show us the code that you tried, what you expected, 
>>> and what you saw.
>>>
>>> Does "using NA condition"  mean:
>>>
>>>> x == NA
>>> Which does not work
>>>
>>> Or
>>>
>>>> is.na(x)
>>> Which should.
>>>
>>> -----Original Message-----
>>> From: "premmad" <mtechprem at gmail.com>
>>> To: "r-help at r-project.org" <r-help at r-project.org>
>>> Sent: 9/21/09 12:38 AM
>>> Subject: [R]  Handling missing data
>>>
>>>
>>> I have to remove missing data in both character and numeric data 
>>> types. I tried using an NA condition, but it is not working. Please 
>>> help me to solve this.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide 
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
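
For the original question in this thread (removing missing values from both character and numeric data), the difference between x == NA and is.na(x) can be illustrated with base R alone. This is a sketch on a made-up data frame; the column names and values are assumptions chosen for illustration, not the poster's data.

```r
# Toy data frame (hypothetical) with a gap in a character column and a
# gap in a numeric column.
df <- data.frame(
  gene  = c("AKT1", NA, "BRAF", "AURKA"),
  value = c(0.1986057, -0.4418542, NA, 0.04967051),
  stringsAsFactors = FALSE
)

# Comparing with == never gives TRUE for a missing value; every
# comparison against NA is itself NA.
eq_test <- df$value == NA

# is.na() returns a proper logical vector and works for both column types.
na_test <- is.na(df$value)

# Keep only the rows that are complete in every column.
clean <- df[complete.cases(df), ]

# na.omit() drops the same rows and records which ones were removed.
clean2 <- na.omit(df)
```

Note that an empty string "" in a character column is not NA, so such values would need to be converted first if they are meant to count as missing.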



