[R] Plotting graph for Missing values

jim holtman jholtman at gmail.com
Mon Jan 26 15:44:32 CET 2009


From your original posting:

> I tried the code which you provided.
> In place of "dos" in the command
> "pat1 <- rbinom(length(dos), 1, .5)  # generate some data"
> I added the "patientinformation1" variable and then ran the "tapply"
> command, but it gives me the following error:
>
> Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
>   arguments must have same length

I would say that "pat1" and "dos" were not of the same length.  Check
your code and objects to verify this; that is what the error message
is saying.  You said you added the "patientinformation1" variable, but
it does not seem to appear in the error message.
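
To check the lengths and, once they agree, get the percentage missing
per month for several variables at once, something along these lines
might work.  It is only a sketch: the data frame is made up to stand in
for your file, "pct_missing" is a name invented for the example, and it
assumes that "missing" means NA or an empty string.

```r
# Hypothetical stand-in for the real data: one row per day of surgery.
set.seed(1)
dos <- seq(as.Date("2006-05-01"), as.Date("2007-03-31"), by = "1 day")
ds <- data.frame(
  dos = dos,
  patientinformation1 = sample(c("", "a", "b"), length(dos), replace = TRUE),
  patientinformation2 = sample(c("", "x"), length(dos), replace = TRUE),
  stringsAsFactors = FALSE
)

# tapply() needs the values and the grouping vector to have the same length.
month <- format(ds$dos, "%Y%m")
stopifnot(length(ds$patientinformation1) == length(month))

# Percentage of missing values (NA or empty string) within each group.
pct_missing <- function(x, group) {
  tapply(is.na(x) | x == "", group, function(m) 100 * mean(m))
}

# Apply the same function to every variable of interest.
vars <- c("patientinformation1", "patientinformation2")
result <- sapply(ds[vars], pct_missing, group = month)
print(round(result, 1))
```

Each column of "result" is one variable and each row one month, so
matplot(result, type = "l") would be one way to plot them together.
Note that your real "dos" is a factor rather than a Date (see the str()
output quoted below), so format(dos, "%Y%m") may not apply to it;
passing the factor itself as the grouping vector is one alternative.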

On Sun, Jan 25, 2009 at 11:48 PM, Shreyasee <shreyasee.pradhan at gmail.com> wrote:
> Hi Jim,
>
> I ran the following code:
>
> ds <- read.csv(file="D:/Shreyasee laptop data/ASC Dataset/Subset of the ASC Dataset.csv", header=TRUE)
> attach(ds)
> str(dos)
>
> I am getting the following message:
>
>  Factor w/ 12 levels "0000-00-00","6-Aug",..: 6 6 6 6 6 6 6 6 6 6 ...
>
> Thanks,
> Shreyasee
>
>
>
> On Mon, Jan 26, 2009 at 12:20 PM, jim holtman <jholtman at gmail.com> wrote:
>>
>> do:
>>
>> str(dos)
>> str(patientinformation1)
>>
>> They must be the same length for the command to work: there must be a
>> one-to-one match between the two vectors.
>>
>> On Sun, Jan 25, 2009 at 10:23 PM, Shreyasee <shreyasee.pradhan at gmail.com> wrote:
>> > Hi Jim,
>> >
>> > I tried the code which you provided.
>> > In place of "dos" in the command
>> > "pat1 <- rbinom(length(dos), 1, .5)  # generate some data"
>> > I added the "patientinformation1" variable and then ran the "tapply"
>> > command, but it gives me the following error:
>> >
>> > Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
>> >   arguments must have same length
>> >
>> >
>> > Thanks,
>> > Shreyasee
>> >
>> >
>> >
>> > On Mon, Jan 26, 2009 at 10:50 AM, jim holtman <jholtman at gmail.com> wrote:
>> >>
>> >> You can save the output of the tapply and then repeat it for each
>> >> of the variables.  The saved results can then be used to plot the
>> >> graphs.
>> >>
>> >> On Sun, Jan 25, 2009 at 9:38 PM, Shreyasee <shreyasee.pradhan at gmail.com> wrote:
>> >> > Hi Jim,
>> >> >
>> >> > I need to calculate the missing values in the variable
>> >> > "patientinformation1" for the period May 2006 to March 2007 and
>> >> > then plot the percentage of missing values over these months.
>> >> > This has to be done for each variable.
>> >> > The code which you provided calculates the missing values for the
>> >> > months variable, am I right?
>> >> > I need to calculate them for all the variables for each month.
>> >> >
>> >> > Thanks,
>> >> > Shreyasee
>> >> >
>> >> >
>> >> > On Mon, Jan 26, 2009 at 10:29 AM, jim holtman <jholtman at gmail.com> wrote:
>> >> >>
>> >> >> Here is an example of how you might approach it:
>> >> >>
>> >> >> > dos <- seq(as.Date('2006-05-01'), as.Date('2007-03-31'), by='1 day')
>> >> >> > pat1 <- rbinom(length(dos), 1, .5)  # generate some data
>> >> >> > # partition by month and then list out the number of zero values (missing)
>> >> >> > tapply(pat1, format(dos, "%Y%m"), function(x) sum(x==0))
>> >> >> 200605 200606 200607 200608 200609 200610 200611 200612 200701 200702 200703
>> >> >>     21     22     16     18     16     15     16     17     14     16     13
>> >> >> >
>> >> >>
>> >> >>
>> >> >> On Sun, Jan 25, 2009 at 8:51 PM, Shreyasee <shreyasee.pradhan at gmail.com> wrote:
>> >> >> > Hi Jim,
>> >> >> >
>> >> >> > The dataset has 4 variables (dos, patientinformation1,
>> >> >> > patientinformation2, patientinformation3).
>> >> >> > The dos variable contains the months (May 2006 to March 2007)
>> >> >> > when the surgeries were performed.
>> >> >> > I need to calculate the percentage of missing values for each
>> >> >> > variable (patientinformation1, patientinformation2,
>> >> >> > patientinformation3) for each month.
>> >> >> > I need a common script to calculate that for each variable.
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Shreyasee
>> >> >> >
>> >> >> >
>> >> >> > On Mon, Jan 26, 2009 at 9:46 AM, jim holtman <jholtman at gmail.com> wrote:
>> >> >> >>
>> >> >> >> What does your data look like?  You could use 'split' and then
>> >> >> >> examine the data in each range to count the number missing.  I
>> >> >> >> would need some actual data to suggest a solution.
>> >> >> >>
>> >> >> >> On Sun, Jan 25, 2009 at 8:30 PM, Shreyasee <shreyasee.pradhan at gmail.com> wrote:
>> >> >> >> > Hi,
>> >> >> >> >
>> >> >> >> > I have imported a dataset into R.
>> >> >> >> > I want to calculate the percentage of missing values for each
>> >> >> >> > month (May 2006 to March 2007) for each variable.
>> >> >> >> > Just to begin with, I tried the following code:
>> >> >> >> >
>> >> >> >> > for(i in 1:length(dos))
>> >> >> >> > for(j in 1:length(patientinformation1)
>> >> >> >> > if(dos[i]=="May-06" && patientinformation1[j]=="")
>> >> >> >> > a <- j+1
>> >> >> >> > a
>> >> >> >> >
>> >> >> >> > The above code was written to calculate the number of missing
>> >> >> >> > values for May 2006, but I am not getting the correct results.
>> >> >> >> > Can anybody help me?
>> >> >> >> >
>> >> >> >> > Thanks,
>> >> >> >> > Shreyasee
>> >> >> >> >
>> >> >> >> >
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> --
>> >> >> >> Jim Holtman
>> >> >> >> Cincinnati, OH
>> >> >> >> +1 513 646 9390
>> >> >> >>
>> >> >> >> What is the problem that you are trying to solve?
>> >> >> >
>> >> >> >
>> >> >>
>> >> >
>> >> >
>> >>
>> >
>> >
>>
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



