[R] Plot of a subset of a data.frame()

Petr PIKAL petr.pikal at precheza.cz
Tue Jul 27 12:33:02 CEST 2010


Hi

r-help-bounces at r-project.org wrote on 27.07.2010 00:48:52:

> 
> On Jul 26, 2010, at 10:56 AM, Steffen Uhlig wrote:
> 
> > Dear David, Petr, and Alain,
> >
> > thank you very much for your fast responses. It's a typical 
> > "handbook-not-read error" on my side. I will dig deeper into the 
> > plot functions and the assignment of data. I was not aware that 
> > the vector "a" is handled as a factor with 10 levels. 
> > Thanks for your suggestions and hints!
> 
> You can prevent that behavior and instead get a character vector ... 
> at least from functions that return such ... by using stringsAsFactors 
> = FALSE within the data.frame call. You also have the option of 
> setting that globally which at least one well known institution has 
> adopted as the default policy for its work.
> 
> ?data.frame
> ?options
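
To make David's suggestion concrete, here is a minimal sketch using the
vectors from the original post. The names e_fac and e_chr are only
illustrative and do not appear in the thread. The argument is spelled out
in both calls because the default depends on the R version.

```r
# Vectors from the original post
a <- c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j")
b <- rep(c("a", "b"), 5)
d <- rnorm(10)

# stringsAsFactors = TRUE converts character columns to factors
# (the behaviour discussed in this thread; since R 4.0.0 the default
# is FALSE, so it is stated explicitly here)
e_fac <- data.frame(a, b, d, stringsAsFactors = TRUE)

# stringsAsFactors = FALSE keeps the columns as character vectors
e_chr <- data.frame(a, b, d, stringsAsFactors = FALSE)

is.factor(e_fac$a)     # column "a" is a factor here
nlevels(e_fac$a)       # one level per distinct string
is.character(e_chr$a)  # column "a" stays a character vector here
```

Note that a character column does not by itself give a scatter plot with
letters on the x-axis; it only avoids the conversion to a factor.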

However, he can also get used to factors and use their strengths, such as 
changing levels, using the underlying numerical representation, combining 
levels, and maybe some others.
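
A rough sketch of those points, again with the data from the original
post. The level names "first", "second", "early" and "late" and the
variables codes and grp are made up for illustration. The last part is one
possible way out of the plotting problem: plot() draws box plots when x is
a factor, and with one observation per level each box collapses to a
horizontal line, so the sketch plots the integer codes instead and labels
the axis with the levels.

```r
a <- factor(c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j"))
b <- factor(rep(c("a", "b"), 5))
d <- rnorm(10)
e <- data.frame(a, b, d)

# 1. Changing levels: rename them by assigning to levels()
levels(e$b) <- c("first", "second")

# 2. Underlying numerical representation: the integer codes
codes <- as.integer(e$a)

# 3. Combining levels: giving several levels the same label merges them
grp <- e$a
levels(grp) <- c(rep("early", 5), rep("late", 5))

# Plotting points instead of box plots: use the integer codes on the
# x-axis and put the level names on the axis by hand
e.1 <- subset(e, b == "first")
e.2 <- subset(e, b == "second")
plot(as.integer(e.1$a), e.1$d, pch = 3, col = 2,
     xlim = c(1, nlevels(e$a)), ylim = range(e$d),
     xaxt = "n", xlab = "a", ylab = "d")
points(as.integer(e.2$a), e.2$d, pch = 4, col = 3)
axis(1, at = seq_len(nlevels(e$a)), labels = levels(e$a))
```

subset() keeps all levels of "a" in both subsets, so the integer codes of
e.1 and e.2 refer to the same x positions.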

Regards
Petr



> 
> -- 
> David
> >
> > Best regards,
> > /steffen
> >
> >
> > On 26.07.2010 14:30, David Winsemius wrote:
> >>
> >> On Jul 26, 2010, at 7:38 AM, Steffen Uhlig wrote:
> >>
> >>> Hello,
> >>>
> >>> my data.frame is sort of a collection of process values, i.e. a huge
> >>> run chart. It consists of a time stamp in the first column (date as
> >>> string), factors in the following columns (used for subset filtering),
> >>> and some process-data columns.
> >>> Hereafter, two examples are listed, showing the problems that occur
> >>> during plotting:
> >>>
> >>> First, the example that works fine:
> >>>
> >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>> a = c(1:10) # create a vector of integers
> >>> b = rep(c("a","b"),5) # create a vector of chars, used
> >>> # as factor-levels
> >>> d = rnorm(10) # some random numbers
> >>> e = data.frame(a,b,d) # connect to a data.frame
> >>
> >> You've gotten several answers, but none have addressed an aspect of R
> >> behavior that took me longer to appreciate than it perhaps should have.
> >> The "b" column inside the "e" data.frame is now a factor column. I
> >> mention that because you later referred to it as a "string", which it
> >> is not. It is an integer vector with an associated character vector of
> >> levels. Many of the functions that you might think would "work" on
> >> "strings" will give either errors or unexpected results when applied
> >> to factors.
> >>
> >>
> >>>
> >>> e.1 = subset(e, b=="a") # create two subsets
> >>> e.2 = subset(e, b=="b")
> >>> plot(d~a, e.1, pch=3, col=2) # plot first data-subset
> >>> points(d~a, e.2, pch=4, col=3) # plot the 2nd one
> >>>
> >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>> All looks fine in these plots.
> >>>
> >>>
> >>> However, after changing the content of vector "a" to a set of
> >>> strings, the following happens:
> >>>
> >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>> a = c("a","b","c","d","e","f","g","h","i","j")
> >>> e = data.frame(a,b,d) # re-build data.frame
> >>>
> >>> e.1 = subset(e, b=="a") # create two subsets
> >>> e.2 = subset(e, b=="b")
> >>> plot(d~a, e.1, pch=3, col=2)
> >>> points(d~a, e.2, pch=4, col=3)
> >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>> The plot command produces horizontal lines instead of dots. This
> >>> seems to happen when the x-axis contains strings rather than numbers.
> >>> Is there a way out?
> >>>
> >>> Best regards,
> >>> /Steffen
> >
> >
> > -- 
> > Steffen Uhlig, PhD
> > Mechatronik und Sensortechnik
> > HTW des Saarlandes
> > Goebenstraße 40
> > 66117 Saarbrücken
> >
> > Tel.: +49 (0) 681 58 67 274
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT


