[R] GGPlot plot
Jeff Newmiller
jdnewmil using dcn.davis.ca.us
Wed Jul 18 21:53:45 CEST 2018
On Wed, 18 Jul 2018, Francesca wrote:
> Dear R help,
>
> I am new to ggplot so I apologize if my question is a bit obvious.
Or perhaps not, as this is the "R-help" mailing list, not the
"Ggplot-help" mailing list. Fortunately for you, what you really need to
learn is R, and then ggplot will be much easier to get along with.
> I would like to create a plot where I compare the fraction of the values
> of a variable called PASP out of the number of subjects, for two groups
> of subjects codified with a dummy variable called SUBJC.
>
> The variable PASP is discrete and only takes values 0,4,8..
>
> My data are as following:
>
> PASP SUBJC
>
> 0 0
>
> 4 1
>
> 0 0
>
> 8 0
>
> 4 0
>
> 0 1
>
> 0 1
>
> . .
>
> . .
>
> . .
>
>
>
>
> I would like to calculate the fraction of positive levels of PASP out of
> the total number of observations, divided per values of SUBJ=0 and 1. I
> am new to the use of GGPlot and I do not know how to organize the data
> and what to use to summarize these data as to obtain a picture as
> follows:
>
>
>
>
>
> I hope my request is clear. Thanks for any help you can provide.
The funky text formatting of the above and the reference to "picture as
follows" make me think you composed this in HTML and then converted it to
plain text without looking at the result.
* We got no picture; this is a plain-text-only mailing list.
* HTML makes terrible plain text.
The following is an example of how you can send us sample data and code in
the body of your email that will survive these plain-text-only
limitations. Note that writing R code is the key to communicating
unambiguously.
You can start by preparing a sample of your data (usually not all of
it) by doing something like
dput(head(mydta,100))
and prepending "dta <- " to the output, so you get a line of R code
that we can execute to recreate some rows of your data:
-----
dta <- structure(list(PASP = c(0, 12, 8, 0, 12, 12, 12, 8, 12, 8, 8,
8, 8, 4, 0, 12, 12, 0, 12, 0, 0, 12, 4, 8, 12, 8, 4, 4, 4, 4,
8, 8, 8, 12, 12, 12, 8, 0, 12, 12, 0, 12, 12, 8, 0, 4, 4, 12,
8, 8, 12, 8, 0, 12, 0, 0, 4, 0, 0, 4, 4, 12, 0, 4, 8, 8, 8, 4,
0, 0, 4, 0, 12, 4, 12, 12, 8, 0, 0, 0, 4, 8, 8, 0, 4, 0, 12,
4, 12, 0, 4, 12, 8, 0, 4, 0, 0, 12, 12, 8), SUBJC = c(0L, 1L,
0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 1L,
1L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L,
1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 0L,
0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L,
0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L,
1L, 0L)), .Names = c("PASP", "SUBJC"), row.names = c(NA, -100L
), class = "data.frame")
-----
and then ideally you would tell us the results of a sample of the
calculation you expect to see, though in this case you might not have
thought to present them organized as below:
-----
result <- read.table( text =
"  PASP   SUBJC   Fraction
      0       0      0.279
      4       0      0.186
      8       0      0.395
     12       0      0.140
      0       1      0.263
      4       1      0.211
      8       1      0.123
     12       1      0.404
", header=TRUE)
-----
And with your existing text, we might come up with something like:
-----
library(ggplot2)
dta <- structure(list(PASP = c(0, 12, 8, 0, 12, 12, 12, 8, 12, 8, 8,
8, 8, 4, 0, 12, 12, 0, 12, 0, 0, 12, 4, 8, 12, 8, 4, 4, 4, 4,
8, 8, 8, 12, 12, 12, 8, 0, 12, 12, 0, 12, 12, 8, 0, 4, 4, 12,
8, 8, 12, 8, 0, 12, 0, 0, 4, 0, 0, 4, 4, 12, 0, 4, 8, 8, 8, 4,
0, 0, 4, 0, 12, 4, 12, 12, 8, 0, 0, 0, 4, 8, 8, 0, 4, 0, 12,
4, 12, 0, 4, 12, 8, 0, 4, 0, 0, 12, 12, 8), SUBJC = c(0L, 1L,
0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 1L,
1L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L,
1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 0L,
0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L,
0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L,
1L, 0L)), .Names = c("PASP", "SUBJC"), row.names = c(NA, -100L
), class = "data.frame")
table(dta)
#>     SUBJC
#> PASP  0  1
#>   0  12 15
#>   4   8 12
#>   8  17  7
#>   12  6 23
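# Count rows for each PASP/SUBJC combination by summing a column of ones
dtasum <- aggregate( list( Count = rep(1,100) )
                   , dta
                   , FUN = sum
                   )
# Divide each count by the total count within its SUBJC group
dtasum$Fraction <- ave( dtasum$Count
                      , dtasum$SUBJC
                      , FUN = function(x) ( x/sum(x) )
                      )
# Factors tell ggplot to treat these columns as discrete categories
dtasum$PASPfactor <- factor( dtasum$PASP )
dtasum$SUBJCfactor <- factor( dtasum$SUBJC )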
dtasum
#>   PASP SUBJC Count  Fraction PASPfactor SUBJCfactor
#> 1    0     0    12 0.2790698          0           0
#> 2    4     0     8 0.1860465          4           0
#> 3    8     0    17 0.3953488          8           0
#> 4   12     0     6 0.1395349         12           0
#> 5    0     1    15 0.2631579          0           1
#> 6    4     1    12 0.2105263          4           1
#> 7    8     1     7 0.1228070          8           1
#> 8   12     1    23 0.4035088         12           1
ggplot( dtasum
      , aes( x=SUBJCfactor
           , y=Fraction
           , fill=PASPfactor
           )
      ) +
  geom_bar( stat = "identity" ) +
  xlab( "SUBJ" ) +
  scale_fill_discrete( name = "PASP" )
#' Created on 2018-07-18 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).
-----
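As a cross-check on the fractions, base R's prop.table() is another way
to get the same numbers. This is only a sketch: to keep it short I typed
in the counts from the table(dta) output above rather than repeating the
data, and "counts", "frac", "pos" and "dtasum2" are names I made up.

```r
# Counts copied by hand from the table(dta) output (rows PASP, columns SUBJC)
counts <- matrix( c( 12, 8, 17, 6, 15, 12, 7, 23 )
                , nrow = 4
                , dimnames = list( PASP = c( "0", "4", "8", "12" )
                                 , SUBJC = c( "0", "1" )
                                 )
                )
# margin = 2 divides each column (each SUBJC group) by its own total
frac <- prop.table( counts, margin = 2 )
round( frac, 3 )
# Fraction of positive PASP per group is one minus the PASP == 0 share
pos <- 1 - frac[ "0", ]
pos
# Long form, one row per PASP/SUBJC combination, as ggplot prefers
dtasum2 <- as.data.frame( as.table( frac ), responseName = "Fraction" )
dtasum2
```

With the real data frame you could pass table(dta) to prop.table()
directly instead of typing the counts.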
Obviously, since I never saw the figure you thought I was going to see,
the plot I made may not be the one you had in mind, but you should at
least have some example code to compare with the "Introduction to R"
document that comes with R, and some functions to look up help pages on,
e.g.
?aggregate
?ave
and you can execute pieces of code to see what they create:
rep(1,100)
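For instance, a toy-sized sketch of the two key steps, small enough to
check by hand. The vectors "ones", "grp" and "x" are invented for this
illustration and have nothing to do with your data.

```r
# Five ones, one per (imaginary) data row
ones <- rep( 1, 5 )
# A grouping variable with two rows in "a" and three in "b"
grp <- c( "a", "a", "b", "b", "b" )
# Summing the ones within each group counts the rows in that group
cnt <- aggregate( list( Count = ones ), list( grp = grp ), FUN = sum )
cnt
# ave() applies FUN within each group and returns a vector as long as x,
# so each value is divided by the sum for its own group
x <- c( 1, 3, 2, 2, 4 )
share <- ave( x, grp, FUN = function(v) v / sum(v) )
share
```

The shares add up to 1 within each group, which is the same property the
Fraction column has within each SUBJC value.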
You should read the Posting Guide carefully, as there are hints in it as to
how to do much of this.
>
> Francesca
>
>
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil using dcn.davis.ca.us>  Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k