[R] GGPlot plot

Wed Jul 18 21:53:45 CEST 2018

On Wed, 18 Jul 2018, Francesca wrote:

> Dear R help,
>

> I am new to ggplot so I apologize if my question is a bit obvious.

Or perhaps not, as this is the "R-help" mailing list, not the 
"Ggplot-help" mailing list. Fortunately for you, what you really need to 
learn is R, and then ggplot will be much easier to get along with.

> I would like to create a plot where I compare the fraction of the values 
> of a variable called PASP out of the number of subjects, for two groups 
> of subjects coded with a dummy variable called SUBJC.
>
> The variable PASP is discrete and only takes values 0,4,8..
>
> My data are as follows:
>
> PASP   SUBJC
> 
> 0          0
>
> 4          1
>
> 0          0
>
> 8          0
>
> 4          0
>
> 0          1
>
> 0          1
>
> .           .
>
> .           .
>
> .           .
>
>
>
>
> I would like to calculate the fraction of positive levels of PASP out of 
> the total number of observations, split by SUBJ=0 and 1. I am new to 
> ggplot and I do not know how to organize the data or how to summarize 
> them so as to obtain a picture as follows:
>
>
>
>
>
> I hope my request is clear. Thanks for any help you can provide.

The funky text formatting and the reference to "picture as follows" above 
make me think you composed this in HTML and then converted it to plain 
text without looking at the result.

* We got no picture; this is a plain-text-only mailing list.
* HTML makes terrible plain text.

The following is an example of how you can send us sample data and code in 
the body of your email that will survive these plain-text-only 
limitations. Note that writing R code is the key to communicating 
unambiguously.

You can start by preparing a sample of your data (usually not all of 
it) by doing something like

dput(head(mydta,100))

and prefixing the output with "dta <- " so you get a line of R code 
that we can execute to obtain some rows of your data:

-----
dta <- structure(list(PASP = c(0, 12, 8, 0, 12, 12, 12, 8, 12, 8, 8,
8, 8, 4, 0, 12, 12, 0, 12, 0, 0, 12, 4, 8, 12, 8, 4, 4, 4, 4,
8, 8, 8, 12, 12, 12, 8, 0, 12, 12, 0, 12, 12, 8, 0, 4, 4, 12,
8, 8, 12, 8, 0, 12, 0, 0, 4, 0, 0, 4, 4, 12, 0, 4, 8, 8, 8, 4,
0, 0, 4, 0, 12, 4, 12, 12, 8, 0, 0, 0, 4, 8, 8, 0, 4, 0, 12,
4, 12, 0, 4, 12, 8, 0, 4, 0, 0, 12, 12, 8), SUBJC = c(0L, 1L,
0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 1L,
1L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L,
1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 0L,
0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L,
0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L,
1L, 0L)), .Names = c("PASP", "SUBJC"), row.names = c(NA, -100L
), class = "data.frame")
-----

and then ideally you would tell us the results of a sample of the 
calculation you expect to see, though in this case you might not have 
thought to present them organized as below:

-----
result <- read.table( text =
" PASP SUBJC  Fraction
     0     0      0.279
     4     0      0.186
     8     0      0.395
    12     0      0.140
     0     1      0.263
     4     1      0.211
     8     1      0.123
    12     1      0.404
", header=TRUE)
-----
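The dput() step described above can be tried out on a small scale. The following is only a sketch: "mydta" here is a made-up stand-in for your own data frame, and capture.output() and eval(parse()) are used just to imitate, inside one script, the copy-and-paste you would normally do by hand.

```r
# Hypothetical stand-in for your own data frame (not the real data)
mydta <- data.frame(PASP  = c(0, 4, 0, 8, 4, 0, 0),
                    SUBJC = c(0L, 1L, 0L, 0L, 0L, 1L, 1L))

# dput() prints R code that rebuilds the object; capture that text
txt <- paste(capture.output(dput(head(mydta, 5))), collapse = "\n")

# Prefix the text with an assignment, as you would when pasting into an email
code <- paste("dta <-", txt)
cat(code, "\n")

# Running that code recreates the first five rows under the name "dta"
eval(parse(text = code))
stopifnot(isTRUE(all.equal(dta, head(mydta, 5))))
```

If the last line runs without error, the pasted code reproduces the sample, which is what a reader of your email needs.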

And with your existing text, we might come up with something like:

-----
library(ggplot2)

dta <- structure(list(PASP = c(0, 12, 8, 0, 12, 12, 12, 8, 12, 8, 8,
8, 8, 4, 0, 12, 12, 0, 12, 0, 0, 12, 4, 8, 12, 8, 4, 4, 4, 4,
8, 8, 8, 12, 12, 12, 8, 0, 12, 12, 0, 12, 12, 8, 0, 4, 4, 12,
8, 8, 12, 8, 0, 12, 0, 0, 4, 0, 0, 4, 4, 12, 0, 4, 8, 8, 8, 4,
0, 0, 4, 0, 12, 4, 12, 12, 8, 0, 0, 0, 4, 8, 8, 0, 4, 0, 12,
4, 12, 0, 4, 12, 8, 0, 4, 0, 0, 12, 12, 8), SUBJC = c(0L, 1L,
0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 1L,
1L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L,
1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 0L,
0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L,
0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L,
1L, 0L)), .Names = c("PASP", "SUBJC"), row.names = c(NA, -100L
), class = "data.frame")
table(dta)
#>     SUBJC
#> PASP  0  1
#>   0  12 15
#>   4   8 12
#>   8  17  7
#>   12  6 23

dtasum <- aggregate( list( Count = rep(1,100) )
                    , dta
                    , FUN = sum
                    )

dtasum$Fraction <- ave( dtasum$Count
                       , dtasum$SUBJC
                       , FUN = function(x) ( x/sum(x) )
                       )
dtasum$PASPfactor <- factor( dtasum$PASP )
dtasum$SUBJCfactor <- factor( dtasum$SUBJC )
dtasum
#>   PASP SUBJC Count  Fraction PASPfactor SUBJCfactor
#> 1    0     0    12 0.2790698          0           0
#> 2    4     0     8 0.1860465          4           0
#> 3    8     0    17 0.3953488          8           0
#> 4   12     0     6 0.1395349         12           0
#> 5    0     1    15 0.2631579          0           1
#> 6    4     1    12 0.2105263          4           1
#> 7    8     1     7 0.1228070          8           1
#> 8   12     1    23 0.4035088         12           1

ggplot( dtasum
       , aes( x=SUBJCfactor
            , y=Fraction
            , fill=PASPfactor
            )
       ) +
   geom_bar( stat = "identity" ) +
   xlab( "SUBJ" ) +
   scale_fill_discrete( name = "PASP" )

#' Created on 2018-07-18 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).
-----
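As a cross-check on the aggregate()/ave() approach above, the same within-group fractions can probably be obtained more directly from table() and prop.table(). This is a sketch on a small made-up data frame (not your data), and the last step assumes that "fraction of positive levels" means the share of rows with PASP > 0 in each SUBJC group.

```r
# Small made-up sample with the same two columns (not the real data)
dta <- data.frame(PASP  = c(0, 4, 0, 8, 4, 0, 0, 8, 4),
                  SUBJC = c(0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L))

# Counts of each PASP value (rows) within each SUBJC group (columns)
counts <- table(dta)

# margin = 2 divides each column by its own total, so each column sums to 1
fractions <- prop.table(counts, margin = 2)
fractions

# Share of rows with a positive PASP value in each SUBJC group:
# the mean of a logical vector is the fraction of TRUE values
pos <- tapply(dta$PASP > 0, dta$SUBJC, mean)
pos
```

The "fractions" table holds the same numbers as the Fraction column computed with ave(), just arranged as a matrix rather than one row per combination.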

Obviously, since I never saw the figure you thought I was going to see, 
the plot I made may not be the one you had in mind, but you should at 
least have some example code to compare with the "Introduction to R" 
document that comes with R, and some functions to look up help pages on, 
e.g.

?aggregate
?ave

and you can execute pieces of code to see what they create:

rep(1,100)
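To see what the aggregate() and ave() calls contribute, it may help to run them on something small enough to check by hand. The data frame "toy" below is invented for illustration only.

```r
# Tiny made-up example (not the real data)
toy <- data.frame(PASP  = c(0, 0, 4, 0, 4, 4),
                  SUBJC = c(0L, 0L, 0L, 1L, 1L, 1L))

# A column of ones, one per row; summing it within groups counts the rows
ones <- rep(1, nrow(toy))

# aggregate() sums the ones for each PASP/SUBJC combination found in toy
toysum <- aggregate(list(Count = ones), toy, FUN = sum)

# ave() applies the function within each SUBJC group and returns a vector
# as long as its input, so each count is divided by its own group's total
toysum$Fraction <- ave(toysum$Count, toysum$SUBJC,
                       FUN = function(x) x / sum(x))
toysum
```

Within each SUBJC value the Fraction column should add up to 1, which is a quick sanity check on the grouping.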

You should read the Posting Guide carefully, as there are hints in it as to 
how to do much of this.

>
> Francesca
>
>
>
> ______________________________________________

> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil using dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k