[R] Newbie: Controlling legends in graphs

Kevin Zembower
Tue May 16 16:22:05 CEST 2023


Rui, thanks so much for your help. Your explanation and example were 
clear and concise. Thanks for taking the time and effort to help me.

-Kevin

On 5/12/23 16:06, Rui Barradas wrote:
> At 14:24 on 12/05/2023, Kevin Zembower via R-help wrote:
>> Hello, I'm trying to create a line graph with a legend, but have no
>> success controlling the legend. Since nothing I've tried seems to work,
>> I must be doing something systematically wrong. Can anyone point this
>> out to me?
>>
>> Here's my data:
>>   > weights
>> # A tibble: 1,246 × 3
>>      Date           J     K
>>      <date>     <dbl> <dbl>
>>    1 2000-02-13   133  188
>>    2 2000-02-20   134  185
>>    3 2000-02-27   135  187
>>    4 2000-03-05   135  185
>>    5 2000-03-12    NA  184
>>    6 2000-03-19    NA  184.
>>    7 2000-03-26   136  184.
>>    8 2000-04-02   134  185
>>    9 2000-04-09   133  186
>> 10 2000-04-16    NA  186
>> # ℹ 1,236 more rows
>> # ℹ Use `print(n = ...)` to see more rows
>>   >
>>
>> Here are my attempts. You can see some of the things I've tried in the
>> commented out sections:
>> weights %>%
>>       group_by(year(Date)) %>%
>>       summarize(
>>           m_K = mean(K, na.rm = TRUE),
>>           m_J = mean(J, na.rm = TRUE),
>>           ) %>%
>>       ggplot(aes(x = `year(Date)`)) +
>>       geom_point(aes(y = m_K, color = "red")) +
>>       geom_smooth(aes(y = m_K, color = "red")) +
>>       geom_point(aes(y = m_J, color = "blue")) +
>>       geom_smooth(aes(y = m_J, color = "blue")) +
>>       guides(size = "legend",
>>              shape = "legend")
>>       ## scale_shape_discrete(name="Person",
>>       ##                      breaks=c("m_K", "m_J"),
>>       ##                      labels=c("K", "J"))
>>       ## theme(legend.title=element_blank())
>>
>> When this runs, the blue line for "K" is above the red line for "J", as
>> I expect, but in the legend, the red is shown first, and labeled "blue."
>>
>> I'd like to be able to create a legend where the first entry shows a
>> blue line and is labeled "K" and the second is red and labeled "J".
>>
>> On a different but related topic, I'd welcome any advice or suggestions
>> on my methodology in this example. Is this the correct way to summarize
>> with a mean? Do I need the two sets of geom_point and geom_smooth clauses
>> to create this graph, or is there a better way?
>>
>> Thanks for all your advice and guidance.
>>
>> -Kevin
>>
>>
> Hello,
> 
> This is mainly a data reshaping problem. Instead of plotting the two 
> variables, J and K, separately, put the data in long format, map the 
> column holding the variable names to the color aesthetic, and call each 
> geom_* only once. Then, assign the colors you want.
> 
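For context on the legend in the original attempt: a quoted string inside aes() is treated as a data value, not as a colour. ggplot2 builds a discrete scale from those values, orders them alphabetically ("blue", then "red"), and assigns its default palette in that order, so the first, reddish key ends up labelled "blue". A minimal sketch of the difference between mapping and setting a colour, using made-up numbers (the data frame `d` and its columns `k` and `j` are hypothetical):

```r
library(ggplot2)

# Hypothetical toy data, only to illustrate the behaviour
d <- data.frame(x = 1:3, k = c(188, 185, 187), j = c(133, 134, 135))

# Inside aes(): "red" and "blue" are data values mapped through a
# discrete colour scale, so the default palette is used and the legend
# keys are labelled with the strings themselves.
p_mapped <- ggplot(d, aes(x)) +
  geom_point(aes(y = k, color = "red")) +
  geom_point(aes(y = j, color = "blue"))

# Outside aes(): the strings are literal colours and no legend is drawn.
p_set <- ggplot(d, aes(x)) +
  geom_point(aes(y = k), color = "blue") +
  geom_point(aes(y = j), color = "red")

# Colours actually used to draw the first layer of each plot
mapped_colour <- unique(layer_data(p_mapped, 1)$colour)
set_colour <- unique(layer_data(p_set, 1)$colour)
```

Setting colours outside aes() gives the right colours but no legend, which is why mapping a column of names and then choosing colours with a manual scale is usually the more convenient route.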
> As for placing K above J in the legend, note that ggplot orders the 
> entries alphabetically unless you coerce the column to a factor with the 
> levels in the order you want.
> 
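The ordering rule can be checked in base R without drawing anything. This is only a sketch; the vector `name` is a hypothetical stand-in for the column of variable names in the long data:

```r
# Hypothetical stand-in for the column of variable names
name <- c("J", "K", "J", "K")

# Default: factor() sorts the levels alphabetically, so J comes first
default_order <- levels(factor(name))

# Explicit levels: K comes first; a discrete ggplot2 scale lists its
# legend keys in level order
custom_order <- levels(factor(name, levels = c("K", "J")))

print(default_order)
print(custom_order)
```

Changing the level order changes only the order of the legend keys; which colour each name gets is still decided by the colour scale.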
> Also, if you want to compute aggregate statistics for several columns, 
> use ?across. See the code below.
> 
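As a small illustration of across() on its own, here is a sketch with a hypothetical four-row data frame `d`. In dplyr 1.1.0 and later, passing extra arguments such as na.rm through across() is deprecated, so this sketch wraps mean() in an anonymous function (the `\(x)` shorthand needs R 4.1 or later):

```r
library(dplyr)

# Hypothetical toy data: two observations per year, one missing J value
d <- data.frame(
  Year = c(2000, 2000, 2001, 2001),
  J = c(133, NA, 135, 137),
  K = c(188, 186, 185, 183)
)

# One mean per year for every column from J to K, ignoring NAs
yearly <- d %>%
  group_by(Year) %>%
  summarize(across(J:K, \(x) mean(x, na.rm = TRUE)))

print(yearly)
```

The same call scales to any number of columns, since J:K is an ordinary tidyselect range.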
> Here is a complete example. I have augmented your data set in order to 
> have more years to plot.
> 
> 
> 
> # augment the data set
> library(lubridate)  # provides years(); must be loaded before the lapply() below
> weights <- " Date           J     K
>    1 2000-02-13   133  188
>    2 2000-02-20   134  185
>    3 2000-02-27   135  187
>    4 2000-03-05   135  185
>    5 2000-03-12    NA  184
>    6 2000-03-19    NA  184.
>    7 2000-03-26   136  184.
>    8 2000-04-02   134  185
>    9 2000-04-09   133  186
> 10 2000-04-16    NA  186"
> weights <- read.table(text = weights, header = TRUE)
> weights$Date <- as.Date(weights$Date)
> tmp <- weights
> tmp <- lapply(1:10, \(y) {
>    tmp$Date <- years(y) + tmp$Date
>    tmp$J <- tmp$J + sample(-10:10, nrow(weights), TRUE)
>    tmp$K <- tmp$K + sample(-10:10, nrow(weights), TRUE)
>    tmp
> })
> weights <- do.call(rbind, tmp)
> 
> #---
> 
> # plot code
> library(ggplot2)
> library(dplyr)
> library(tidyr)
> library(lubridate)
> 
> weights %>%
>      mutate(Year = year(Date)) %>%
>      group_by(Year) %>%
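>      # anonymous function, since passing na.rm through across() is
>      # deprecated in dplyr >= 1.1.0
>      summarize(across(J:K, \(x) mean(x, na.rm = TRUE))) %>%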
>      # now reshape the data
>      pivot_longer(-Year) %>%
>      # uncomment the next line if you want K
>      # to show up on top in the legend
>      # mutate(name = factor(name, levels = c("K", "J"))) %>%
>      ggplot(aes(Year, value, color = name)) +
>      geom_smooth(
>          formula = y ~ x,
>          method = lm,
>          se = FALSE
>      ) +
>      geom_point() +
>      scale_color_manual(values = c(J = "red", K = "blue"))
> 
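As an alternative to recoding the column as a factor, the legend order and title can also be set on the scale itself. The sketch below assumes long-format data like the output of pivot_longer() in the plot above; the data frame `d` and its values are made up, and the title "Person" is taken from the commented-out attempt in the original question:

```r
library(ggplot2)

# Hypothetical yearly means in long format
d <- data.frame(
  Year = rep(2001:2004, times = 2),
  name = rep(c("J", "K"), each = 4),
  value = c(134, 135, 133, 136, 186, 185, 187, 184)
)

p <- ggplot(d, aes(Year, value, color = name)) +
  geom_point() +
  scale_color_manual(
    name = "Person",                    # legend title
    values = c(J = "red", K = "blue"),  # colour for each data value
    breaks = c("K", "J")                # legend order: K first
  )

print(p)
```

Because `values` is a named vector, the colours stay attached to J and K regardless of the order given in `breaks`.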
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 



