
How to plot each level of a factor

I am trying to make a plot where each level of a factor gets its own series. While I am a long-time user of R, I am not up to date with some of the latest improvements. For example, I have not yet learned ggplot, which figures in some related questions, and I cannot yet translate what I want to do into ggplot. Here is a simple example:

#library(tidyverse) # uncomment if not loaded

in_data <- read_csv("http://www.nfgarland.ca/National_Custom_Data.csv")
in_data <- in_data %>% 
  mutate(Tot = `NUM INFLUENZA DEATHS` + `NUM PNEUMONIA DEATHS`) %>% 
  arrange(SEASON) %>%
  mutate(SEASON = factor(SEASON,ordered=TRUE)) 

filter(in_data,SEASON == "2015-16")$Tot %>% plot((1:length(.)),
                                             ., 
                                             type = "l",
                                             col = "red",
                                             xlab ="Flu Season Week",
                                             ylab = "Deaths",
                                             ylim = c(2000,7500))
filter(in_data,SEASON == "2016-17")$Tot %>% lines((1:length(.)),., col="orange")
filter(in_data,SEASON == "2017-18")$Tot %>% lines((1:length(.)),. ,col="blue")
filter(in_data,SEASON == "2018-19")$Tot %>% lines((1:length(.)),. ,col="green")
filter(in_data,SEASON == "2019-20")$Tot %>% lines((1:length(.)),., col="black")

As you can see, I have learned a number of tidyverse concepts, and this code works fine. But I assume there ought to be a way to do this automatically in the tidyverse without writing each lines() call separately, and I cannot identify it. I do know how to handle palettes, so the color changes are no problem. Note also that while there are 52 weeks of data for previous seasons, this file has only 24 weeks for the current flu season.

How about this?

library(ggplot2)
ggplot(in_data, aes(x=WEEK,y=Tot, color = SEASON)) + 
  geom_line() + 
  labs(x = "Flu Season Week", y = "Deaths") +
  ylim(2000,7500) + 
  scale_color_manual(values = c("red","goldenrod","blue","orange","green"))


Edit: Addressing the OP's comment about wanting to break the 2019-20 line, we can use a quick pivot (wide, then long again) so that the missing weeks are filled in with NA, which geom_line() draws as a gap.

in_data %>% 
  dplyr::select(SEASON, Tot, WEEK) %>%
  # Widening creates NA cells for weeks a season does not have ...
  tidyr::pivot_wider(names_from = SEASON, values_from = Tot) %>%
  # ... and lengthening again keeps those NAs as explicit rows
  tidyr::pivot_longer(cols = -WEEK, names_to = "SEASON", values_to = "Tot") %>%
  ggplot(aes(x = WEEK, y = Tot, color = SEASON)) + 
  geom_line() + 
  labs(x = "Flu Season Week", y = "Deaths") +
  ylim(2000, 7500) + 
  scale_color_manual(values = c("red", "goldenrod", "blue", "orange", "green"))
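
To see why the round trip helps, here is a minimal sketch of the same idea in base R on made-up data. It is not the answer's method: it swaps in expand.grid() and merge() for the pivots, and the toy data frame and its values are assumptions for illustration only. The point is that every SEASON/WEEK combination ends up as a row, with NA where no observation exists.

```r
# Hypothetical toy data: the 2019-20 season is missing week 3.
toy <- data.frame(
  SEASON = c("2018-19", "2018-19", "2018-19", "2019-20", "2019-20"),
  WEEK   = c(1, 2, 3, 1, 2),
  Tot    = c(3000, 3200, 3100, 2900, 3300)
)

# Build every SEASON x WEEK combination that could exist.
grid <- expand.grid(
  SEASON = unique(toy$SEASON),
  WEEK   = sort(unique(toy$WEEK)),
  stringsAsFactors = FALSE
)

# Left-join the observations onto the grid; unmatched rows get Tot = NA.
filled <- merge(grid, toy, by = c("SEASON", "WEEK"), all.x = TRUE)
filled <- filled[order(filled$SEASON, filled$WEEK), ]

print(filled)
```

Passing a data frame like filled to geom_line() (or to lines() in base R) leaves a gap at the NA rows instead of joining the points on either side.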


In base R you need a for loop and, unlike with ggplot2, you have to specify the legend yourself. Here is a suggestion in base R (the good old days):

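library(readr)
library(dplyr)

# One color per season, named by factor level so it can be looked up in the loop
COLS <- c("red", "goldenrod", "blue", "orange", "green")
names(COLS) <- levels(in_data$SEASON)

# Empty canvas sized to the full data range
plot(NULL, xlim = range(in_data$WEEK), ylim = range(in_data$Tot),
     xlab = "time", ylab = "Tot")

# Draw one line per season, indexed by position within the season
for (nu in levels(in_data$SEASON)) {
  lines(seq_len(sum(in_data$SEASON == nu)),
        in_data$Tot[in_data$SEASON == nu],
        col = COLS[nu])
}

legend("topright", legend = names(COLS), fill = COLS)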


If you need to plot against the actual week numbers, since, as you mentioned in the comment, a season runs from week 40 or so into the next year, it will take a bit more code (and maybe pain).
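
One way that extra code might look is sketched below. It assumes that WEEK holds calendar week numbers, that every season starts at calendar week 40, and that each year has 52 weeks (a 53-week year would need special handling). The toy data frame, the season_start value, and the SEASON_WEEK column are hypothetical names introduced only for this illustration.

```r
# Hypothetical toy data: calendar weeks wrap from 52 back to 1 mid-season.
toy <- data.frame(
  SEASON = rep(c("2017-18", "2018-19"), each = 6),
  WEEK   = rep(c(40, 41, 52, 1, 2, 3), times = 2),
  Tot    = c(3000, 3100, 4000, 4200, 4100, 3900,
             2900, 3000, 3800, 3900, 3700, 3600)
)

# Assumption: a season starts at calendar week 40 and the year has 52 weeks.
season_start <- 40

# Map calendar weeks to a position within the season:
# week 40 becomes 1, week 52 becomes 13, week 1 becomes 14, and so on.
toy$SEASON_WEEK <- ifelse(toy$WEEK >= season_start,
                          toy$WEEK - season_start + 1,
                          toy$WEEK + 52 - season_start + 1)

# Plot one line per season against the season-week index.
plot(NULL, xlim = range(toy$SEASON_WEEK), ylim = range(toy$Tot),
     xlab = "Flu Season Week", ylab = "Deaths")
for (s in unique(toy$SEASON)) {
  one <- toy[toy$SEASON == s, ]
  one <- one[order(one$SEASON_WEEK), ]
  lines(one$SEASON_WEEK, one$Tot)
}
```

Because the x position comes from the week number rather than the row position, a season with missing weeks still lines up with the others.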
