How to remove NA in R
I want to separate the hashtags in one column into different columns. After using the separate function, I get a lot of NAs when I make a ggplot. How can I remove the NAs from my ggplot? My code looks like this:
df %>%
  separate(terms, into = paste0("t", 1:5), sep = ";") %>%
  pivot_longer(-year) %>%
  group_by(year, value) %>%
  count(value) %>%
  ggplot(aes(x = factor(year), y = n, fill = value, label = NA)) +
  geom_col(position = position_dodge()) +
  geom_text(position = position_dodge(1))
My data looks like this:
terms year
1 #A;#B;#C;#D;E 2017
2 #B;#C;#D 2016
3 #C;#D;#E#G 2021
4 #D;#E;#F 2020
... ...
Try tidyr::separate_rows instead:
library(tidyverse)
df %>%
  separate_rows(terms, sep = ";") %>%
  group_by(year, terms) %>%
  count(terms) %>%
  ggplot(aes(x = factor(year), y = n, fill = terms, label = NA)) +
  geom_col(position = position_dodge()) +
  geom_text(aes(label = terms), position = position_dodge(1))
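For context, separate pads every row that has fewer than five terms with NA, and those padded cells are what show up as NA in the plot. If you would rather keep the separate / pivot_longer pipeline from the question, one option is to drop the padded cells while reshaping. This is only a minimal sketch: the small data frame below is a made-up stand-in for the real data, and it assumes a tidyr version that provides pivot_longer with the values_drop_na argument.

```r
library(dplyr)
library(tidyr)

# Made-up stand-in for the question's data (values are for illustration only)
df <- data.frame(
  terms = c("#A;#B;#C;#D;#E", "#B;#C;#D", "#D;#E;#F"),
  year  = c(2017L, 2016L, 2020L)
)

counts <- df %>%
  # fill = "right" pads short rows with NA on the right without a warning
  separate(terms, into = paste0("t", 1:5), sep = ";", fill = "right") %>%
  # values_drop_na = TRUE discards the padded NA cells during the reshape
  pivot_longer(-year, values_drop_na = TRUE) %>%
  count(year, value)

# No NA terms are left to plot
stopifnot(!anyNA(counts$value))
```

The resulting counts can be passed to ggplot exactly as in the question, with fill = value.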
You might also want to include tidyr::complete:
df %>%
  separate_rows(terms, sep = ";") %>%
  group_by(year, terms) %>%
  count(terms) %>%
  ungroup() %>%
  complete(year, terms, fill = list(n = 0)) %>%
  ggplot(aes(x = factor(year), y = n, fill = terms, label = NA)) +
  geom_col(position = position_dodge(preserve = "single")) +
  scale_fill_discrete(drop = FALSE) +
  scale_x_discrete(drop = FALSE) +
  geom_text(aes(label = n), size = 3, position = position_dodge(width = 1))
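To see what complete contributes here, it may help to look at it on a tiny table. The sketch below uses a hypothetical counts table (not the question's data) in which one year/term combination is missing; complete adds that combination and fills its n with 0, so every year ends up with the same set of bars.

```r
library(tidyr)

# Hypothetical counts: 2017 has no "#B" row
counts <- data.frame(
  year  = c(2016L, 2016L, 2017L),
  terms = c("#A", "#B", "#A"),
  n     = c(1L, 2L, 3L)
)

# Add every missing year/terms combination, with n set to 0
filled <- complete(counts, year, terms, fill = list(n = 0))

# The 2017 / "#B" row now exists with a zero count
stopifnot(nrow(filled) == 4)
```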
Or with only the top 3 terms labeled:
df %>%
  separate_rows(terms, sep = ";") %>%
  group_by(year, terms) %>%
  count(terms) %>%
  ungroup() %>%
  complete(year, terms, fill = list(n = 0)) -> new_df
ggplot(new_df, aes(x = factor(year), y = n, fill = terms, label = NA)) +
  geom_col(position = position_dodge(preserve = "single")) +
  scale_fill_discrete(drop = FALSE) +
  scale_x_discrete(drop = FALSE) +
  geom_text(data = new_df %>%
              group_by(year) %>%
              mutate(n = case_when(rank(-n, ties.method = "random") <= 3 ~ n,
                                   TRUE ~ NA_real_)),
            aes(label = terms), size = 3, position = position_dodge(width = 1))
Sample data:
df <- structure(list(terms = c("#A;#B;#C;#D;#E", "#C;#D;#E", "#B;#C;#D",
"#A", "#C;#D;#E;#G", "#D;#E;#F", "#D"), year = c(2017L, 2017L,
2016L, 2016L, 2021L, 2020L, 2020L)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7"))