
How to remove NA in R

I want to separate the hashtags in one column into different columns. After I use the "separate" function, I get a lot of NAs when I make a ggplot. How can I remove the NAs in my ggplot? My code is like this:

df %>% 
  separate(terms, into = paste0("t", 1:5), sep = ";") %>% 
  pivot_longer(-year) %>% 
  group_by(year, value) %>% 
  count(value) %>% 
  ggplot(aes(x = factor(year), y = n, fill = value, label = NA)) +
  geom_col(position = position_dodge()) +
  geom_text(position = position_dodge(1))

My data looks like this:

    terms           year
1   #A;#B;#C;#D;#E  2017
2   #B;#C;#D        2016
3   #C;#D;#E;#G     2021
4   #D;#E;#F        2020

... ...

Try tidyr::separate_rows instead. It creates one row per term, so rows with fewer than five terms are not padded with NA:

library(tidyverse)

df %>%
  separate_rows(terms, sep = ";") %>%  # one row per hashtag, no NA padding
  count(year, terms) %>%               # occurrences of each term per year
  ggplot(aes(x = factor(year), y = n, fill = terms)) +
  geom_col(position = position_dodge()) +
  geom_text(aes(label = terms), position = position_dodge(1))
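To see where the NAs come from, here is a minimal sketch on a hypothetical two-row data frame (`toy` and the three-column split are assumptions for illustration, not the asker's data). `separate()` pads short rows with NA to fill the fixed number of columns, while `separate_rows()` produces exactly one row per term. If you would rather keep `separate()`, `pivot_longer(values_drop_na = TRUE)` should drop the padding and give the same counts.

```r
library(tidyr)
library(dplyr)

# Hypothetical toy data, for illustration only
toy <- data.frame(terms = c("#A;#B;#C", "#B"), year = c(2017L, 2016L))

# separate() pads rows that have fewer than 3 terms with NA
wide <- toy %>%
  separate(terms, into = paste0("t", 1:3), sep = ";", fill = "right")

# Option 1: keep separate(), but drop the NA padding when reshaping
long_a <- wide %>%
  pivot_longer(-year, values_to = "terms", values_drop_na = TRUE) %>%
  count(year, terms)

# Option 2: separate_rows() never creates the padding in the first place
long_b <- toy %>%
  separate_rows(terms, sep = ";") %>%
  count(year, terms)
```

Both `long_a` and `long_b` contain one row per year/term pair and no NA values, so either can be piped into the ggplot call above.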


You might also want to include tidyr::complete, which adds the year/term combinations that never occur and gives them a count of 0:

df %>%
  separate_rows(terms, sep = ";") %>%
  count(year, terms) %>%
  complete(year, terms, fill = list(n = 0)) %>%  # add missing year/term pairs with n = 0
  ggplot(aes(x = factor(year), y = n, fill = terms)) +
  geom_col(position = position_dodge(preserve = "single")) +
  scale_fill_discrete(drop = FALSE) +
  scale_x_discrete(drop = FALSE) +
  geom_text(aes(label = n), size = 3, position = position_dodge(width = 1))
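As a rough illustration of what `complete()` contributes, the sketch below uses a hypothetical `counts` table (not the asker's data) in which "#B" never appears in 2017. The missing combination is added as a new row, and `fill = list(n = 0)` sets its count to 0 instead of NA.

```r
library(tidyr)

# Hypothetical counts: "#B" was never used in 2017
counts <- data.frame(
  year  = c(2016L, 2016L, 2017L),
  terms = c("#A", "#B", "#A"),
  n     = c(1, 2, 3)
)

# complete() adds the missing (2017, "#B") row and sets its count to 0
filled <- complete(counts, year, terms, fill = list(n = 0))
```

With every year/term pair present, each year gets the same number of bars, which keeps the dodged columns and labels aligned.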


Or with only the top 3 terms labeled:

# Count terms per year and fill in missing year/term pairs with n = 0
new_df <- df %>%
  separate_rows(terms, sep = ";") %>%
  count(year, terms) %>%
  complete(year, terms, fill = list(n = 0))

ggplot(new_df, aes(x = factor(year), y = n, fill = terms)) +
  geom_col(position = position_dodge(preserve = "single")) +
  scale_fill_discrete(drop = FALSE) +
  scale_x_discrete(drop = FALSE) +
  # Label only the 3 most frequent terms per year: all other rows get n = NA,
  # so ggplot drops their labels (with a warning about missing values)
  geom_text(data = new_df %>%
              group_by(year) %>%
              mutate(n = case_when(rank(-n, ties.method = "random") <= 3 ~ n,
                                   TRUE ~ NA_real_)),
            aes(label = terms), size = 3, position = position_dodge(width = 1))
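The masking step can be checked in isolation. The sketch below uses a hypothetical `toy_counts` table and swaps in `ties.method = "first"` (instead of "random") and `if_else()` (instead of `case_when()`) so the result is deterministic. These substitutions are assumptions made for the example, not part of the answer above.

```r
library(dplyr)

# Hypothetical completed counts for a single year
toy_counts <- data.frame(
  year  = 2017L,
  terms = c("#A", "#B", "#C", "#D", "#E"),
  n     = c(1, 5, 3, 0, 4)
)

# Keep n for the 3 largest counts per year; set it to NA everywhere else
label_df <- toy_counts %>%
  group_by(year) %>%
  mutate(n = if_else(rank(-n, ties.method = "first") <= 3, n, NA_real_)) %>%
  ungroup()
```

Here only "#B", "#C" and "#E" keep their counts. Because the other rows are kept with NA rather than filtered out, the label layer still has one row per term in each year.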


Sample data:

df <- structure(list(terms = c("#A;#B;#C;#D;#E", "#C;#D;#E", "#B;#C;#D", 
"#A", "#C;#D;#E;#G", "#D;#E;#F", "#D"), year = c(2017L, 2017L, 
2016L, 2016L, 2021L, 2020L, 2020L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7"))
