dplyr group operation: add NA

Question

Here is my data:

places <- c("London", "London", "London", "Paris", "Paris", "Rennes")
years <- c(2019, 2019, 2020, 2019, 2019, 2020)

dataset <- data.frame(years, places)

The result:

    years   places
1   2019    London
2   2019    London
3   2020    London
4   2019    Paris
5   2019    Paris
6   2020    Rennes

I am counting by place and year:

library(dplyr)

dataset2 <- dataset %>%
  count(places, years)

    places  years   n
1   London  2019    2
2   London  2020    1
3   Paris   2019    2
4   Rennes  2020    1

I want my table to show both years for each city, even when there are no rows for that combination:

    places  years   n
1   London  2019    2
2   London  2020    1
3   Paris   2019    2
4   Paris   2020    NA  # or better 0
5   Rennes  2019    NA  # or better 0
6   Rennes  2020    1

Answer 1

You could use complete() from tidyr to fill in the missing combinations:

library(dplyr)
library(tidyr)

dataset %>% count(places, years) %>% complete(places, years, fill = list(n = 0))
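
As a minimal sketch of what the fill argument changes (assuming dplyr and tidyr are installed; the dataset is rebuilt from the question so the snippet runs on its own, and the names counts, with_na and with_zero are made up for illustration): without fill, complete() leaves the count of a missing combination as NA, and with fill = list(n = 0) it becomes 0.

```r
library(dplyr)
library(tidyr)

# Rebuild the question's data so the snippet is self-contained
places <- c("London", "London", "London", "Paris", "Paris", "Rennes")
years <- c(2019, 2019, 2020, 2019, 2019, 2020)
dataset <- data.frame(years, places)

counts <- dataset %>% count(places, years)

# Default: combinations absent from the data get n = NA
with_na <- counts %>% complete(places, years)

# With fill: the same combinations get n = 0 instead
with_zero <- counts %>% complete(places, years, fill = list(n = 0))
```

Both results have one row per place and year; only the value used for the two combinations that never occur (Paris 2020 and Rennes 2019) differs.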

If you convert years to a factor, you can specify .drop = FALSE in count():

dataset %>% mutate(years = factor(years)) %>% count(places, years, .drop = FALSE)

#  places years     n
#  <fct>  <fct> <int>
#1 London 2019      2
#2 London 2020      1
#3 Paris  2019      2
#4 Paris  2020      0
#5 Rennes 2019      0
#6 Rennes 2020      1
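
A variation on the factor approach, offered as a sketch rather than as part of the original answer: the levels can be given explicitly, so the table covers a fixed set of years regardless of which ones happen to appear in the data. The all_years vector and the zero_filled name are assumptions for illustration; places is also converted to a factor here so the result does not depend on how the data frame was built.

```r
library(dplyr)

# Rebuild the question's data so the snippet is self-contained
places <- c("London", "London", "London", "Paris", "Paris", "Rennes")
years <- c(2019, 2019, 2020, 2019, 2019, 2020)
dataset <- data.frame(years, places)

# Hypothetical: the full set of years the table should cover
all_years <- c(2019, 2020)

zero_filled <- dataset %>%
  mutate(
    places = factor(places),
    years = factor(years, levels = all_years)
  ) %>%
  # .drop = FALSE keeps factor-level combinations with no rows (n = 0)
  count(places, years, .drop = FALSE)
```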

Answer 2

We can use CJ (cross join) from data.table; combinations that do not occur in the data get NA in N:

library(data.table)
setDT(dataset)[, .N, .(years, places)][CJ(years, places, unique = TRUE), on = .(years, places)]
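
Since the question prefers 0 over NA, the join can be followed by an in-place replacement. The sketch below assumes data.table is installed, splits the one-liner into steps for readability, and uses illustrative names (dt, counts, all_combos, result) that are not part of the original answer.

```r
library(data.table)

# Rebuild the question's data so the snippet is self-contained
places <- c("London", "London", "London", "Paris", "Paris", "Rennes")
years <- c(2019, 2019, 2020, 2019, 2019, 2020)
dt <- data.table(years = years, places = places)

# Count rows per (years, places)
counts <- dt[, .N, by = .(years, places)]

# Every combination of the unique years and places
all_combos <- CJ(years = dt$years, places = dt$places, unique = TRUE)

# Join the counts onto the full grid; unmatched combinations get N = NA
result <- counts[all_combos, on = .(years, places)]

# Replace the NA counts with 0, modifying result by reference
result[is.na(N), N := 0L]
```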

2 answers

Answer 1: score 2, accepted, 2020-04-19 14:10:16

Answer 2: score 0, 2020-04-19 17:17:14
