Adding missing observations to a table and assigning them a given value (join)
Good afternoon,
I'm analyzing the distribution of observations by month, for example:
Date      Observations
2010-01   10
2010-03   15
2010-05   16
Question: How do I insert the missing dates (2010-02 and 2010-04) into the table (using another table that holds all the monthly dates) and assign 0 as their observations?
Thanks in advance.
We convert 'Date' to Date class, then use complete to expand the dataset with the sequence from the min/max (or first, last) 'Date' by '1 month', while filling the missing 'Observations' with 0.
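library(tidyr)
library(dplyr)
# as.Date() needs a full date; for "2010-01" strings use as.Date(paste0(Date, "-01"))
# first()/last() assume the rows are already sorted by Date; otherwise use min()/max()
df1 %>%
  mutate(Date = as.Date(Date)) %>%
  complete(Date = seq(first(Date), last(Date), by = '1 month'),
           fill = list(Observations = 0))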
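To try the complete approach on the data from the question, here is a minimal sketch. It assumes 'Date' is stored as "YYYY-MM" text, so a day is appended before converting; the name filled is only illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data from the question; Date is assumed to be "YYYY-MM" text
df1 <- data.frame(
  Date = c("2010-01", "2010-03", "2010-05"),
  Observations = c(10, 15, 16)
)

filled <- df1 %>%
  # append "-01" so that as.Date() receives a full date
  mutate(Date = as.Date(paste0(Date, "-01"))) %>%
  # add one row per month between the earliest and latest Date;
  # rows created this way get Observations = 0
  complete(Date = seq(min(Date), max(Date), by = "1 month"),
           fill = list(Observations = 0))

print(filled)
```

Using min/max instead of first/last means the result does not depend on the row order of df1.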
If there is another dataset with the complete set of 'Date' values, the obvious option is a left_join, followed by replacing the NA elements in 'Observations' with 0, because rows without a match are filled with NA by default.
left_join(df2, df1, by = 'Date') %>%
  mutate(Observations = replace_na(Observations, 0))
NOTE: df2 is the dataset with the complete set of 'Date' values.
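The question does not show what df2 looks like, so the following is only a sketch with a hypothetical calendar table built with seq. It assumes both tables store 'Date' as "YYYY-MM" text; the join only matches when the key columns have the same type and format.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
df1 <- data.frame(
  Date = c("2010-01", "2010-03", "2010-05"),
  Observations = c(10, 15, 16)
)

# Hypothetical calendar table: one row per month, formatted like df1$Date
df2 <- data.frame(
  Date = format(seq(as.Date("2010-01-01"), as.Date("2010-05-01"), by = "month"),
                "%Y-%m")
)

joined <- left_join(df2, df1, by = "Date") %>%
  # months missing from df1 come back as NA; turn them into 0
  mutate(Observations = replace_na(Observations, 0))

print(joined)
```

Because df2 is on the left side of the join, every month in the calendar is kept, in the calendar's order.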
If 'df2' has other columns as well, we can select only 'Date' before joining, so that those columns are not carried into the result.
left_join(df2 %>%
            select(Date), df1, by = 'Date') %>%
  mutate(Observations = replace_na(Observations, 0))
In base R, we can use merge:
transform(merge(df2, df1, by = 'Date', all.x = TRUE),
          Observations = replace(Observations, is.na(Observations), 0))
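For a version that runs without any packages, the merge approach can be sketched on the question's data as follows. The calendar table df2 is again hypothetical, and 'Date' is assumed to be "YYYY-MM" text in both tables.

```r
# Sample data from the question
df1 <- data.frame(
  Date = c("2010-01", "2010-03", "2010-05"),
  Observations = c(10, 15, 16)
)

# Hypothetical calendar table with every month in the range
df2 <- data.frame(
  Date = c("2010-01", "2010-02", "2010-03", "2010-04", "2010-05")
)

# all.x = TRUE keeps every row of df2, leaving NA where df1 has no match
merged <- merge(df2, df1, by = "Date", all.x = TRUE)

# replace the NA values produced by the unmatched months with 0
result <- transform(merged,
                    Observations = replace(Observations, is.na(Observations), 0))

print(result)
```

Note that merge sorts the result by the key column by default, which here gives chronological order because "YYYY-MM" strings sort the same way as the dates they represent.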