Mutate to create two number columns from one number column based on a character value
My data "x" looks like this:
date type cost
20-01 supp 5
20-02 supp 10
20-03 supp 5
20-01 svcs 2
20-02 svcs 4
20-03 svcs 8
I want a separate cost column for each type so that I can plot multiple time series. I could do this by building two separate time series, but what I'd like to do is create:
bydate <- aggregate(cbind(supp, svcs)~date, data=y, FUN=sum)
With my data "y" looking like this:
date type supp svcs
20-01 supp 5 0
20-02 supp 10 0
20-03 supp 5 0
20-01 svcs 0 2
20-02 svcs 0 4
20-03 svcs 0 8
In this way I hope to create:
df <- bydate %>%
  select(date, supp, svcs) %>%
  gather(key = "variable", value = "value", -date)
Is the mutate function the way to do this?
We have to create index variables before pivoting, and then call pivot_wider with the values_fill argument set to 0.
library(tidyr)
library(dplyr)
df %>%
  mutate(index = row_number(),
         type2 = type) %>%
  pivot_wider(names_from = type2, values_from = cost, values_fill = 0) %>%
  select(-index)
# A tibble: 6 × 4
date type supp svcs
<chr> <chr> <dbl> <dbl>
1 20-01 supp 5 0
2 20-02 supp 10 0
3 20-03 supp 5 0
4 20-01 svcs 0 2
5 20-02 svcs 0 4
6 20-03 svcs 0 8
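If the per-row table is only an intermediate step towards the aggregated "bydate" table from the question, the index column can be left out, so that rows sharing a date collapse into a single row. The following is a minimal sketch under that assumption, not part of the answer above; it assumes the question's data is a data frame named x, and it uses pivot_longer in place of gather, which tidyr documents as superseded.

```r
library(dplyr)
library(tidyr)

# Data from the question, assumed here to be a data frame named x
x <- data.frame(
  date = rep(c("20-01", "20-02", "20-03"), 2),
  type = rep(c("supp", "svcs"), each = 3),
  cost = c(5, 10, 5, 2, 4, 8)
)

# Without an index column, rows sharing a date collapse into one row,
# giving one cost column per type
bydate <- x %>%
  pivot_wider(names_from = type, values_from = cost, values_fill = 0)

# Back to long format for plotting
long <- bydate %>%
  pivot_longer(-date, names_to = "variable", values_to = "value")
```

Here bydate has one row per date with the columns date, supp and svcs, and long has one row per date and type, which is the shape usually wanted for plotting several series.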
Here is a version using bind_cols:
library(dplyr)
library(tidyr)
x %>%
  dplyr::mutate(id = row_number()) %>%
  pivot_wider(
    names_from = type,
    values_from = cost,
    values_fill = 0
  ) %>%
  bind_cols(type = x$type) %>%
  select(date, type, everything(), -id)
date type supp svcs
<chr> <chr> <int> <int>
1 20-01 supp 5 0
2 20-02 supp 10 0
3 20-03 supp 5 0
4 20-01 svcs 0 2
5 20-02 svcs 0 4
6 20-03 svcs 0 8
For this kind of problem we do not necessarily need data rectangling. An alternative is to use purrr::map_dfc inside dplyr::mutate together with purrr::set_names:
library(dplyr)
library(purrr)
df %>%
  mutate(map_dfc(set_names(unique(type)),
                 ~ ifelse(.x == type, cost, 0))
  )
#> date type cost supp scvs
#> 1 20-01 supp 5 5 0
#> 2 20-02 supp 10 10 0
#> 3 20-03 supp 5 5 0
#> 4 20-01 scvs 2 0 2
#> 5 20-02 scvs 4 0 4
#> 6 20-03 scvs 8 0 8
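To see what the map_dfc call does for each distinct type, the same columns can be written out by hand with one ifelse per type. This is only an illustrative sketch: it hard-codes the two type values (spelled "supp" and "scvs", as in the df used in this answer), whereas the map_dfc version picks them up from the data.

```r
library(dplyr)

# Same data as the df used in this answer (note the "scvs" spelling)
df <- data.frame(
  date = rep(c("20-01", "20-02", "20-03"), 2),
  type = rep(c("supp", "scvs"), each = 3),
  cost = c(5, 10, 5, 2, 4, 8)
)

# One new column per type: the cost where the type matches, 0 otherwise
out <- df %>%
  mutate(supp = ifelse(type == "supp", cost, 0),
         scvs = ifelse(type == "scvs", cost, 0))
```

The hand-written form is fine for two types; the map_dfc form avoids repeating the ifelse when there are many types or when they are not known in advance.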
To simplify this and similar problems I have a package on GitHub. In this case we could use over together with dist_values:
library(dplyr)
library(dplyover) # https://github.com/TimTeaFan/dplyover
df %>%
  mutate(over(dist_values(type),
              ~ ifelse(.x == type, cost, 0))
  )
#> date type cost scvs supp
#> 1 20-01 supp 5 0 5
#> 2 20-02 supp 10 0 10
#> 3 20-03 supp 5 0 5
#> 4 20-01 scvs 2 2 0
#> 5 20-02 scvs 4 4 0
#> 6 20-03 scvs 8 8 0
Created on 2021-12-30 by the reprex package (v0.3.0)
Data:
df <- data.frame(
  date = rep(c("20-01", "20-02", "20-03"), 2),
  type = rep(c("supp", "scvs"), each = 3),
  cost = c(5, 10, 5, 2, 4, 8)
)