Mutate and case when issue - dplyr

I have the following data and I want to make a new column using mutate: when colour = 'g', take the level on the 'g' row minus the level on the 'r' row.

Then likewise with type: where type = 'type1', take the corresponding level minus the level on the 'type2' row.

library(dplyr)

d <- tibble(
  date = c("2018", "2018", "2018", "2019", "2019", "2019", "2020", "2020", "2020", "2020"),
  colour = c("none","g", "r", "none","g", "r", "none", "none", "none", "none"),
  type = c("type1", "none", "none", "type2", "none", "none", "none", "none", "none", "none"),
  level= c(78, 99, 45, 67, 87, 78, 89, 87, 67, 76))

To be clear, this is what I want the data to look like:

d2 <- tibble(
  date = c("2018", "2018", "2018", "2019", "2019", "2019", "2020", "2020", "2020", "2020"),
  colour = c("none","g", "r", "none","g", "r", "none", "none", "none", "none"),
  type = c("type1", "none", "none", "type2", "none", "none", "none", "none", "none", "none"),
  level= c(78, 99, 45, 67, 87, 78, 89, 87, 67, 76),
  # NA marks rows with no gap, which keeps both columns numeric
  # color_gap: 99 - 45 = 54 (2018), 87 - 78 = 9 (2019); type_gap: 78 - 67 = 11
  color_gap = c(NA, 54, NA, NA, 9, NA, NA, NA, NA, NA),
  type_gap = c(11, NA, NA, NA, NA, NA, NA, NA, NA, NA))

I started with mutate and case_when and got as far as the code below, but I'm stuck on the final calculation. How do I say that I want the colour 'g' level minus the colour 'r' level?

d %>% 
  mutate(color_gap = case_when(colour == "g" ~ level)) %>%
  mutate(type_gap = case_when(type == "type1" ~ level)) -> d2

Does anyone know how to complete this?

Thanks

This subtracts the first r level from the first g level, the second r level from the second g level, and so on. The same goes for type1 and type2. It has no checks at all: it doesn't check whether there is a matching r for each g, whether they are in the expected order, whether they are in the same date group, etc. It assumes the data is already formatted exactly as expected, so be careful when using this on real data.

d %>% 
  mutate(color_gap = replace(rep(NA, n()), colour == 'g', 
                             level[colour == 'g'] - level[colour == 'r']),
         type_gap = replace(rep(NA, n()), type == 'type1', 
                             level[type == 'type1'] - level[type == 'type2']))
# # A tibble: 10 x 6
#    date  colour type  level color_gap type_gap
#    <chr> <chr>  <chr> <dbl>     <dbl>    <dbl>
#  1 2018  none   type1    78        NA       11
#  2 2018  g      none     99        54       NA
#  3 2018  r      none     45        NA       NA
#  4 2019  none   type2    67        NA       NA
#  5 2019  g      none     87         9       NA
#  6 2019  r      none     78        NA       NA
#  7 2020  none   none     89        NA       NA
#  8 2020  none   none     87        NA       NA
#  9 2020  none   none     67        NA       NA
# 10 2020  none   none     76        NA       NA
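
If the missing checks are a concern, one option is to wrap the subtraction in a small helper that at least fails when the rows cannot be paired one-to-one. This is only a sketch: `paired_gap` is a hypothetical name (not a dplyr function), and it still assumes the i-th 'g' row belongs with the i-th 'r' row.

```r
library(dplyr)

d <- tibble(
  date = c("2018", "2018", "2018", "2019", "2019", "2019", "2020", "2020", "2020", "2020"),
  colour = c("none", "g", "r", "none", "g", "r", "none", "none", "none", "none"),
  type = c("type1", "none", "none", "type2", "none", "none", "none", "none", "none", "none"),
  level = c(78, 99, 45, 67, 87, 78, 89, 87, 67, 76))

# Hypothetical helper: `from` and `to` are logical vectors marking the rows
# to subtract from and the rows to subtract; pairing is purely by position.
paired_gap <- function(level, from, to) {
  # Stop if the two sets of rows cannot be matched one-to-one
  stopifnot(sum(from) == sum(to))
  gap <- rep(NA_real_, length(level))
  gap[from] <- level[from] - level[to]
  gap
}

res <- d %>%
  mutate(color_gap = paired_gap(level, colour == "g", colour == "r"),
         type_gap  = paired_gap(level, type == "type1", type == "type2"))
```

The check only catches a count mismatch. It does not verify order or that paired rows share a date.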

You could do this with group_by and mutate.

I assumed that there is only one row per date that satisfies each condition.

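d %>% 
  mutate(color_gap = case_when(colour == "g" ~ level)) %>%
  mutate(type_gap = case_when(type == "type1" ~ level)) %>%
  group_by(date) %>%
  # diff is the 'g' level minus the 'type1' level within each date;
  # max() over an all-NA group returns -Inf with a warning, so dates
  # without both rows do not get a finite diff
  mutate(diff = max(color_gap, na.rm = TRUE) - max(type_gap, na.rm = TRUE))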
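
The grouped idea can also produce the color_gap column from the question directly. The following is a minimal sketch, assuming at most one 'g' row and one 'r' row per date. The result name `res_grouped` is arbitrary. type_gap is left out because in the sample data type1 and type2 sit in different dates, so grouping by date cannot pair them.

```r
library(dplyr)

d <- tibble(
  date = c("2018", "2018", "2018", "2019", "2019", "2019", "2020", "2020", "2020", "2020"),
  colour = c("none", "g", "r", "none", "g", "r", "none", "none", "none", "none"),
  type = c("type1", "none", "none", "type2", "none", "none", "none", "none", "none", "none"),
  level = c(78, 99, 45, 67, 87, 78, 89, 87, 67, 76))

res_grouped <- d %>%
  group_by(date) %>%
  # match() returns NA when the date has no 'r' row, so the lookup
  # yields NA instead of a zero-length vector
  mutate(color_gap = if_else(colour == "g",
                             level - level[match("r", colour)],
                             NA_real_)) %>%
  ungroup()
```

Because the lookup happens inside each date group, a 'g' row is never paired with an 'r' row from another date.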


 