
dplyr: assign value to column based on multiple filtering

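I have a data frame with the columns city, amenity, and date. I want to add a column named last that groups rows by city and amenity and marks the row with the most recent date in each group.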

Input data frame:

| city      | amenity       | date                |
|-----------|---------------|---------------------|
| rome      | stadium       | 2020-02-25 19:10:40 | 
| new york  | concert hall  | 2020-03-09 18:15:29 |
| rome      | stadium       | 2020-02-29 15:07:23 |
| stockholm | swimming pool | 2020-03-02 11:23:54 |
| new york  | skate park    | 2020-03-12 13:41:35 |
| stockholm | swimming pool | 2020-03-13 17:54:23 |
| stockholm | swimming pool | 2020-03-18 19:18:29 |

Desired output:

| city      | amenity       | date                | last |
|-----------|---------------|---------------------|------|
| rome      | stadium       | 2020-02-25 19:10:40 |      |
| new york  | concert hall  | 2020-03-09 18:15:29 | TRUE |
| rome      | stadium       | 2020-02-29 15:07:23 | TRUE |
| stockholm | swimming pool | 2020-03-02 11:23:54 |      |
| new york  | skate park    | 2020-03-12 13:41:35 | TRUE |
| stockholm | swimming pool | 2020-03-13 17:54:23 |      |
| stockholm | swimming pool | 2020-03-18 19:18:29 | TRUE |

Data

df <- structure(list(city = c("rome", "newyork", "rome", "stockholm", 
"newyork", "stockholm", "stockholm"), amenity = c("stadium", 
"concert_hall", "stadium", "swimming_pool", "skate_park", "swimming_pool", 
"swimming_pool"), date = structure(c(1582632640, 1583752529, 
1582963643, 1583123034, 1583995295, 1584096863, 1584533909), class = c("POSIXct", 
"POSIXt"), tzone = "")), row.names = c(NA, -7L), class = "data.frame")

Untested, because the data cannot easily be copied into R, but something like this:

library(dplyr)
# Flag the row(s) holding the latest date within each city/amenity group
df %>%
  group_by(city, amenity) %>%
  mutate(last = (date == max(date)))

Assuming your data is sorted by date, df$last = !duplicated(df[, c("city", "amenity")], fromLast = TRUE) should work. It gives FALSE rather than a missing value for the rows that are not the latest.
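The duplicated() one-liner depends on the rows already being in date order. As a minimal sketch of a base R variant that does not rely on row order, the code below compares each date with its per-group maximum via ave(). The sample data is the question's, rebuilt with an explicit UTC time zone (an assumption made for reproducibility), and the names shuffled, secs, group_max, sorted, and last_dup are illustrative only.

```r
# Sample data from the question; tz = "UTC" is an assumption made here
# so the example prints the same on every machine.
df <- data.frame(
  city = c("rome", "newyork", "rome", "stockholm",
           "newyork", "stockholm", "stockholm"),
  amenity = c("stadium", "concert_hall", "stadium", "swimming_pool",
              "skate_park", "swimming_pool", "swimming_pool"),
  date = as.POSIXct(c(1582632640, 1583752529, 1582963643, 1583123034,
                      1583995295, 1584096863, 1584533909),
                    origin = "1970-01-01", tz = "UTC"),
  stringsAsFactors = FALSE
)

# Shuffle the rows on purpose so the data is no longer sorted by date.
shuffled <- df[c(7, 3, 5, 1, 6, 2, 4), ]

# ave() returns the per-group maximum repeated on every row of its group.
secs <- as.numeric(shuffled$date)
group_max <- ave(secs, shuffled$city, shuffled$amenity, FUN = max)
shuffled$last <- secs == group_max

# Cross-check: sorting by date first makes the duplicated() approach agree.
sorted <- shuffled[order(shuffled$date), ]
sorted$last_dup <- !duplicated(sorted[, c("city", "amenity")], fromLast = TRUE)

print(sorted)
```

Both approaches return FALSE for the non-latest rows. They differ on ties: the comparison with the group maximum flags every row that shares the latest date, while duplicated() flags only the final one.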

Using dplyr:

df %>%
  group_by(city, amenity) %>% 
  mutate(
    last = if_else(date == max(date), TRUE, NA)
  )

I set NA instead of FALSE because in your desired output the rows that are not the latest date have no value.
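The choice between NA and FALSE affects how the flag behaves downstream. The following is a small sketch, assuming dplyr is installed and using the question's sample data with an explicit UTC time zone (an assumption). The names flagged, latest_only, and filled are illustrative only.

```r
library(dplyr)

# Sample data from the question; tz = "UTC" is an assumption for reproducibility.
df <- data.frame(
  city = c("rome", "newyork", "rome", "stockholm",
           "newyork", "stockholm", "stockholm"),
  amenity = c("stadium", "concert_hall", "stadium", "swimming_pool",
              "skate_park", "swimming_pool", "swimming_pool"),
  date = as.POSIXct(c(1582632640, 1583752529, 1582963643, 1583123034,
                      1583995295, 1584096863, 1584533909),
                    origin = "1970-01-01", tz = "UTC"),
  stringsAsFactors = FALSE
)

# TRUE for the latest date per group, NA everywhere else.
flagged <- df %>%
  group_by(city, amenity) %>%
  mutate(last = if_else(date == max(date), TRUE, NA)) %>%
  ungroup()

# Summaries need na.rm = TRUE, otherwise the NA values propagate.
sum(flagged$last)                # NA
sum(flagged$last, na.rm = TRUE)  # counts only the TRUE rows

# filter() keeps only rows where the condition is TRUE, so NA rows are dropped.
latest_only <- flagged %>% filter(last)

# If FALSE is preferred after all, replace the missing values.
filled <- flagged %>% mutate(last = coalesce(last, FALSE))
print(filled)
```

Keeping NA matches the blank cells in the desired output, while FALSE is usually easier to count and combine with other conditions.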


Output

# A tibble: 7 x 4
# Groups:   city, amenity [4]
  city      amenity       date                last 
  <chr>     <chr>         <dttm>              <lgl>
1 rome      stadium       2020-02-25 13:10:40 NA   
2 newyork   concert_hall  2020-03-09 12:15:29 TRUE 
3 rome      stadium       2020-02-29 09:07:23 TRUE 
4 stockholm swimming_pool 2020-03-02 05:23:54 NA   
5 newyork   skate_park    2020-03-12 07:41:35 TRUE 
6 stockholm swimming_pool 2020-03-13 11:54:23 NA   
7 stockholm swimming_pool 2020-03-18 13:18:29 TRUE 
