How to keep columns after `summarise` operation in `dplyr`

I have data of this type:

df <- data.frame(name = c("Acer laurinum", "Acer laurinum Hassk.", "Acmella paniculata", 
                          "Adinandra cf. integerrima", "Adinandra cf. integerrima T.Anderson"),
                 value1 = c(1,2,3,4,5),
                 value2 = c(2,3,4,5,6))

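I want to summarise `value1` and `value2` by the matching part of the `name` column while keeping the unique value of a new column, `author`. The code below does the summarising, but `author` is dropped from the result: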

library(dplyr)
library(stringr)

df %>%
  mutate(author = str_extract(name, "(?<=\\s)(?=.*\\.)[.\\w]+$"),
         name1 = trimws(str_remove(name, "(?<=\\s)(?=.*\\.)[.\\w]+$"))) %>%
  group_by(name1) %>%
  summarise(across(c(value1, value2), sum))

# A tibble: 3 x 3
  name1                     value1 value2
* <chr>                      <dbl>  <dbl>
1 Acer laurinum                  3      5
2 Acmella paniculata             3      4
3 Adinandra cf. integerrima      9     11

Expected output:

# A tibble: 3 x 4
  name1                     value1 value2      author
* <chr>                      <dbl>  <dbl>       <chr>
1 Acer laurinum                  3      5       Hassk.
2 Acmella paniculata             3      4        <NA>
3 Adinandra cf. integerrima      9     11  T.Anderson

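You can use `na.omit(author)[1]` to get the first non-NA value of `author` in each group. `summarise` only keeps the columns you compute explicitly (plus the grouping columns), so `author` has to be summarised as well.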

library(dplyr)
library(stringr)

df %>%
  mutate(author = str_extract(name, "(?<=\\s)(?=.*\\.)[.\\w]+$"),
         name1 = trimws(str_remove(name, "(?<=\\s)(?=.*\\.)[.\\w]+$"))) %>%
  group_by(name1) %>%
  summarise(across(c(value1, value2), sum), 
            author = na.omit(author)[1])

#  name1                     value1 value2 author    
#  <chr>                      <dbl>  <dbl> <chr>     
#1 Acer laurinum                  3      5 Hassk.    
#2 Acmella paniculata             3      4 NA        
#3 Adinandra cf. integerrima      9     11 T.Anderson
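
To see why the `Acmella paniculata` row ends up with `NA` rather than an error, it may help to look at `na.omit(x)[1]` on its own. The following is a minimal base R sketch; the helper name `first_non_na` is hypothetical and is introduced here only for illustration.

```r
# Hypothetical helper (not part of dplyr or the original answer):
# drop the NAs, then take the first remaining element.
first_non_na <- function(x) na.omit(x)[1]

# A group with one author and one missing value returns the author.
first_non_na(c(NA, "Hassk."))

# A group with only missing values: na.omit() leaves a zero-length
# vector, and indexing it with [1] yields NA, so the result still has
# length 1, which is what summarise() needs for one row per group.
first_non_na(c(NA_character_, NA_character_))
```

Note that if a group contained several different non-NA authors, this expression would silently keep only the first one.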

