
Replacing placeholder values in dataframe with values from another column

I have a dataframe that looks like this:

df <-
  structure(
    list(
      Exception1 = c(
        "Comments from {2}: {0}",
        "status updated to {1} by {2}. Description:{0}",
        "status updated to {1} by {2}. Description:{0}",
        "information only.",
        "status updated to {1} by {2}. Description:{0}",
        "status updated to {1} by {2}. Description:{0}"
      ),
      Exception2 = c(
        "Customer {0} said bla",
        "Status updated to {1}",
        "Customer said {2}",
        "User {0} foo",
        "{0} {1}",
        "{1} {2}"
      ),
      ARGUMENT1 = c("OK", " ", " ", "PAY9089723089-98391", " ", " "),
      ARGUMENT2 = c(
        "null",
        "Processing",
        "Reconciled",
        "null",
        "Processing",
        "Reconciled"
      ),
      ARGUMENT3 = c(
        "company name",
        "company name",
        "company name",
        "null",
        "company name",
        "company name"
      )
    ),
    row.names = c(NA, 6L),
    class = "data.frame"
  )

| Exception1                                    | Exception2            | ARGUMENT1           | ARGUMENT2  | ARGUMENT3    |
|-----------------------------------------------|-----------------------|---------------------|------------|--------------|
| Comments from {2}: {0}                        | Customer {0} said bla | OK                  | null       | company name |
| status updated to {1} by {2}. Description:{0} | Status updated to {1} |                     | Processing | company name |
| status updated to {1} by {2}. Description:{0} | Customer said {2}     |                     | Reconciled | company name |
| information only.                             | User {0} foo          | PAY9089723089-98391 | null       | null         |
| status updated to {1} by {2}. Description:{0} | {0} {1}               |                     | Processing | company name |
| status updated to {1} by {2}. Description:{0} | {1} {2}               |                     | Reconciled | company name |

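The Exception1 and Exception2 columns (I removed further Exception columns for readability) contain {} placeholders that should be replaced with the values from the ARGUMENT* columns.

I have been looking for ways to do this and have been relatively successful, but I still lack the experience to do it better.

I wrote a simple function that does the replacement with gsub: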

library(magrittr)  # provides %<>% and %>%

excp_ren2 <- function(x) {
  x %<>%
    gsub("\\{1\\}", x["ARGUMENT2"], .) %>%
    gsub("\\{0\\}", x["ARGUMENT1"], .) %>%
    gsub("\\{2\\}", x["ARGUMENT3"], .)
  x
}

Then I have been using apply and its variants. For example, this gives me an OK result:

new_df <-
  df %>% apply(
    .,
    MARGIN = 1,
    FUN = function(x)
      excp_ren2(x)
  ) %>% as.data.frame()

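The only issue is that this transposes the matrix, which is not really a problem.

I am looking for a better way to do this. I thought I could do it with mutate_*, but I don't think I can access the column names of the row inside the function, or at least I don't know how to do it. Any ideas on a simpler way to achieve this?

Thanks!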
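If the transposed result ever does get in the way, wrapping the apply call in t() should restore the original orientation. Below is a minimal base R sketch on a two-row stand-in for the data. `df_small` and `fill_row` are hypothetical names used only for illustration, and `fixed = TRUE` is used in place of the escaped regex patterns.

```r
# Two-row stand-in for the question's data (hypothetical, for brevity).
df_small <- data.frame(
  Exception1 = c("Comments from {2}: {0}", "status updated to {1} by {2}."),
  ARGUMENT1 = c("OK", " "),
  ARGUMENT2 = c("null", "Processing"),
  ARGUMENT3 = c("company name", "company name"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the same substitutions as excp_ren2, without pipes.
# x is one row of the data frame, passed by apply() as a named character vector.
fill_row <- function(x) {
  x <- gsub("{0}", x[["ARGUMENT1"]], x, fixed = TRUE)
  x <- gsub("{1}", x[["ARGUMENT2"]], x, fixed = TRUE)
  x <- gsub("{2}", x[["ARGUMENT3"]], x, fixed = TRUE)
  x
}

# apply() over rows returns one column per input row,
# so t() flips the result back to one row per input row.
filled <- as.data.frame(t(apply(df_small, 1, fill_row)), stringsAsFactors = FALSE)
filled$Exception1
```

The column names survive the round trip, so `filled` has the same columns as `df_small`.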

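We can use str_replace_all (which is vectorized) in a pipe instead of doing this row by row (and apply the function to each of the columns instead of only 'Exception1').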

library(stringr)
library(dplyr)
df %>%
  transmute(new =  str_replace_all(Exception1, "\\{1\\}", ARGUMENT2) %>% 
                   str_replace_all("\\{0\\}", ARGUMENT1) %>% 
                   str_replace_all("\\{2\\}", ARGUMENT3))
#                                                          new
#1                              Comments from company name: OK
#2 status updated to Processing by company name. Description: 
#3 status updated to Reconciled by company name. Description: 
#4                                           information only.
#5 status updated to Processing by company name. Description: 
#6 status updated to Reconciled by company name. Description: 
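The vectorization matters because base gsub() only uses the first element of its replacement argument (with a warning), so it cannot pair each string with its own value on its own. As a rough base R sketch of the same element-wise idea, mapply() can walk the strings and the replacement values in parallel. `replace_each` and the small input vectors are hypothetical names made up for this illustration.

```r
# Hypothetical inputs: two templates and the argument values for each row.
templates <- c("Comments from {2}: {0}", "status updated to {1} by {2}.")
arg0 <- c("OK", " ")
arg1 <- c("null", "Processing")
arg2 <- c("company name", "company name")

# Hypothetical helper: replace one placeholder element-wise,
# pairing strings[i] with values[i].
replace_each <- function(strings, placeholder, values) {
  mapply(
    function(s, v) gsub(placeholder, v, s, fixed = TRUE),
    strings, values,
    USE.NAMES = FALSE
  )
}

result <- replace_each(templates, "{0}", arg0)
result <- replace_each(result, "{1}", arg1)
result <- replace_each(result, "{2}", arg2)
result
```

For large data the stringr version is likely the better choice, since mapply() calls gsub() once per row.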

If we have multiple columns, we can use mutate_at/transmute_at

df %>%
   transmute_at(vars(starts_with("Exception")), ~ 
           str_replace_all(., "\\{1\\}", ARGUMENT2) %>% 
                   str_replace_all("\\{0\\}", ARGUMENT1) %>% 
                   str_replace_all("\\{2\\}", ARGUMENT3))
#                    Exception1                   Exception2
#1                              Comments from company name: OK         Customer OK said bla
#2 status updated to Processing by company name. Description:  Status updated to Processing
#3 status updated to Reconciled by company name. Description:    Customer said company name
#4                                           information only. User PAY9089723089-98391 foo
#5 status updated to Processing by company name. Description:                    Processing
#6 status updated to Reconciled by company name. Description:       Reconciled company name
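Since the question mentions more Exception columns and the three replacements follow one pattern, the placeholder-to-column mapping could also be kept as data and looped over. This is only a sketch in base R, assuming the `{0}`/`{1}`/`{2}` to ARGUMENT1/2/3 mapping from the question; `df_small`, `placeholder_cols` and `target_cols` are hypothetical names.

```r
# Two-row stand-in for the question's data (hypothetical, for brevity).
df_small <- data.frame(
  Exception1 = c("Comments from {2}: {0}", "information only."),
  Exception2 = c("Customer {0} said bla", "User {0} foo"),
  ARGUMENT1 = c("OK", "PAY9089723089-98391"),
  ARGUMENT2 = c("null", "null"),
  ARGUMENT3 = c("company name", "null"),
  stringsAsFactors = FALSE
)

# Assumed mapping from each placeholder to the column that supplies its value.
placeholder_cols <- c("{0}" = "ARGUMENT1", "{1}" = "ARGUMENT2", "{2}" = "ARGUMENT3")

# Every column whose name starts with "Exception" gets filled in.
target_cols <- grep("^Exception", names(df_small), value = TRUE)

for (col in target_cols) {
  for (ph in names(placeholder_cols)) {
    values <- df_small[[placeholder_cols[[ph]]]]
    # Pair each template with the value from the same row.
    df_small[[col]] <- mapply(
      function(s, v) gsub(ph, v, s, fixed = TRUE),
      df_small[[col]], values,
      USE.NAMES = FALSE
    )
  }
}

df_small[target_cols]
```

Adding a fourth placeholder would then only mean adding one entry to `placeholder_cols`.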

Maybe something like this:

library(dplyr)
library(stringr)

clean_pipe <- . %>% 
  mutate(new_string = Exception1 %>% str_replace_all(pattern = "\\{0\\}",replacement = ARGUMENT1)) %>% 
  mutate(new_string = new_string %>% str_replace_all(pattern = "\\{1\\}",replacement = ARGUMENT2)) %>% 
  mutate(new_string = new_string %>% str_replace_all(pattern = "\\{2\\}",replacement = ARGUMENT3))

df %>% 
  clean_pipe

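The way you're using { } as delimiters made me think of using glue, which operates similarly. To make glue templates that match the names of the columns in your data, first pass a named vector to stringr::str_replace_all to match all patterns with their replacements in one step. Then create glue objects from the "Exception*" columns. Based on this post (R dplyr: rowwise + mutate (+glue) - how to get/refer row content?), you need to use rowwise, because otherwise it tries to use all the values of each argument column for each template. I'd like to put both mutate_at steps into one function, but I had some trouble with scoping, so this is the simplest thing I could get working.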

library(dplyr)
library(tidyr)

replacements <- c("\\{1\\}" = "{ARGUMENT2}",
                  "\\{0\\}" = "{ARGUMENT1}",
                  "\\{2\\}" = "{ARGUMENT3}")

as_tibble(df) %>%
  rowwise() %>%
  mutate_at(vars(starts_with("Exception")), stringr::str_replace_all, replacements) %>%
  mutate_at(vars(starts_with("Exception")), ~as.character(glue::glue(.)))
#> Source: local data frame [6 x 5]
#> Groups: <by row>
#> 
#> # A tibble: 6 x 5
#>   Exception1                  Exception2       ARGUMENT1     ARGUMENT2 ARGUMENT3
#>   <chr>                       <chr>            <chr>         <chr>     <chr>    
#> 1 Comments from company name… Customer OK sai… OK            null      company …
#> 2 "status updated to Process… Status updated … " "           Processi… company …
#> 3 "status updated to Reconci… Customer said c… " "           Reconcil… company …
#> 4 information only.           User PAY9089723… PAY908972308… null      null     
#> 5 "status updated to Process… "  Processing"   " "           Processi… company …
#> 6 "status updated to Reconci… Reconciled comp… " "           Reconcil… company …

Note that because some of the strings are blank, you have extra whitespace in your results, which you can trim with trimws.
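As a small illustration of that trimming step, here is a sketch on two of the strings from the output above. `filled` is a hypothetical vector name, and the second step (collapsing internal runs of whitespace) is an optional extra, not something the answer requires.

```r
# Two results that picked up extra whitespace from the blank " " arguments.
filled <- c(
  "status updated to Processing by company name. Description: ",
  "  Processing"
)

# trimws() strips leading and trailing whitespace from each element.
trimmed <- trimws(filled)
trimmed

# Optionally collapse any internal runs of whitespace to a single space.
collapsed <- gsub("\\s+", " ", trimmed)
collapsed
```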
