Perform a series of mutations to columns in dataframe

I am trying to replace some text in my dataframe (a few rows are given below):

> dput(Henry.longer[1:4,])
structure(list(N_l = c(4, 4, 4, 4), UG = c("100", "100", "100", 
"100"), S = c(12, 12, 12, 12), Sample = c(NA, NA, NA, NA), EQ = c("Henry", 
"Henry", "Henry", "Henry"), DF = c(0.798545454545455, 0.798545454545455, 
0.798545454545455, 0.798545454545455), meow = c("Henry.Exterior.single", 
"Multi", "Henry.Exterior.multi", "Henry.Interior.single"), Girder =     c("Henry.Exterior.single", 
"Henry.Interior.multi", "Henry.Exterior.multi", "Interior")), row.names = c(NA, 
-4L), groups = structure(list(UG = "100", S = 12, .rows = list(
1:4)), row.names = c(NA, -1L), class = c("tbl_df", "tbl", 
"data.frame"), .drop = FALSE), class = c("grouped_df", "tbl_df", 
"tbl", "data.frame"))

I tried to mutate the dataframe like this:

Henry.longer <- Henry.longer %>% 
  mutate(Loading = str_replace(meow, "Henry.Exterior.single", "Single")) %>%
  mutate(Loading = str_replace(meow, "Henry.Exterior.multi", "Multi")) %>%
  mutate(Loading = str_replace(meow, "Henry.Interior.single", "Single")) %>%
  mutate(Loading = str_replace(meow, "Henry.Interior.multi", "Multi")) %>%
  mutate(Girder = str_replace(meow, "Henry.Exterior.multi", "Exterior")) %>%
  mutate(Girder = str_replace(meow, "Henry.Exterior.single", "Exterior")) %>%
  mutate(Girder = str_replace(meow, "Henry.Interior.multi", "Interior")) %>%
  mutate(Girder = str_replace(meow, "Henry.Interior.single", "Interior")) %>%
  select(-meow)

But for some reason the replacements are not applied to all the rows, and I only get:

      N_l UG        S Sample EQ       DF Loading               Girder               
1     4 100      12 NA     Henry 0.799 Henry.Exterior.single Henry.Exterior.single
2     4 100      12 NA     Henry 0.799 Multi                 Henry.Interior.multi 
3     4 100      12 NA     Henry 0.799 Henry.Exterior.multi  Henry.Exterior.multi 
4     4 100      12 NA     Henry 0.799 Henry.Interior.single Interior
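
One thing visible in the chain above is that every `mutate(Loading = ...)` step reads from `meow` again rather than from the previous `Loading`, so each step overwrites the one before it. A minimal sketch on a plain character vector (the two sample values are taken from the question; `Loading2` is a hypothetical name used only for illustration):

```r
library(stringr)

meow <- c("Henry.Exterior.single", "Henry.Interior.multi")

# Each step starts from `meow` again, so the second assignment
# discards the result of the first one.
Loading <- str_replace(meow, "Henry.Exterior.single", "Single")
Loading <- str_replace(meow, "Henry.Interior.multi", "Multi")
print(Loading)   # only the last replacement is visible

# Feeding the previous result forward keeps both replacements.
Loading2 <- str_replace(meow, "Henry.Exterior.single", "Single")
Loading2 <- str_replace(Loading2, "Henry.Interior.multi", "Multi")
print(Loading2)
```

The same applies to the `Girder` steps: only the last `str_replace()` in each group is reflected in the result.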

I think we can use lookup vectors for this, if static string lookups are easier or safer for you:

library(dplyr)

# Lookup vectors: names are the old labels, values are the replacements.
tr_vec <- c(Henry.Exterior.single = "Single", Henry.Exterior.multi = "Multi", Henry.Interior.single = "Single", Henry.Interior.multi = "Multi")
tr_vec2 <- c(Henry.Exterior.multi = "Exterior", Henry.Exterior.single = "Exterior", Henry.Interior.multi = "Interior", Henry.Interior.single = "Interior")
# Labels missing from a lookup index to NA, so coalesce() falls back to the original value.
Henry.longer %>%
  mutate(
    Loading = coalesce(tr_vec[Loading], Loading),
    Girder = coalesce(tr_vec2[Girder], Girder)
  )
# # A tibble: 4 x 8
# # Groups:   UG, S [1]
#     N_l UG        S Sample EQ       DF Loading Girder  
#   <dbl> <chr> <dbl> <lgl>  <chr> <dbl> <chr>   <chr>   
# 1     4 100      12 NA     Henry 0.799 Single  Exterior
# 2     4 100      12 NA     Henry 0.799 Multi   Interior
# 3     4 100      12 NA     Henry 0.799 Multi   Exterior
# 4     4 100      12 NA     Henry 0.799 Single  Interior
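
To see why the `coalesce()` fallback matters, here is a small sketch of the lookup on a plain vector. The shortened `tr_vec` and the sample vector `x` are assumptions made for illustration, not the full data:

```r
library(dplyr)

# Named vector: names are the old labels, values are the new ones.
tr_vec <- c(Henry.Exterior.single = "Single", Henry.Interior.multi = "Multi")

x <- c("Henry.Exterior.single", "Multi", "Henry.Interior.multi")

# Indexing by name gives NA for labels that are not in the lookup.
looked_up <- unname(tr_vec[x])
print(looked_up)

# coalesce() takes the first non-missing value, so entries without a
# lookup match keep their original text.
result <- coalesce(looked_up, x)
print(result)
```

An entry such as "Multi", which is already in its final form, passes through untouched instead of becoming NA.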

The advantage of RonakShah's regex solution is that it can very easily handle many of the types of substrings you appear to need. Regexes do carry a little risk, though, in that they may mismatch (unlikely in that answer).

Instead of using str_replace, I guess it would be easier to extract what you want using regex.

library(dplyr)

Henry.longer %>%
  mutate(Loading = sub('.*\\.', '', meow), 
         Girder = sub('.*\\.(\\w+)\\..*', '\\1', meow))

where

Loading - removes everything up to and including the last dot

Girder - extracts the word between two dots.
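
As a quick check of the two patterns, this sketch applies them to a few sample strings in base R. The third value, "Multi", has no dots and is included as an assumption to show what happens when nothing matches:

```r
meow <- c("Henry.Exterior.single", "Henry.Interior.multi", "Multi")

# Greedy ".*\\." consumes everything through the last dot.
loading <- sub(".*\\.", "", meow)
print(loading)

# The capture group keeps the word sitting between two dots.
girder <- sub(".*\\.(\\w+)\\..*", "\\1", meow)
print(girder)
```

Note that `sub()` returns the input unchanged when the pattern does not match, so a value without dots passes through silently in both columns.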

Oh boy, looks like you've got some answers here already, but here's a super-simple one that uses stringr::str_extract:

library(dplyr)
library(stringr)

Henry.longer <- Henry.longer %>%
  mutate(Loading = str_extract(meow, "single|multi")) %>%
  mutate(Girder = str_extract(meow, "Interior|Exterior"))

It's worth noting that the demo data has a weird entry for meow in one row, so it didn't run perfectly on my machine:

[Output image not preserved]
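
One plausible reading of that remark, sketched under the assumption that the odd entry is the bare "Multi" value in the demo data: `str_extract()` returns NA when the pattern is not found, and matching is case-sensitive by default. The `loading_ci` variant is a suggestion, not part of the original answer:

```r
library(stringr)

meow <- c("Henry.Exterior.single", "Multi")

# No match gives NA: "Multi" has a capital M and no Interior/Exterior part.
loading <- str_extract(meow, "single|multi")
girder <- str_extract(meow, "Interior|Exterior")
print(loading)
print(girder)

# A case-insensitive pattern picks up "Multi" as well.
loading_ci <- str_extract(meow, regex("single|multi", ignore_case = TRUE))
print(loading_ci)
```

Unlike `sub()`, which leaves unmatched input as it is, `str_extract()` makes unmatched rows visible as NA, which can be easier to spot.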
