R: rename multiple columns with a wildcard using rename_with()

library(tidyverse)

I am looking to rename a bunch of columns. I tried rename_at() and rename_with() in R, but without much success. Can someone help? Thank you very much for your help.

Original data frame column names:

tibble(
AAA_BBB1_P1_Elev = as.double(),
AAA_BBB2_P2_Elev = as.double(),
AAA_BBB2_P3_Elev = as.double()
)

I want to change the column names to:

tibble(
`BBB1-P1E` = as.double(),
`BBB1-P2E` = as.double(),
`BBB1-P3E` = as.double()
)

We can use rename_all() with str_replace():

library(dplyr)
library(stringr)
tbl2 <- tbl1 %>%
     rename_all(~ str_replace_all(str_replace(., '^[^_]+_(.*)_(.)[^.]+$', "\\1\\2"), '_', "-"))

Output:

tbl2
# A tibble: 0 x 3
# … with 3 variables: `BBB1-P1E` <dbl>, `BBB2-P2E` <dbl>, `BBB2-P3E` <dbl>

Data:

tbl1 <- structure(list(AAA_BBB1_P1_Elev = numeric(0), AAA_BBB2_P2_Elev = numeric(0), 
    AAA_BBB2_P3_Elev = numeric(0)), row.names = integer(0), class = c("tbl_df", 
"tbl", "data.frame"))
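
Since the question asks about rename_with(), the same renaming can also be sketched with that verb and a single regular expression. This is a minimal sketch rather than the answer's own code: the helper name shorten_name is hypothetical, and it assumes every column to rename has exactly four underscore-separated fields and starts with "AAA_".

```r
library(dplyr)
library(stringr)

# Empty tibble with the column names from the question
tbl1 <- tibble(
  AAA_BBB1_P1_Elev = double(),
  AAA_BBB2_P2_Elev = double(),
  AAA_BBB2_P3_Elev = double()
)

# Hypothetical helper: drop the first field, keep the two middle fields,
# and keep only the first letter of the last field
shorten_name <- function(x) {
  str_replace(x, "^[^_]+_([^_]+)_([^_]+)_(.).*$", "\\1-\\2\\3")
}

# .cols restricts the renaming to columns that start with "AAA_"
tbl2 <- tbl1 %>%
  rename_with(shorten_name, .cols = starts_with("AAA_"))

names(tbl2)
# "BBB1-P1E" "BBB2-P2E" "BBB2-P3E"
```

Using [^_]+ for each field keeps the capture groups from spanning an underscore, so the hyphen can be written directly in the replacement instead of in a second pass.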

Inspired by akrun's answer, I came up with a work-around:

remove <- c("AAA_BBB[0-9]_", "lev", "_") 

tibble(
AAA_BBB1_P1_Elev = as.double(), 
AAA_BBB2_P2_Elev = as.double(), 
AAA_BBB2_P3_Elev = as.double()
) %>% 
rename_all(~ str_remove_all(., paste(remove, collapse = "|"))) %>% 
rename_at(vars(ends_with("E")), ~ paste0("BBB1-", .x)) 

# A tibble: 0 x 3 
# ... with 3 variables: BBB1-P1E <dbl>, BBB1-P2E <dbl>, BBB1-P3E <dbl> 
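
To see what the work-around does at each step, it can help to run the same operations on a plain character vector. The sketch below is illustrative only; the variable names drop_patterns, stripped and new_names are not from the original post.

```r
library(stringr)

old_names <- c("AAA_BBB1_P1_Elev", "AAA_BBB2_P2_Elev", "AAA_BBB2_P3_Elev")

# Fragments to delete, combined into one alternation pattern
drop_patterns <- c("AAA_BBB[0-9]_", "lev", "_")
pattern <- paste(drop_patterns, collapse = "|")
pattern
# "AAA_BBB[0-9]_|lev|_"

# Every match of any alternative is removed from each name
stripped <- str_remove_all(old_names, pattern)
stripped
# "P1E" "P2E" "P3E"

# The prefix is then added back as a fixed string
new_names <- paste0("BBB1-", stripped)
new_names
# "BBB1-P1E" "BBB1-P2E" "BBB1-P3E"
```

Because the prefix is hard-coded, every column ends up with "BBB1-", which matches the target names in the question but discards the original BBB number.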

Base R option:

names(df) <- sub('\\w+_(\\w+)_(\\w+)_.*', '\\1-\\2E', names(df))
names(df)
#"BBB1-P1E" "BBB2-P2E" "BBB2-P3E"
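
Note that \w also matches the underscore, so the pattern above relies on regex backtracking to find the field boundaries. A variant that states the boundaries explicitly with [^_] is sketched below. It is self-contained; the data frame df is rebuilt here from the question's column names as an assumption, since the answer does not define it.

```r
# Empty data frame with the column names from the question
df <- data.frame(
  AAA_BBB1_P1_Elev = numeric(0),
  AAA_BBB2_P2_Elev = numeric(0),
  AAA_BBB2_P3_Elev = numeric(0)
)

# Each [^_]+ matches exactly one underscore-delimited field
new_names <- sub("^[^_]+_([^_]+)_([^_]+)_.*$", "\\1-\\2E", names(df))

names(df) <- new_names
names(df)
# "BBB1-P1E" "BBB2-P2E" "BBB2-P3E"
```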

I had a similar problem. I had data on people tested on two occasions, with the columns selected as follows:

Data %>%
  select(Identifier, contains('eq5d'),-EQ5D3L_Combined) %>%
  names()

which gave this list of names:

"Identifier" "EQ5D3L_Item_1" "EQ5D3L_Item_2"
"EQ5D3L_Item_3" "EQ5D3L_Item_4" "EQ5D3L_Item_5"
"EQ5D3L_VAS" "EQ5D3L_Item_1_2" "EQ5D3L_Item_2_2"
"EQ5D3L_Item_3_2" "EQ5D3L_Item_4_2" "EQ5D3L_Item_5_2"
"EQ5D3L_VAS_2"

The '_2' suffix marks the second measurement occasion, and I needed to put '_1' at the end of the variable names from the first occasion.

To fix this, I used rename_with() as follows:

Data %>%
  select(Identifier, contains('eq5d'), -EQ5D3L_Combined) %>%
  rename_with(~ ifelse(!str_ends(.x, '[0-9]'),
                       str_c(.x, '_1'),
                       .x)) %>%
  names()

The !str_ends() call returns a logical vector that is TRUE for variable names that do not end in a digit. The ifelse() applies the change only to those names, and str_c() appends '_1' to them, so I now have my variables named correctly for an easy pivot_longer().
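
With the names listed above, the ifelse() version appends '_1' only to names without a trailing digit, so Identifier and EQ5D3L_VAS change, while EQ5D3L_Item_1 to EQ5D3L_Item_5 (which already end in a digit) do not. If the item columns should also receive the suffix and Identifier should stay as it is, one possible variant selects the first-occasion columns through the .cols argument of rename_with(). This is a sketch under assumptions: the toy tibble Data below holds only a subset of the column names, and the pattern assumes first-occasion names end in "Item_<number>" or "VAS".

```r
library(dplyr)
library(stringr)

# Toy stand-in for the data described above (column names only)
Data <- tibble(
  Identifier      = character(),
  EQ5D3L_Item_1   = double(),
  EQ5D3L_Item_2   = double(),
  EQ5D3L_VAS      = double(),
  EQ5D3L_Item_1_2 = double(),
  EQ5D3L_Item_2_2 = double(),
  EQ5D3L_VAS_2    = double()
)

# Append "_1" only to columns whose name ends in "Item_<number>" or "VAS",
# i.e. those that do not yet carry an occasion suffix
renamed <- Data %>%
  rename_with(~ str_c(.x, "_1"), .cols = matches("(Item_[0-9]+|VAS)$"))

names(renamed)
# "Identifier" "EQ5D3L_Item_1_1" "EQ5D3L_Item_2_1" "EQ5D3L_VAS_1"
# "EQ5D3L_Item_1_2" "EQ5D3L_Item_2_2" "EQ5D3L_VAS_2"
```

Selecting with .cols keeps the renaming function itself trivial and leaves every unselected column, including Identifier, untouched.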
