How do I remove numbers from the beginning of column names? (preferably via tidyverse)

Question

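I am working on a task where I need to bind several survey datasets, but unfortunately the survey questions are numbered inconsistently (the wording is consistent). To work around this, I want to remove the question number from the beginning of each question.

At the moment I am doing this manually with rename(), but repeating it for every question in every dataset is very time-consuming. Any tips for doing this in a faster, more efficient way?

Here is an example dataset and my current process: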

library(dplyr)  # provides %>% and rename()

# check.names = FALSE keeps the original names such as `1. First Question`;
# without it, data.frame() rewrites them and the rename() calls cannot find them
df1 <- data.frame(ID = c(1, 2, 3, 4, 5),
                  `1. First Question`  = c('a', 'b', 'c', 'd', 'e'),
                  `2. Second Question` = c(1, 1, 3, 0, 1),
                  `3. Third Question`  = c(1, 2, 0, 2, 1),
                  Year = 2021, check.names = FALSE) %>%
       rename(`First Question`  = `1. First Question`,
              `Second Question` = `2. Second Question`,
              `Third Question`  = `3. Third Question`)

df2 <- data.frame(ID = c(1, 2, 3, 4, 5),
                  `1. First Question`  = c('a', 'b', 'c', 'd', 'e'),
                  `2. Third Question`  = c(2, 1, 3, 1, 2),
                  `3. Second Question` = c(2, 2, 1, 3, 2),
                  Year = 2022, check.names = FALSE) %>%
       rename(`First Question`  = `1. First Question`,
              `Second Question` = `3. Second Question`,
              `Third Question`  = `2. Third Question`)

# rbind() on data frames matches columns by name, so the differing order is fine
end_df <- rbind(df1, df2)

Answer 1

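You can use rename_with, which applies a function (here sub) to change the column names based on a regular expression pattern. The pattern below targets names such as X1..First.Question, which is what data.frame() produces from `1. First Question` when check.names is left at its default of TRUE: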

df1 %>%
    rename_with(~ sub("^X\\d\\.\\.", "", .))
  ID First.Question Second.Question Third.Question Year
1  1              a               1              1 2021
2  2              b               1              2 2021
3  3              c               3              0 2021
4  4              d               0              2 2021
5  5              e               1              1 2021

As @zephryl pointed out, you can do this for a list of all the data frames in one go (map comes from purrr):

list(df1, df2) %>%
  map(rename_with, ~ sub("^X\\d\\.\\.", "", .))

Data:

df1 <- data.frame(ID = c(1, 2, 3, 4, 5),
                  `1. First Question` = c('a', 'b', 'c', 'd', 'e'),
                  `2. Second Question` = c(1, 1, 3, 0, 1),
                  `3. Third Question` = c(1, 2, 0, 2, 1),
                  Year = 2021)

df2 <- data.frame(ID = c(1, 2, 3, 4, 5),
                  `1. First Question` = c('a', 'b', 'c', 'd', 'e'),
                  `2. Third Question` = c(2, 1, 3, 1, 2),
                  `3. Second Question` = c(2, 2, 1, 3, 2),
                  Year = 2022)

Answer 2

Use a regular expression with stringr::str_remove() inside dplyr::rename_with() for each data frame:

library(purrr)
library(dplyr)
library(stringr)

list(df1, df2) %>%
  map(rename_with, ~ str_remove(.x, "^\\d\\.\\s")) %>%
  bind_rows()

# A tibble: 10 × 5
      ID `First Question` `Second Question` `Third Question`  Year
   <dbl> <chr>                        <dbl>            <dbl> <dbl>
 1     1 a                                1                1  2021
 2     2 b                                1                2  2021
 3     3 c                                3                0  2021
 4     4 d                                0                2  2021
 5     5 e                                1                1  2021
 6     1 a                                2                2  2022
 7     2 b                                2                1  2022
 8     3 c                                1                3  2022
 9     4 d                                3                1  2022
10     5 e                                2                2  2022
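If some surveys have ten or more questions, a single \\d would leave part of the number behind. The sketch below is one possible variation, not part of the answer above: it matches one or more digits and uses the .cols argument of rename_with so that only columns starting with a number are touched. The data frame and its `10. Tenth Question` column are made up for illustration, and base sub() stands in for str_remove() to keep the dependencies small.

```r
library(dplyr)

# Hypothetical example data with a two-digit question number
df <- data.frame(ID = 1:3,
                 `1. First Question`  = c("a", "b", "c"),
                 `10. Tenth Question` = c(1, 0, 2),
                 Year = 2021,
                 check.names = FALSE)

# Strip "<digits>." plus any following whitespace,
# but only from columns whose names start with "<digits>."
cleaned <- df %>%
  rename_with(~ sub("^\\d+\\.\\s*", "", .x),
              .cols = matches("^\\d+\\."))

names(cleaned)
```

Columns such as ID and Year are never passed to the renaming function here, which can be a useful safeguard when other column names happen to contain digits.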

Answer 3

Base R alternative

colnames(df1)[2:4] <- sub("^[0-9]\\. ", "", colnames(df1)[2:4])
colnames(df2)[2:4] <- sub("^[0-9]\\. ", "", colnames(df2)[2:4])

rbind(df1, df2)
   ID First Question Second Question Third Question Year
1   1              a               1              1 2021
2   2              b               1              2 2021
3   3              c               3              0 2021
4   4              d               0              2 2021
5   5              e               1              1 2021
6   1              a               2              2 2022
7   2              b               2              1 2022
8   3              c               1              3 2022
9   4              d               3              1 2022
10  5              e               2              2 2022
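The same idea can be wrapped in a small function so that it is not repeated per data frame and does not depend on the questions sitting in columns 2:4. This is only a sketch under assumptions: strip_question_number is a hypothetical helper name, the example data is invented, and the pattern allows multi-digit numbers.

```r
# Hypothetical helper: remove a leading "<digits>. " from every column name
strip_question_number <- function(df) {
  names(df) <- sub("^[0-9]+\\.\\s*", "", names(df))
  df
}

# Made-up data: same questions, different numbering and column order
a <- data.frame(ID = 1:2,
                `1. First Question`   = c("a", "b"),
                `12. Second Question` = c(1, 3),
                check.names = FALSE)
b <- data.frame(ID = 3:4,
                `1. Second Question`  = c(2, 0),
                `2. First Question`   = c("c", "d"),
                check.names = FALSE)

# Clean every data frame, then stack them; rbind() on data frames
# matches columns by name, so the differing column order is handled
combined <- do.call(rbind, lapply(list(a, b), strip_question_number))

names(combined)
```

Names without a leading number, such as ID, do not match the pattern and pass through unchanged.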

An important side note: create the data frames with check.names = F (short for FALSE); otherwise the names are replaced with something like X1..First.Question.

df1 <- data.frame(ID = c(1, 2, 3, 4, 5),
                  `1. First Question` = c('a', 'b', 'c', 'd', 'e'),
                  `2. Second Question` = c(1, 1, 3, 0, 1),
                  `3. Third Question` = c(1, 2, 0, 2, 1),
                  Year = 2021, check.names = F)
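To see the effect of check.names directly, the following minimal sketch builds the same one-column data frame twice and compares the resulting names. The object names mangled and kept are just illustrative.

```r
# Default (check.names = TRUE): names are passed through make.names()
mangled <- data.frame(`1. First Question` = c("a", "b"))
names(mangled)
# "X1..First.Question"

# check.names = FALSE: the name is kept exactly as written
kept <- data.frame(`1. First Question` = c("a", "b"), check.names = FALSE)
names(kept)
# "1. First Question"
```

This is why a pattern beginning with ^X matches in one case, while a pattern beginning with ^[0-9] or ^\\d is needed in the other.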

3 solutions

Solution 1
3 2022-11-23 12:19:51

Solution 2
2 2022-11-23 12:20:56

Solution 3
2 2022-11-23 12:50:10