How do I drop numbers from the start of column names? (preferably through tidyverse)
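I'm working on a task where I need to bind several survey datasets, but unfortunately the survey questions are numbered inconsistently (the wording is consistent). To get around this, I want to remove the question number from the start of each question.
Currently I'm doing this manually with rename(), but repeating it for every question in every dataset is very time-consuming. Any tips for doing this in a faster, more efficient way?
Here is a sample dataset and my current process: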
df1 <- data.frame(ID = c(1, 2, 3, 4, 5),
`1. First Question` = c('a', 'b', 'c', 'd', 'e'),
`2. Second Question` = c(1, 1, 3, 0, 1),
`3. Third Question` = c(1, 2, 0, 2, 1),
Year = 2021) %>%
rename(`First Question` = `1. First Question`,
`Second Question` = `2. Second Question`,
`Third Question` = `3. Third Question`)
df2 <- data.frame(ID = c(1, 2, 3, 4, 5),
`1. First Question` = c('a', 'b', 'c', 'd', 'e'),
`2. Third Question` = c(2, 1, 3, 1, 2),
`3. Second Question` = c(2, 2, 1, 3, 2),
Year = 2022) %>%
rename(`First Question` = `1. First Question`,
`Second Question` = `3. Second Question`,
`Third Question` = `2. Third Question`)
end_df <- rbind(df1, df2)
You can use rename_with, which takes a function (here sub) and changes the column names according to a regular expression pattern:
df1 %>%
rename_with(~ sub("^X\\d\\.\\.", "", .))
ID First.Question Second.Question Third.Question Year
1 1 a 1 1 2021
2 2 b 1 2 2021
3 3 c 3 0 2021
4 4 d 0 2 2021
5 5 e 1 1 2021
As @zephryl points out, you can do this for all data frames in one go by putting them in a list:
list(df1, df2) %>%
map(rename_with, ~ sub("^X\\d\\.\\.", "", .))
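The pattern above matches exactly one digit after the `X` that `data.frame()` prepends. As a minimal sketch, assuming some surveys have ten or more questions, a `+` quantifier lets the same idea cover multi-digit prefixes. The vector `cols` and the name `X12..Twelfth.Question` are made up for illustration:

```r
# Column names as data.frame() produces them when check.names is left at its default.
# "X12..Twelfth.Question" is a hypothetical name added to show the multi-digit case.
cols <- c("ID", "X1..First.Question", "X12..Twelfth.Question", "Year")

# \\d+ matches one or more digits; names without the prefix are left unchanged.
cleaned <- sub("^X\\d+\\.\\.", "", cols)
print(cleaned)
```

The same pattern can be dropped into the `rename_with()` call in place of the single-digit one.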
Data:
df1 <- data.frame(ID = c(1, 2, 3, 4, 5),
`1. First Question` = c('a', 'b', 'c', 'd', 'e'),
`2. Second Question` = c(1, 1, 3, 0, 1),
`3. Third Question` = c(1, 2, 0, 2, 1),
Year = 2021)
df2 <- data.frame(ID = c(1, 2, 3, 4, 5),
`1. First Question` = c('a', 'b', 'c', 'd', 'e'),
`2. Third Question` = c(2, 1, 3, 1, 2),
`3. Second Question` = c(2, 2, 1, 3, 2),
Year = 2022)
Use a regular expression with stringr::str_remove() inside dplyr::rename_with() for each data frame:
library(purrr)
library(dplyr)
library(stringr)
list(df1, df2) %>%
map(rename_with, ~ str_remove(.x, "^\\d\\.\\s")) %>%
bind_rows()
# A tibble: 10 × 5
ID `First Question` `Second Question` `Third Question` Year
<dbl> <chr> <dbl> <dbl> <dbl>
1 1 a 1 1 2021
2 2 b 1 2 2021
3 3 c 3 0 2021
4 4 d 0 2 2021
5 5 e 1 1 2021
6 1 a 2 2 2022
7 2 b 2 1 2022
8 3 c 1 3 2022
9 4 d 3 1 2022
10 5 e 2 2 2022
Base R alternative:
colnames(df1)[2:4] <- sub("^[0-9]\\. ", "", colnames(df1)[2:4])
colnames(df2)[2:4] <- sub("^[0-9]\\. ", "", colnames(df2)[2:4])
rbind(df1, df2)
ID First Question Second Question Third Question Year
1 1 a 1 1 2021
2 2 b 1 2 2021
3 3 c 3 0 2021
4 4 d 0 2 2021
5 5 e 1 1 2021
6 1 a 2 2 2022
7 2 b 2 1 2022
8 3 c 1 3 2022
9 4 d 3 1 2022
10 5 e 2 2 2022
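The two `colnames()` lines above hard-code the column positions 2:4. A possible variation, sketched here under the assumption that every numbered column starts with digits followed by a dot, is to wrap the substitution in a small helper and apply it to a list of data frames. The helper name `strip_question_number` and the shortened sample data are hypothetical:

```r
# Hypothetical helper: strip a leading "<digits>. " prefix from every column name.
strip_question_number <- function(df) {
  names(df) <- sub("^[0-9]+\\.\\s*", "", names(df))
  df
}

# Shortened illustrative data; check.names = FALSE keeps the names as typed.
df1 <- data.frame(ID = 1:2,
                  `1. First Question` = c("a", "b"),
                  `12. Second Question` = c(1, 3),
                  Year = 2021, check.names = FALSE)
df2 <- data.frame(ID = 1:2,
                  `7. Second Question` = c(2, 2),
                  `1. First Question` = c("c", "d"),
                  Year = 2022, check.names = FALSE)

# rbind() on data frames matches columns by name, so differing order is fine.
combined <- do.call(rbind, lapply(list(df1, df2), strip_question_number))
print(names(combined))
```

Because the pattern is anchored at the start of the name, columns such as ID and Year pass through untouched, so no position index is needed.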
An important side note: create the data frames with check.names = F, otherwise the names will be replaced with something like X1..First.Question, and so on.
df1 <- data.frame(ID = c(1, 2, 3, 4, 5),
`1. First Question` = c('a', 'b', 'c', 'd', 'e'),
`2. Second Question` = c(1, 1, 3, 0, 1),
`3. Third Question` = c(1, 2, 0, 2, 1),
Year = 2021, check.names = F)
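To see the effect of that argument in isolation, here is a small sketch that builds the same one-column data frame twice. The object names `mangled` and `kept` are illustrative only:

```r
# Default behaviour: data.frame() makes the names syntactically valid.
mangled <- data.frame(`1. First Question` = "a")

# With check.names = FALSE the name is kept exactly as written.
kept <- data.frame(`1. First Question` = "a", check.names = FALSE)

print(names(mangled))
print(names(kept))
```

This is why the answers that match `^X\\d\\.\\.` and the answers that match `^\\d\\.\\s` are both valid: each targets the names produced by one of these two ways of building the data.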