
How to rename multiple columns with different column names and different order in several dataframes based on a dictionary in R

I am merging multiple datasets from different sources. The columns in each dataset (stored as data frames) have different names and appear in different orders. I have created a dictionary that maps every original name to the common name I want to use instead. How do I rename the original columns using this dictionary in R? I specifically want to use a dictionary because I may add more datasets (with different column names) in the future, and a dictionary would be easy to extend.

I know I can rename every column manually, but there are many of them (around 30), and they may change as new datasets are added.

df1 <- data.frame(site = c(1:6), code = c(rep("A",3), rep("B", 3)), result = c(20:25))
df2 <- data.frame(site_no = c(10:19), day = c(1:10), test = c(rep("A", 5), rep("B", 5)), value = c(1:10))
dict <- data.frame(oldName = c("site", "code", "result", "site_no", "day", "test", "value"),  newName = c("site_number", "parameter", "result", "site_number", "day", "parameter", "result"))

I would like to rename the columns in df1 and df2 based on the dict data frame, which contains the old names (all the column names from df1 and df2) and the new names (the common names to use).

The result would be:

colnames(df1)
"site_number" "parameter" "result"

colnames(df2)
"site_number" "day" "parameter" "result"

We can match the names of each data frame against oldName, then extract newName at the matched indices:

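# match() returns, for each current column name, its row position in dict$oldName;
# indexing newName with those positions yields the replacement names in column order
names(df1) <- with(dict, newName[match(names(df1), oldName)])
names(df2) <- with(dict, newName[match(names(df2), oldName)])
print(df1)
print(df2)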
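One caveat with the match() approach: a column whose name is missing from the dictionary gets NA as its new name. A minimal base R sketch of a wrapper that leaves such columns untouched is shown below. The function name rename_from_dict and the example data frame df3 (with its notes column) are hypothetical and not part of the original question.

```r
# Dictionary from the question, rebuilt here so the block runs on its own
dict <- data.frame(
  oldName = c("site", "code", "result", "site_no", "day", "test", "value"),
  newName = c("site_number", "parameter", "result", "site_number",
              "day", "parameter", "result"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rename columns found in the dictionary, keep the rest
rename_from_dict <- function(df, dict) {
  old <- as.character(dict$oldName)
  new <- as.character(dict$newName)
  idx <- match(names(df), old)      # NA where a column is not in the dictionary
  found <- !is.na(idx)
  names(df)[found] <- new[idx[found]]
  # Two old names mapping to one new name would collide within a data frame
  if (anyDuplicated(names(df)) > 0) {
    warning("renaming produced duplicate column names")
  }
  df
}

# Hypothetical data frame with one column ("notes") absent from the dictionary
df3 <- data.frame(site = 1:2, notes = c("x", "y"))
renamed <- rename_from_dict(df3, dict)
names(renamed)
# [1] "site_number" "notes"
```

Whether unknown columns should be kept silently, as here, or reported as an error depends on how strictly the dictionary is meant to be enforced.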

We can use rename_all after placing the datasets in a list. It is better to keep those datasets in a list than to have them as separate objects in the global environment.

library(dplyr)
library(purrr)
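# Collect every object named df<number> from the current environment into a named list
out <- mget(ls(pattern = "^df\\d+$")) %>%
       # For each data frame, look up every column name in the dictionary
       map(~ .x %>% 
         rename_all(~  as.character(dict$newName)[match(., dict$oldName)]))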
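In dplyr 1.0.0 and later, rename_all is superseded by rename_with. The following is a sketch of the same idea using rename_with, assuming a dplyr version that provides it. It builds the list explicitly instead of collecting objects by name pattern, which is a design choice made here rather than something the answer above requires.

```r
library(dplyr)

# Data from the question, rebuilt here so the block runs on its own
df1 <- data.frame(site = 1:6, code = rep(c("A", "B"), each = 3), result = 20:25)
df2 <- data.frame(site_no = 10:19, day = 1:10,
                  test = rep(c("A", "B"), each = 5), value = 1:10)
dict <- data.frame(
  oldName = c("site", "code", "result", "site_no", "day", "test", "value"),
  newName = c("site_number", "parameter", "result", "site_number",
              "day", "parameter", "result"),
  stringsAsFactors = FALSE
)

# Name the list elements so the results can be matched back to their sources
dfs <- list(df1 = df1, df2 = df2)

# rename_with() applies the function to the vector of column names
out <- lapply(dfs, function(d) {
  rename_with(d, ~ dict$newName[match(.x, dict$oldName)])
})

names(out$df1)
# [1] "site_number" "parameter"   "result"
names(out$df2)
# [1] "site_number" "day"         "parameter"   "result"
```

As with rename_all, this version assumes every column name appears in dict$oldName; a name that is missing from the dictionary would be mapped to NA.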

If needed, we can change the column names of the original objects with list2env:

list2env(out, .GlobalEnv)
names(df1)
#[1] "site_number" "parameter"   "result"     

names(df2)
#[1] "site_number" "day"         "parameter"   "result"     


 