
Rename columns in multiple dataframes, R

I am trying to rename the columns of multiple data.frames.

To give an example, let's say I have a list of data.frames dfA, dfB and dfC. I wrote a function ChangeNames to set the names accordingly and then used lapply as follows:

dfs <- list(dfA, dfB, dfC)
ChangeNames <- function(x) {
    names(x) <- c("A", "B", "C" )  
}
lapply(dfs, ChangeNames)

However, this doesn't work as expected. It seems that I am not assigning the new names to the data.frame, but only creating the new names. What am I doing wrong here?

Thank you in advance!

There are two things here:

  • 1) You should return the value you want from your function. Otherwise, the value of the last evaluated expression is returned. In your case, that is the value of the assignment to names(x), i.e. the new names rather than the data.frame. So add a final line, return(x) or simply x. Your function would then look like:

     ChangeNames <- function(x) {
         names(x) <- c("A", "B", "C")
         return(x)
     }
  • 2) lapply does not modify your input objects by reference. It works on a copy, so you'll have to assign the results back. Alternatively, use a for-loop instead of lapply:

     # option 1
     dfs <- lapply(dfs, ChangeNames)

     # option 2
     for (i in seq_along(dfs)) {
         names(dfs[[i]]) <- c("A", "B", "C")
     }
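The two points above can be put together in a small self-contained sketch. The sample data frames and their original column names are made up for illustration; only ChangeNames and the lapply call come from the answer.

```r
# Hypothetical sample data: three data frames with arbitrary column names
dfA <- data.frame(x = 1:2, y = 3:4, z = 5:6)
dfB <- data.frame(p = 1:2, q = 3:4, r = 5:6)
dfC <- data.frame(u = 1:2, v = 3:4, w = 5:6)
dfs <- list(dfA, dfB, dfC)

# Return the modified data frame, not the value of the assignment
ChangeNames <- function(x) {
  names(x) <- c("A", "B", "C")
  x
}

# lapply works on copies, so the result has to be assigned back
dfs <- lapply(dfs, ChangeNames)

# Collect the column names of every list element to inspect the result
renamed <- lapply(dfs, names)
```

Note that only the elements of dfs are renamed here; the separate objects dfA, dfB and dfC keep their old column names.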

Even with the for-loop, you'll still make a copy (because names(.) <- . does). You can verify this with tracemem.

df <- data.frame(x=1:5, y=6:10, z=11:15)
tracemem(df)
# [1] "<0x7f98ec24a480>"
names(df) <- c("A", "B", "C")
tracemem(df)
# [1] "<0x7f98e7f9e318>"

If you want to modify by reference, you can use the data.table package's setnames function:

df <- data.frame(x=1:5, y=6:10, z=11:15)
require(data.table)
tracemem(df)
# [1] "<0x7f98ec76d7b0>"
setnames(df, c("A", "B", "C"))
tracemem(df)
# [1] "<0x7f98ec76d7b0>"

You can see that the memory location df is mapped to hasn't changed. The names have been modified by reference.

If the dataframes were not in a list but just in the global environment, you could refer to them using a vector of their names as strings.

dfs <- c("dfA", "dfB", "dfC")

for(df in dfs) {
  df.tmp <- get(df)
  names(df.tmp) <- c("A", "B", "C" ) 
  assign(df, df.tmp)
}

EDIT

To simplify the above code you could use

for(df in dfs)
  assign(df, setNames(get(df),  c("A", "B", "C")))

or use data.table, which doesn't require reassigning:

# assumes dfA and dfB each have exactly two columns
for(df in c("dfA", "dfB"))
  data.table::setnames(get(df),  c("G", "H"))
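As an alternative to the explicit loop with get and assign, the same idea can be sketched in base R with mget and list2env. This is not part of the answer above, just one possible variant; the sample data frames are hypothetical, and env is an illustrative name for the environment that holds them.

```r
# Hypothetical sample data standing in for dfA, dfB and dfC
dfA <- data.frame(x = 1:2, y = 3:4, z = 5:6)
dfB <- data.frame(p = 1:2, q = 3:4, r = 5:6)
dfC <- data.frame(u = 1:2, v = 3:4, w = 5:6)

dfs <- c("dfA", "dfB", "dfC")

# The environment holding the data frames (the global one at top level)
env <- environment()

# mget() returns a named list; lapply() keeps those names
renamed <- lapply(mget(dfs, envir = env), setNames, c("A", "B", "C"))

# list2env() writes each list element back under its own name
invisible(list2env(renamed, envir = env))
```

After this runs, dfA, dfB and dfC themselves carry the new column names, just as with the for-loop versions.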

I had the problem of importing a public data set and having to rename every column in each dataframe: trim surrounding whitespace, convert to lowercase, and replace internal spaces with periods.

Combining the above methods got me:

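library(stringr)

# dfs is a character vector holding the names of the dataframes
for (eachdf in dfs) {
  df.tmp <- get(eachdf)
  for (eachcol in seq_along(df.tmp)) {
    # trim first, so that only internal spaces become periods
    colnames(df.tmp)[eachcol] <-
      str_replace_all(str_to_lower(str_trim(colnames(df.tmp)[eachcol])), " ", ".")
  }
  assign(eachdf, df.tmp)
}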
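Since the string functions involved are vectorised, the inner loop over columns is not strictly needed. Here is a minimal base-R sketch of the same cleaning step, assuming the goal is exactly trim, lowercase, then replace spaces with periods; clean_names and the sample data are hypothetical names introduced for illustration.

```r
# Hypothetical helper: trim, lowercase, then replace remaining spaces with periods
clean_names <- function(df) {
  names(df) <- gsub(" ", ".", tolower(trimws(names(df))), fixed = TRUE)
  df
}

# Hypothetical sample data with messy column names
raw <- data.frame(1:2, 3:4)
names(raw) <- c(" First Name ", "Total Sales")

cleaned <- clean_names(raw)
names(cleaned)
```

A helper like this can be passed to lapply for dataframes held in a list, or used with get and assign as in the loop above.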
