Renaming a column in a data frame using the variable name in R

Question

I have a number of data frames, each with the same format, like this:

           A           B          C
1  -0.02299388  0.71404158  0.8492423
2  -1.43027866 -1.96420767 -1.2886368
3  -1.01827712 -0.94141194 -2.0234436

I would like to change the name of the third column, C, so that it includes part of the name of the variable associated with the data frame.

For the variable df_elephant the data frame should look like this:

     A           B          C.elephant
1  -0.02299388  0.71404158  0.8492423
2  -1.43027866 -1.96420767 -1.2886368
3  -1.01827712 -0.94141194 -2.0234436

I have a function which will change the column name:

rename_columns <- function(x) {

  colnames(x)[colnames(x)=='C'] <-
    paste( 'C',
           strsplit (deparse (substitute(x)), '_')[[1]][2], sep='.' ) 
  return(x)
}

This works with my data frames. However, I would like to provide a list of data frames so that I do not have to call the function multiple times by hand. If I use lapply like so:

lapply( list (df_elephant, df_horse), rename_columns )

The function renames the column with NA rather than a portion of the variable name.

[[1]]
         A            B       C.NA
1  -0.02299388  0.71404158  0.8492423
2  -1.43027866 -1.96420767 -1.2886368
3  -1.01827712 -0.94141194 -2.02344361

[[2]]
         A            B       C.NA
1   0.45387054  0.02279488  1.6746280
2  -1.47271378  0.68660595 -0.2505752
3   1.26475917 -1.51739927 -1.3050531

Is there some way that I can provide a list of data frames to my function and produce the desired result?

Answer 1

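Your function tries to recover the variable name from its argument, but inside lapply that name is no longer visible: deparse(substitute(x)) sees only the expression lapply uses internally, which contains no underscore, so the second piece of the split is NA. This is why it's not working. Use the names of the list instead.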

# Generating random data
n = 3
item1 = data.frame(A = runif(n), B = runif(n), C = runif(n))
item2 = data.frame(A = runif(n), B = runif(n), C = runif(n))
myList = list(df_elephant = item1,  df_horse = item2)


# 1- Why your code doesn't work: ---------------
names(myList) # The names you actually want to use: [1] "df_elephant" "df_horse"
lapply(myList, names) # Inside lapply only the column names are reachable, not the variable name, hence the "NA"


# 2- How to make it work: ---------------
lapply(seq_along(myList), # This will return a vector of indices

       function(i){
         dfName = names(myList)[i] # Get the list name
         dfName.animal = unlist(strsplit(dfName, "_"))[2] # Split on underscore and take the second element

         df = myList[[i]] # Copy the actual Data frame 
         colnames(df)[colnames(df) == "C"] = paste("C", dfName.animal, sep = ".") # Change column names

         return(df) # Return the new df 
       })


# [[1]]
# A          B C.elephant
# 1 0.8289368 0.06589051  0.2929881
# 2 0.2362753 0.55689663  0.4854670
# 3 0.7264990 0.68069346  0.2940342
# 
# [[2]]
# A         B   C.horse
# 1 0.08032856 0.4137106 0.6378605
# 2 0.35671556 0.8112511 0.4321704
# 3 0.07306260 0.6850093 0.2510791
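
To see where the NA comes from, it may help to look at what deparse(substitute(x)) returns when the function is called through lapply. The sketch below is only an illustration and not part of the original answer; show_arg and rename_c are hypothetical helper names, and the toy data stands in for the question's data frames. It also shows one possible way to keep the list names on the result, since lapply over seq_along returns an unnamed list.

```r
# Toy data standing in for the question's data frames (values are arbitrary)
df_elephant <- data.frame(A = 1:3, B = 4:6, C = 7:9)
df_horse    <- data.frame(A = 3:1, B = 6:4, C = 9:7)

# Hypothetical helper: report the expression the function was called with
show_arg <- function(x) deparse(substitute(x))

direct  <- show_arg(df_elephant)                        # called by hand
via_lap <- unlist(lapply(list(df_elephant), show_arg))  # called through lapply

# Called directly, the variable name is visible. Through lapply the function
# sees an internal expression (such as X[[i]]) with no underscore, so the
# second piece of the split does not exist and the result is NA.
suffix_direct <- strsplit(direct, "_")[[1]][2]
suffix_lapply <- strsplit(via_lap, "_")[[1]][2]

# Hypothetical helper: take the suffix from an explicitly supplied name
rename_c <- function(df, nm) {
  colnames(df)[colnames(df) == "C"] <- paste("C", sub(".*_", "", nm), sep = ".")
  df
}

myList <- list(df_elephant = df_elephant, df_horse = df_horse)

# Same index-based loop as above, with the list names restored on the result
renamed <- setNames(
  lapply(seq_along(myList), function(i) rename_c(myList[[i]], names(myList)[i])),
  names(myList)
)
```

With the names restored, individual results can be picked out as renamed$df_elephant or renamed$df_horse instead of by position.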

Answer 2

We can try with Map. Get the datasets in a list (here mget returns the values of the objects named by the strings as a list), then use Map to append to the name of the third column the suffix taken from the corresponding element of the vector of names.

 Map(function(x, y) {names(x)[3] <- paste(names(x)[3], sub(".*_", "", y), sep="."); x},  
     mget(c("df_elephant", "df_horse")), c("df_elephant", "df_horse"))
#$df_elephant
#            A          B  C.elephant
#1 -0.02299388  0.7140416   0.8492423
#2 -1.43027866 -1.9642077  -1.2886368
#3 -1.01827712 -0.9414119  -2.0234436

#$df_horse
#           A           B   C.horse
#1  0.4538705  0.02279488  1.6746280
#2 -1.4727138  0.68660595 -0.2505752
#3  1.2647592 -1.51739927 -1.3050531
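
As a possible variation on this approach, the vector of names can be written once and reused, and the renamed data frames can be written back over the original variables. This is a sketch under assumptions rather than part of the answer: the toy data, the variable nms, and the decision to overwrite the originals with list2env are all illustrative choices.

```r
# Toy data; in the question these objects already exist in the workspace
df_elephant <- data.frame(A = 1:3, B = 4:6, C = 7:9)
df_horse    <- data.frame(A = 3:1, B = 6:4, C = 9:7)

# Write the variable names once; mget returns a list named after them
nms <- c("df_elephant", "df_horse")
dfs <- mget(nms, envir = environment())

# Append the part after the underscore to the name of the third column
renamed <- Map(function(df, nm) {
  names(df)[3] <- paste(names(df)[3], sub(".*_", "", nm), sep = ".")
  df
}, dfs, names(dfs))

# Optional: overwrite the original variables with the renamed versions
invisible(list2env(renamed, envir = environment()))
```

Skipping the last line leaves the original variables untouched and keeps the results only in the list renamed.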

Answer 3

You can also try the following, which is somewhat similar to Akrun's answer and also uses Map in the end:

# Your data
d <- read.table("clipboard")
# create a list with names A and B
d_list <- list(A=d, B=d)

# function
foo <- function(x, y){
  gr <- which(colnames(x) == "C") # get index of colnames C 
  tmp <- colnames(x) #new colnames vector
  tmp[gr] <- paste(tmp[gr], y, sep=".") # replace the old with the new colnames.
  setNames(x, tmp) # set the new names
}
# Result
Map(foo, d_list, names(d_list))
$A
            A          B        C.A
1 -0.02299388  0.7140416  0.8492423
2 -1.43027866 -1.9642077 -1.2886368
3 -1.01827712 -0.9414119 -2.0234436

$B
            A          B        C.B
1 -0.02299388  0.7140416  0.8492423
2 -1.43027866 -1.9642077 -1.2886368
3 -1.01827712 -0.9414119 -2.0234436
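
Because read.table("clipboard") depends on whatever was last copied, a reproducible variant may be easier to rerun. The sketch below is an assumption-laden illustration: it supplies the question's table through the text argument and uses the list names elephant and horse (chosen here, not taken from the answer) so that the suffixes match the desired output.

```r
# Stand-in for read.table("clipboard"): the question's table supplied as text.
# The header has one field fewer than the data rows, so the first column
# is used as row names.
d <- read.table(text = "A B C
1 -0.02299388 0.71404158 0.8492423
2 -1.43027866 -1.96420767 -1.2886368
3 -1.01827712 -0.94141194 -2.0234436", header = TRUE)

# List names chosen for illustration; they become the column suffixes
d_list <- list(elephant = d, horse = d)

# Same idea as foo above: append the suffix y to the column named "C"
foo <- function(x, y) {
  tmp <- colnames(x)
  tmp[tmp == "C"] <- paste("C", y, sep = ".")
  setNames(x, tmp)
}

res <- Map(foo, d_list, names(d_list))
```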
