
Rename a column in a data frame using the variable name in R

I have a number of data frames, each with the same format, like this:

           A           B          C
1  -0.02299388  0.71404158  0.8492423
2  -1.43027866 -1.96420767 -1.2886368
3  -1.01827712 -0.94141194 -2.0234436

I would like to change the name of the third column, C, so that it includes part of the name of the variable associated with the data frame.

For the variable df_elephant, the data frame should look like this:

     A           B          C.elephant
1  -0.02299388  0.71404158  0.8492423
2  -1.43027866 -1.96420767 -1.2886368
3  -1.01827712 -0.94141194 -2.0234436

I have a function which will change the column name:

rename_columns <- function(x) {

  colnames(x)[colnames(x)=='C'] <-
    paste( 'C',
           strsplit (deparse (substitute(x)), '_')[[1]][2], sep='.' ) 
  return(x)
}

This works with my data frames. However, I would like to provide a list of data frames so that I do not have to call the function multiple times by hand. If I use lapply like so:

lapply( list (df_elephant, df_horse), rename_columns )

The function renames the column with NA rather than the relevant portion of the variable name:

[[1]]
         A            B       C.NA
1  -0.02299388  0.71404158  0.8492423
2  -1.43027866 -1.96420767 -1.2886368
3  -1.01827712 -0.94141194 -2.02344361

[[2]]
         A            B       C.NA
1   0.45387054  0.02279488  1.6746280
2  -1.47271378  0.68660595 -0.2505752
3   1.26475917 -1.51739927 -1.3050531

Is there some way that I can provide a list of data frames to my function and produce the desired result?

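Your function builds the suffix from the expression passed in as x rather than from the name of the list element. When it is called through lapply, that expression is no longer df_elephant, so there is nothing after an underscore to extract, and this is why you get NA. Work from the names of a named list instead.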

# Generating random data
n = 3
item1 = data.frame(A = runif(n), B = runif(n), C = runif(n))
item2 = data.frame(A = runif(n), B = runif(n), C = runif(n))
myList = list(df_elephant = item1,  df_horse = item2)


# 1- Why your code doesn't work: ---------------
names(myList) # Returns the names you actually want to use: [1] "df_elephant" "df_horse"
lapply(myList, names) # Returns each data frame's column names (A, B, C), which never contain the animal name


# 2- How to make it work: ---------------
lapply(seq_along(myList), # Iterate over the indices 1, 2, ... of the list

       function(i){
         dfName = names(myList)[i] # Get the list name
         dfName.animal = unlist(strsplit(dfName, "_"))[2] # Split on underscore and take the second element

         df = myList[[i]] # Copy the actual Data frame 
         colnames(df)[colnames(df) == "C"] = paste("C", dfName.animal, sep = ".") # Change column names

         return(df) # Return the new df 
       })


# [[1]]
# A          B C.elephant
# 1 0.8289368 0.06589051  0.2929881
# 2 0.2362753 0.55689663  0.4854670
# 3 0.7264990 0.68069346  0.2940342
# 
# [[2]]
# A         B   C.horse
# 1 0.08032856 0.4137106 0.6378605
# 2 0.35671556 0.8112511 0.4321704
# 3 0.07306260 0.6850093 0.2510791
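One way to see where the NA comes from is to compare what deparse(substitute(x)) yields when a function is called directly and when it is called through lapply. The sketch below uses a hypothetical helper, show_expr, and a small made-up data frame. The exact text that lapply exposes is an implementation detail of R, so the code only relies on it containing no underscore.

```r
# Hypothetical helper: returns the expression supplied as x, as a string
show_expr <- function(x) deparse(substitute(x))

# Made-up data, for illustration only
df_elephant <- data.frame(A = 1:3, B = 4:6, C = 7:9)

# Called directly, the expression is the variable name itself
direct <- show_expr(df_elephant)

# Called through lapply, the expression is lapply's internal indexing
# expression (typically something like X[[i]]), not the variable name
via_lapply <- lapply(list(df_elephant), show_expr)[[1]]

# With no underscore to split on, there is no second piece to take
suffix <- strsplit(via_lapply, "_")[[1]][2]

# paste() turns the missing value into the literal text "NA"
new_name <- paste("C", suffix, sep = ".")
```

Because the variable name is lost once the data frames are placed in an unnamed list, any approach along these lines has to carry the names explicitly, for example as the names of the list.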

We can try with Map. Get the datasets in a list (here we used mget to return the values of the objects named by the strings as a list), then use Map to change the name of the third column by appending the suffix taken from the corresponding element of the vector of names.

 Map(function(x, y) {names(x)[3] <- paste(names(x)[3], sub(".*_", "", y), sep="."); x},  
     mget(c("df_elephant", "df_horse")), c("df_elephant", "df_horse"))
#$df_elephant
#            A          B  C.elephant
#1 -0.02299388  0.7140416   0.8492423
#2 -1.43027866 -1.9642077  -1.2886368
#3 -1.01827712 -0.9414119  -2.0234436

#$df_horse
#           A           B   C.horse
#1  0.4538705  0.02279488  1.6746280
#2 -1.4727138  0.68660595 -0.2505752
#3  1.2647592 -1.51739927 -1.3050531
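The same idea can be wrapped in a small reusable function that matches the column by name instead of by position. This is only a sketch: rename_in_list and its col argument are hypothetical names, the data are made up, and it assumes every element of the list is named in the form prefix_suffix.

```r
# Hypothetical helper: append the part of each list name after the last
# underscore to the column called `col` in the corresponding data frame
rename_in_list <- function(dfs, col = "C") {
  stopifnot(!is.null(names(dfs)))            # a named list is required
  suffixes <- sub(".*_", "", names(dfs))     # "df_elephant" -> "elephant"
  Map(function(df, suffix) {
    names(df)[names(df) == col] <- paste(col, suffix, sep = ".")
    df
  }, dfs, suffixes)
}

# Made-up data, for illustration only
df_elephant <- data.frame(A = 1:3, B = 4:6, C = 7:9)
df_horse    <- data.frame(A = 1:3, B = 4:6, C = 7:9)

renamed <- rename_in_list(list(df_elephant = df_elephant, df_horse = df_horse))
names(renamed$df_horse)
```

Matching on the column name means the rename still works if C is not the third column, and a data frame without a C column is returned unchanged.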

You can also try the following. It is somewhat similar to Akrun's answer and also uses Map in the end:

# Your data
d <- read.table("clipboard")
# create a list with names A and B
d_list <- list(A=d, B=d)

# function
foo <- function(x, y){
  gr <- which(colnames(x) == "C") # get index of colnames C 
  tmp <- colnames(x) #new colnames vector
  tmp[gr] <- paste(tmp[gr], y, sep=".") # replace the old with the new colnames.
  setNames(x, tmp) # set the new names
}
# Result
Map(foo, d_list, names(d_list))
$A
            A          B        C.A
1 -0.02299388  0.7140416  0.8492423
2 -1.43027866 -1.9642077 -1.2886368
3 -1.01827712 -0.9414119 -2.0234436

$B
            A          B        C.B
1 -0.02299388  0.7140416  0.8492423
2 -1.43027866 -1.9642077 -1.2886368
3 -1.01827712 -0.9414119 -2.0234436
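If the renamed data frames should replace the original variables rather than stay in a list, one possible follow-up is list2env, which assigns each element of a named list to a variable of the same name. The sketch below uses made-up data and a hypothetical helper, add_suffix. Note that it overwrites df_elephant and df_horse in the current environment.

```r
# Made-up data, for illustration only
df_elephant <- data.frame(A = 1:3, B = 4:6, C = 7:9)
df_horse    <- data.frame(A = 1:3, B = 4:6, C = 7:9)

# Named list, so that each data frame travels with its variable name
dfs <- list(df_elephant = df_elephant, df_horse = df_horse)

# Hypothetical helper: append the part of nm after the last underscore to C
add_suffix <- function(df, nm) {
  names(df)[names(df) == "C"] <- paste("C", sub(".*_", "", nm), sep = ".")
  df
}

renamed <- Map(add_suffix, dfs, names(dfs))

# Write each element back to the variable of the same name (overwrites them)
invisible(list2env(renamed, envir = environment()))

names(df_elephant)
```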
