
In R, change select column names within select data frames in a list

I have a list containing 21 very large data frames. For 11 of these data frames I'd like to change the names of the last 5 columns.

Here is some example code that shows the same basic structure as my data.

x<-data.frame(matrix(data=rep("2",12),ncol=6))
y<-data.frame(matrix(data=rep("3",12),ncol=6))
z<-data.frame(matrix(data=rep("4",12),ncol=6))
a<-list(x,y,z)

> a
[[1]]
  X1 X2 X3 X4 X5 X6
1  2  2  2  2  2  2
2  2  2  2  2  2  2

[[2]]
  X1 X2 X3 X4 X5 X6
1  3  3  3  3  3  3
2  3  3  3  3  3  3

[[3]]
  X1 X2 X3 X4 X5 X6
1  4  4  4  4  4  4
2  4  4  4  4  4  4

This is the output that I want:

> a
[[1]]
  X1 Column2 Column3 Column4 Column5 Column6
1  2       2       2       2       2       2
2  2       2       2       2       2       2

[[2]]
  X1 Column2 Column3 Column4 Column5 Column6
1  3       3       3       3       3       3
2  3       3       3       3       3       3

[[3]]
  X1 X2 X3 X4 X5 X6
1  4  4  4  4  4  4
2  4  4  4  4  4  4

Currently this is my unsatisfactory method:

x<-data.frame(matrix(data=rep("2",12),ncol=6))
y<-data.frame(matrix(data=rep("3",12),ncol=6))
z<-data.frame(matrix(data=rep("4",12),ncol=6))
a<-list(x,y,z)

data_frames_to_change<-c("x","y")
library("data.table")

for (i in 1:length(data_frames_to_change)){
setnames(eval(as.name(data_frames_to_change[i])),colnames(eval(as.name(data_frames_to_change[i]))[2:6]),c("Column2","Column3","Column4","Column5","Column6"))
}

a<-list(x,y,z)

I know this code is bad, not only because it uses a loop instead of an apply function (I'm still very much an apply novice), but also because it is extremely slow, even on the tiny example data.

I found this while searching: "Apply a function to each data frame". But how does one apply a function to only a subset of the data frames?

I would think that a good answer would find a way of applying a function that changes the last five column names to only a subset of the data frames in the list. That way it wouldn't have to read through the massive list more than once.

One more thing: I don't know the most effective way to convert a character string to a variable name (here, a data frame name) in this context. Should I use something other than eval(as.name())? I'm using R 3.03.

Thanks for your help.

Try this:

a[1:2] <- lapply(a[1:2], function(thisdf) {
    names(thisdf)[(length(thisdf)-4):length(thisdf)] <- paste0('Column',2:6)
    thisdf
})

Basically, use lapply the way you would use a for loop: the function changes the names of the last five columns of each data frame it is given. lapply returns a list of data frames, which you then store back into the same positions of your original list.

The result:

> a
[[1]]
  X1 Column2 Column3 Column4 Column5 Column6
1  2       2       2       2       2       2
2  2       2       2       2       2       2

[[2]]
  X1 Column2 Column3 Column4 Column5 Column6
1  3       3       3       3       3       3
2  3       3       3       3       3       3

[[3]]
  X1 X2 X3 X4 X5 X6
1  4  4  4  4  4  4
2  4  4  4  4  4  4
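
If the data frames to change are identified by name rather than by position, one option is to name the list elements and index the list with a character vector, which avoids eval(as.name()) altogether. The following is only a sketch under that assumption; the helper rename_last and the vector to_change are hypothetical names introduced here for illustration, not part of the original question.

```r
# Rebuild the example data as a *named* list.
x <- data.frame(matrix(data = rep("2", 12), ncol = 6))
y <- data.frame(matrix(data = rep("3", 12), ncol = 6))
z <- data.frame(matrix(data = rep("4", 12), ncol = 6))
a <- list(x = x, y = y, z = z)

# Hypothetical helper: rename the last length(new_names) columns
# of a single data frame and return the modified copy.
rename_last <- function(df, new_names) {
    idx <- tail(seq_along(df), length(new_names))
    names(df)[idx] <- new_names
    df
}

# Names of the list elements to modify (assumed for this example).
to_change <- c("x", "y")
a[to_change] <- lapply(a[to_change], rename_last,
                       new_names = paste0("Column", 2:6))

names(a$x)  # last five names replaced
names(a$z)  # left untouched
```

If the data frames exist only as separate variables, base R's mget(c("x", "y", "z")) should gather them into a named list in one step, so the names never need to be turned back into variables.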

Or just use colnames:

colnames(a[[1]])<- c("X1","col2","col3","col4","col5","col6")
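
The same replacement form of colnames can be extended to several data frames by looping over their positions in the list. A minimal sketch, assuming the first two elements are the ones to change and that every data frame has at least five columns; it uses the Column2 to Column6 names from the question:

```r
# Example list with the same structure as in the question.
a <- list(
    data.frame(matrix(data = rep("2", 12), ncol = 6)),
    data.frame(matrix(data = rep("3", 12), ncol = 6)),
    data.frame(matrix(data = rep("4", 12), ncol = 6))
)

# Positions of the data frames to rename (assumed for this example).
for (i in 1:2) {
    n <- ncol(a[[i]])
    # Replace only the last five column names, in place in the list.
    colnames(a[[i]])[(n - 4):n] <- paste0("Column", 2:6)
}

colnames(a[[2]])  # last five names replaced
colnames(a[[3]])  # left untouched
```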
