
Rename multiple columns in R

I have 1000 files with similar column names. For example:

df1

DATE PRICE CLOSE           

df2

DATE PRICE CLOSE 

and so on...

If I merge them by date, the merge works, but the columns keep their old names, and I want to rename them in a loop.

So the merged data set looks like this:

Date Price Close PRICE CLOSE

I want something like:

DATE PRICE1 CLOSE1 PRICE2 CLOSE2.

Is there an easy way to do this? I have tried a couple of things, none of which gives the correct output.

This one uses the plyr package:

mod_join = function(mypath){
  filenames=list.files(path=mypath, full.names=TRUE)
  datalist = lapply(filenames, function(x){read.csv(file=x,header=T)[,c('Date','High','Low')]})
  join_all(datalist,by = "Date")
} 

This one uses the merge command on all the data frames:

merge2 = function(mypath){
  filenames=list.files(path=mypath, full.names=TRUE)
  datalist = lapply(filenames, function(x){read.csv(file=x,header=T)[,c('Date','High','Low')]})
  Reduce(function(x,y) {merge(x,y,by.x= "Date",by.y = "Date",all=T)}, datalist)
}

I also tried a for loop that subsets each data frame so the subsets can be merged afterwards, but somehow it is not subsetting the data frames:

for (i in 1:1000){
  data_subset <- sprintf('data_%d',i)
  mydata_subset <- data.frame(,data_subset["Date"],data_subset["High"],data_subset["DayLow"])
  obj_name <- paste('subset_Pricedata',i,sep ="_")
  assign(obj_name,value = mydata_subset)
} 

Any help would be great. Thanks.

Hopefully, this will do the job:

library(plyr)
df1 = rename(df1,c("PRICE"="PRICE1","CLOSE"="CLOSE1"))
df2 = rename(df2,c("PRICE"="PRICE2","CLOSE"="CLOSE2"))
new = merge(df1,df2,all=TRUE)

Please comment if you run into any difficulties.
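To apply the same idea to many data frames without writing one rename call per file, something along these lines might work. It is only a sketch in base R: the example data, the helper name add_index_suffix, and the assumption that every data frame shares a DATE key column are illustrative and not taken from the question's files.

```r
# Hypothetical example data standing in for the files read from disk.
df_list <- list(
  data.frame(DATE = c("2020-01-01", "2020-01-02"), PRICE = c(10, 11), CLOSE = c(10.5, 11.5)),
  data.frame(DATE = c("2020-01-01", "2020-01-02"), PRICE = c(20, 21), CLOSE = c(20.5, 21.5)),
  data.frame(DATE = c("2020-01-02", "2020-01-03"), PRICE = c(30, 31), CLOSE = c(30.5, 31.5))
)

# Hypothetical helper: append the list position to every column except the merge key.
add_index_suffix <- function(df, i, key = "DATE") {
  is_key <- names(df) == key
  names(df)[!is_key] <- paste0(names(df)[!is_key], i)
  df
}

# Rename each data frame according to its position in the list.
renamed <- Map(add_index_suffix, df_list, seq_along(df_list))

# Merge the renamed data frames one after another on DATE, keeping unmatched dates.
merged <- Reduce(function(x, y) merge(x, y, by = "DATE", all = TRUE), renamed)

names(merged)
# "DATE" "PRICE1" "CLOSE1" "PRICE2" "CLOSE2" "PRICE3" "CLOSE3"
```

Renaming before merging means the merge never sees two columns with the same name, so no automatic suffixes are added.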

What about this approach? It should be fast, as it uses data.table and its fread function.

library(data.table)
merge2 <- function(mypath){
     filenames <- list.files(path=mypath, full.names=TRUE)
     fileslist <- lapply(filenames, function(nam){
            # reads the file
            file <- fread(nam)
            setnames(file, 2, "price") # renames the second col to "price"
            setnames(file, 3, "close") # third to "close"
            return(file)
     })
     dat <- rbindlist(fileslist)
     return(dat)
}  

EDIT

I just realised that you want to merge your data instead of having it in long format. What you can do is add a variable holding the file name to the data.table "file" before returning it, by adding:

file[, varnam := nam]

and then cast the final data.table "dat" to wide format before returning it, using the reshape2 library and its dcast function.
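A rough sketch of that casting step is below. Note that it swaps in data.table's own dcast rather than the reshape2 one, because the data.table version accepts several value.var columns at once. The two small tables, the "date" column name, and the labels "file1" and "file2" are hypothetical stand-ins for what fread and setnames would produce.

```r
library(data.table)

# Hypothetical stand-ins for two files after fread() and setnames().
f1 <- data.table(date = c("2020-01-01", "2020-01-02"), price = c(10, 11), close = c(10.5, 11.5))
f2 <- data.table(date = c("2020-01-01", "2020-01-02"), price = c(20, 21), close = c(20.5, 21.5))

# Tag each table with an identifier for its source (a file name in the real function).
f1[, varnam := "file1"]
f2[, varnam := "file2"]

# Stack the tables in long format, then spread price and close into one column per source.
dat <- rbindlist(list(f1, f2))
wide <- dcast(dat, date ~ varnam, value.var = c("price", "close"))

names(wide)
```

The resulting columns combine the value name and the source label, so they can be renamed afterwards with setnames if a different pattern is wanted.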

I had a similar problem. Here's what I ended up using, although there is likely a cleaner way.

The function suffix_col_names adds a suffix to a subset of columns. I use this because I eventually merge the week 1 and week 2 data on columns 1-10.

#function called suffix_col_names
suffix_col_names<-function(your_df, start_col, end_col, your_str, your_sep){
  for (i in start_col:end_col){
    colnames(your_df)[i]<-paste(colnames(your_df)[i], sep=your_sep,your_str)
  }
  return(your_df)
}
#call function to rename columns in week1 and week2
week_1_data<-suffix_col_names(week1,11,24,"1",".")
week_2_data<-suffix_col_names(week2,11,24,"2",".")
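One possibly cleaner variant is to drop the loop, since paste is vectorised over the column names. This is just a sketch: the function name suffix_cols and the small week1 data frame are made up for illustration.

```r
# Hypothetical vectorised variant: suffix the columns at positions `cols` in one step.
suffix_cols <- function(df, cols, suffix, sep = ".") {
  names(df)[cols] <- paste(names(df)[cols], suffix, sep = sep)
  df
}

# Made-up example data; columns 2 and 3 get the suffix, column 1 is left alone.
week1 <- data.frame(id = 1:2, score = c(5, 6), hours = c(1, 2))
week_1_data <- suffix_cols(week1, 2:3, "1")

names(week_1_data)
# "id" "score.1" "hours.1"
```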


 