How to rename a column inside a for loop in R?

In my folder there is a bunch of files whose names follow this pattern:

GSM123445_samples_table.txt
GSM129995_samples_table.txt
...
...
GSM129999_samples_table.txt

Inside each file, the table follows this pattern:

Identifier     VALUE
     10001   0.12323
     10002   0.11535

To create a dataframe that includes only the information I want, I loop over a list of the files in the folder, select the files I need, and read the table from each one.

I want my dataframe to look like this:

     Identifier  GSM123445  GSM129995  GSM129999  GSM130095
 1       10001     0.12323    0.14523    0.22387    0.56233
 2       10002     0.11535    0.39048    0.23437   -0.12323
 3       10006     0.12323    0.35634    0.12237   -0.12889
 4       10008     0.11535    0.23454    0.21227    0.90098

This is my code:

library(dplyr)
for (file in file_list){
  if (!exists("dataset")){     # if dataset not exists, create one
     dataset <- read.table(file, header=TRUE, sep="\t") #read txt file from folder
     x <- unlist(strsplit(file, "_"))[1] # extract the GSMxxxxxx from the name of files
     dataset <- rename(dataset, x = VALUE) # rename the column
  }     
  else {
     temp_dataset <- read.table(file, header=TRUE, sep="\t") # read file
     x <- unlist(strsplit(file, "_"))[1]
     temp_dataset <- rename(temp_dataset, x = VALUE)    
     dataset<-left_join(dataset, temp_dataset, "Reporter.Identifier")
     rm(temp_dataset)
  }
}

However, this does not work, and my dataframe looks like this:

     Identifier        x.x        x.y        x.x        x.y
 1       10001     0.12323    0.14523    0.22387    0.56233
 2       10002     0.11535    0.39048    0.23437   -0.12323

Obviously, the rename step failed.

How can I solve this problem?

The issue is that rename(dataset, x = VALUE) uses the literal name x as the new column name, not the value stored in the variable x. One way to fix this is to skip rename altogether: collect the column names in x as you go, and then set the column names of dataset once at the end using colnames:

library(dplyr)
x <- "Identifier"  ## This will hold all column names
for (file in file_list){
  if (!exists("dataset")){     # if dataset not exists, create one
     dataset <- read.table(file, header=TRUE, sep="\t") #read txt file from folder
     x <- c(x, unlist(strsplit(file, "_"))[1]) # extract the GSMxxxxxx from the file name and append it to x
  }     
  else {
     temp_dataset <- read.table(file, header=TRUE, sep="\t") # read file
     x <- c(x, unlist(strsplit(file, "_"))[1])
     dataset<-left_join(dataset, temp_dataset, "Reporter.Identifier")
     rm(temp_dataset)
  }
}
colnames(dataset) <- x
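
If you would rather keep rename, dplyr can also take the new name from a variable when you unquote it with !! on the left-hand side of :=. The following is only a minimal sketch of that alternative, not a drop-in replacement: files and tables are hypothetical in-memory stand-ins for file_list and the read.table results, and the join key is assumed to be Identifier as in the sample table (use Reporter.Identifier if that is what your files actually contain).

```r
library(dplyr)

# Hypothetical stand-ins for two of the files and their contents
files <- c("GSM123445_samples_table.txt", "GSM129995_samples_table.txt")
tables <- list(
  data.frame(Identifier = c(10001, 10002), VALUE = c(0.12323, 0.11535)),
  data.frame(Identifier = c(10001, 10002), VALUE = c(0.14523, 0.39048))
)

dataset <- NULL
for (i in seq_along(files)) {
  # extract the GSMxxxxxx prefix from the file name
  x <- unlist(strsplit(files[i], "_"))[1]
  # !!x := VALUE uses the value stored in x as the new column name
  temp_dataset <- rename(tables[[i]], !!x := VALUE)
  if (is.null(dataset)) {
    dataset <- temp_dataset   # first table: start the result
  } else {
    dataset <- left_join(dataset, temp_dataset, by = "Identifier")
  }
}

colnames(dataset)
# "Identifier" "GSM123445" "GSM129995"
```

Because each table is renamed before it is joined, the VALUE columns never collide, so no .x/.y suffixes appear. Starting from dataset <- NULL also avoids the exists("dataset") check, which can pick up a dataset object left over from an earlier run.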

Hope this helps.
