How to rename a column inside a for loop in R?
My folder contains a number of files whose names follow this pattern:
GSM123445_samples_table.txt
GSM129995_samples_table.txt
...
...
GSM129999_samples_table.txt
Inside each file, the table follows this pattern:
Identifier VALUE
10001 0.12323
10002 0.11535
To build a data frame that contains only the information I want, I loop over a list of the files in the folder, pick the files I need, and read the table from each one.
I want my data frame to look like this:
Identifier GSM123445 GSM129995 GSM129999 GSM130095
1 10001 0.12323 0.14523 0.22387 0.56233
2 10002 0.11535 0.39048 0.23437 -0.12323
3 10006 0.12323 0.35634 0.12237 -0.12889
4 10008 0.11535 0.23454 0.21227 0.90098
This is my code:
library(dplyr)
for (file in file_list){
  if (!exists("dataset")){ # if dataset does not exist, create it
    dataset <- read.table(file, header=TRUE, sep="\t") # read txt file from folder
    x <- unlist(strsplit(file, "_"))[1] # extract the GSMxxxxxx from the file name
    dataset <- rename(dataset, x = VALUE) # rename the column
  }
  else {
    temp_dataset <- read.table(file, header=TRUE, sep="\t") # read file
    x <- unlist(strsplit(file, "_"))[1]
    temp_dataset <- rename(temp_dataset, x = VALUE)
    dataset <- left_join(dataset, temp_dataset, "Reporter.Identifier")
    rm(temp_dataset)
  }
}
However, this does not give the result I expect. My data frame looks like this:
Identifier x.x x.y x.x x.y
1 10001 0.12323 0.14523 0.22387 0.56233
2 10002 0.11535 0.39048 0.23437 -0.12323
Obviously, the rename step is not working.
How can I solve this problem?
The issue is that rename(dataset, x = VALUE) uses the literal name x as the new column name, not the value stored in the variable x. One way to fix this is to skip rename altogether: collect the column names in the vector x as you go, and then set the column names of dataset once at the end using colnames:
library(dplyr)
x <- "Identifier" ## This will hold all column names
for (file in file_list){
  if (!exists("dataset")){ # if dataset does not exist, create it
    dataset <- read.table(file, header=TRUE, sep="\t") # read txt file from folder
    x <- c(x, unlist(strsplit(file, "_"))[1]) # extract the GSMxxxxxx from the file name and append it to x
  }
  else {
    temp_dataset <- read.table(file, header=TRUE, sep="\t") # read file
    x <- c(x, unlist(strsplit(file, "_"))[1])
    dataset <- left_join(dataset, temp_dataset, "Reporter.Identifier") # join key as used in the question
    rm(temp_dataset)
  }
}
colnames(dataset) <- x # assign all collected names in one step
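If you would rather rename each table as soon as it is read, here is a minimal base R sketch of that alternative. It is not the code from the question: it swaps left_join for merge, assumes the key column is called Identifier as in the sample table, and the temporary folder, the demo files, and the helper read_sample are all made up for illustration.

```r
# Hypothetical demo data: two small tab-separated files in a temp folder.
demo_dir <- file.path(tempdir(), "gsm_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.table(data.frame(Identifier = c(10001, 10002), VALUE = c(0.12323, 0.11535)),
            file.path(demo_dir, "GSM123445_samples_table.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(Identifier = c(10001, 10002), VALUE = c(0.14523, 0.39048)),
            file.path(demo_dir, "GSM129995_samples_table.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)

# Collect the matching files (full paths, sorted alphabetically).
file_list <- list.files(demo_dir, pattern = "_samples_table\\.txt$",
                        full.names = TRUE)

# Hypothetical helper: read one file and rename VALUE to the sample id.
read_sample <- function(file) {
  tbl <- read.table(file, header = TRUE, sep = "\t")
  # basename() drops the folder, so the split only sees the file name.
  sample_id <- strsplit(basename(file), "_")[[1]][1]
  # Index by name so the *value* of sample_id becomes the column name.
  names(tbl)[names(tbl) == "VALUE"] <- sample_id
  tbl
}

tables <- lapply(file_list, read_sample)

# Left-join all tables on the key column, one after another.
dataset <- Reduce(
  function(left, right) merge(left, right, by = "Identifier", all.x = TRUE),
  tables
)
print(dataset)
```

Because every table already carries its own sample name before the join, there are no clashing VALUE columns and no reliance on exists("dataset"), which can pick up a stale object left over from an earlier run. If you want to stay with dplyr, rename(dataset, !!x := VALUE) should also use the value of x as the new name.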
Hope this helps.