Error in read.table, how to set column name as row.name?

Can anyone explain what's going on here? Setting row.names = NULL makes no difference compared to when I don't specify it, yet when I set row.names = 1, it says duplicate row.names are not allowed. How do I resolve this to get column V1 as row names?

ak1a = read.table("/Users/abhaykanodia/Desktop/smallRNA/AK1a_counts.txt", row.names = NULL)
head(ak1a)
                  V1 V2
1 ENSG00000000003.15  2
2  ENSG00000000005.6  0
3 ENSG00000000419.14 21
4 ENSG00000000457.14  0
5 ENSG00000000460.17  2
6 ENSG00000000938.13  0
ak1a = read.table("/Users/abhaykanodia/Desktop/smallRNA/AK1a_counts.txt")
head(ak1a)
                  V1 V2
1 ENSG00000000003.15  2
2  ENSG00000000005.6  0
3 ENSG00000000419.14 21
4 ENSG00000000457.14  0
5 ENSG00000000460.17  2
6 ENSG00000000938.13  0
ak1a = read.table("/Users/abhaykanodia/Desktop/smallRNA/AK1a_counts.txt", row.names = 1)
Error in read.table("/Users/abhaykanodia/Desktop/smallRNA/AK1a_counts.txt",  : 
  duplicate 'row.names' are not allowed

From the help file (?read.table) you can read:

If there is a header and the first row contains one fewer field than the number of columns, the first column in the input is used for the row names. Otherwise if row.names is missing, the rows are numbered.

That explains why row.names=NULL behaves the same as leaving the argument out: your file has no header with one fewer field, so the rows are numbered in both cases.
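
A small sketch of the two cases the help text describes, using made-up inline data rather than the file from the question (the labels geneA to geneC and the column name count are assumptions for illustration only):

```r
# Case 1: the header line has one fewer field than the data lines,
# so read.table takes the first column as row names automatically.
txt_short_header <- "count
geneA 2
geneB 0
geneC 21"
auto <- read.table(text = txt_short_header, header = TRUE)
rownames(auto)   # the gene labels
names(auto)      # only the "count" column remains

# Case 2: no such header. Rows are numbered whether row.names is
# omitted or explicitly set to NULL.
txt_no_header <- "geneA 2
geneB 0
geneC 21"
default_rn <- read.table(text = txt_no_header)
null_rn    <- read.table(text = txt_no_header, row.names = NULL)
rownames(default_rn)
rownames(null_rn)
```

In the second case both calls keep the IDs in an ordinary column V1, which matches the output shown in the question.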

You can also pass row.names a vector of names, as in this example:

df <- read.table(text="V1 V2
ENSG00000000003.15  2
ENSG00000000005.6  0
ENSG00000000419.14 21
ENSG00000000457.14  0
ENSG00000000460.17  2
ENSG00000000938.13  0", header=TRUE, row.names=letters[1:6])

which displays:

                  V1 V2
a ENSG00000000003.15  2
b  ENSG00000000005.6  0
c ENSG00000000419.14 21
d ENSG00000000457.14  0
e ENSG00000000460.17  2
f ENSG00000000938.13  0

The first two executions are functionally the same: when you don't pass the row.names parameter to read.table and the file has no header with one fewer field, the rows are simply numbered, which is also what row.names=NULL does.

The third one fails for a different reason. According to ?read.table, a single number passed to row.names gives the column of the table that contains the row names, so row.names=1 asks for column V1 to be used as the row names. Row names must be unique, so the error means that at least one value in V1 appears more than once in the file.

The same error appears whenever duplicate row names are assigned to a data frame, for example:

test <- read.table(text="X Y
1 2
3 4", header=TRUE)
row.names(test) = c(1,1)

It gives the same error.

If you want to name your rows R1 to RX, why not try something like this:

ak1a = read.table("/Users/abhaykanodia/Desktop/smallRNA/AK1a_counts.txt")
row.names(ak1a) = paste0("R", seq_len(nrow(ak1a)))
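
If the goal is to keep the IDs from V1 as row names instead, a possible approach is sketched below. It assumes the error comes from IDs that are repeated in the first column (per ?read.table, a single number for row.names selects the column holding the row names). The inline data is a made-up stand-in for the counts file, and whether renaming or summing is appropriate depends on why the IDs repeat in the real data:

```r
# Toy stand-in for the counts file; the repeated ID is invented for illustration.
txt <- "ENSG00000000003.15 2
ENSG00000000005.6 0
ENSG00000000003.15 5"
counts <- read.table(text = txt, stringsAsFactors = FALSE)

# Which IDs occur more than once in the first column?
ids <- as.character(counts$V1)
dup_ids <- unique(ids[duplicated(ids)])
dup_ids

# Option 1: keep every row and make the names unique by appending a suffix.
renamed <- counts
rownames(renamed) <- make.unique(ids, sep = "_")
renamed$V1 <- NULL

# Option 2: sum the counts of repeated IDs, then use the IDs as row names.
summed <- aggregate(V2 ~ V1, data = counts, FUN = sum)
rownames(summed) <- summed$V1
summed$V1 <- NULL
```

Inspecting dup_ids first shows which entries are responsible before deciding how to handle them.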
