
Row names & column names in R

Do the following function pairs generate exactly the same results?

Pair 1) names() & colnames()

Pair 2) rownames() & row.names()

As Oscar Wilde said:

Consistency is the last refuge of the unimaginative.

R is more of an evolved than a designed language, so these things happen. names() and colnames() both work on a data.frame, but names() does not work on a matrix:

R> DF <- data.frame(foo=1:3, bar=LETTERS[1:3])
R> names(DF)
[1] "foo" "bar"
R> colnames(DF)
[1] "foo" "bar"
R> M <- matrix(1:9, ncol=3, dimnames=list(1:3, c("alpha","beta","gamma")))
R> names(M)
NULL
R> colnames(M)
[1] "alpha" "beta"  "gamma"
R> 

Just to expand a little on Dirk's example:

It helps to think of a data frame as a list of equal-length vectors. That's probably why names works with a data frame but not with a matrix.
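The list view can be checked directly. This is only a minimal sketch; DF mirrors the data frame from Dirk's example, and M is an unnamed matrix introduced here for contrast:

```r
# A data frame is stored as a list whose elements are the columns
DF <- data.frame(foo = 1:3, bar = LETTERS[1:3])

is.list(DF)          # TRUE: a data frame is a list under the hood
length(DF)           # 2: one list element per column
names(as.list(DF))   # the list element names are the column names

# A matrix is an atomic vector with a dim attribute, not a list
M <- matrix(1:9, ncol = 3)
is.list(M)           # FALSE
names(M)             # NULL: there are no per-element names to return
```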

The other useful function is dimnames, which returns the names for every dimension. You will notice that the rownames function actually just returns the first element of dimnames.
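A small sketch of that relationship for a matrix; the row labels "r1" to "r3" are made up for illustration:

```r
# Matrix with explicit row and column names (labels are hypothetical)
M <- matrix(1:9, ncol = 3,
            dimnames = list(c("r1", "r2", "r3"),
                            c("alpha", "beta", "gamma")))

dn <- dimnames(M)   # a list with one element per dimension
dn[[1]]             # the row names
dn[[2]]             # the column names

identical(rownames(M), dn[[1]])   # TRUE
identical(colnames(M), dn[[2]])   # TRUE
```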

Regarding rownames and row.names: I can't tell the difference, although rownames uses dimnames while row.names was written outside of R. They both also seem to work with higher-dimensional arrays:

> a <- array(1:5, 1:4)
> a[1,,,]
> rownames(a) <- "a"
> row.names(a)
[1] "a"
> a
, , 1, 1    
  [,1] [,2]
a    1    2

> dimnames(a)
[[1]]
[1] "a"

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

I think that using colnames and rownames makes the most sense; here's why.

Using names has several disadvantages. You have to remember that it means "column names", and it only works with data frames, so you'll need to call colnames whenever you use matrices. By calling colnames, you only have to remember one function. Finally, if you look at the code for colnames, you will see that it calls names in the case of a data frame anyway, so the output is identical.
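To illustrate the "one function to remember" point, here is a brief sketch in which DF and M are hypothetical objects that share the same column labels:

```r
DF <- data.frame(foo = 1:3, bar = LETTERS[1:3])
M  <- matrix(1:6, ncol = 2, dimnames = list(NULL, c("foo", "bar")))

# On a data frame the two functions agree
identical(names(DF), colnames(DF))   # TRUE

# On a matrix only colnames() finds the labels
colnames(M)                          # "foo" "bar"
names(M)                             # NULL

# So colnames() gives the same answer for both kinds of object
identical(colnames(DF), colnames(M)) # TRUE
```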

rownames and row.names return the same values for data frames and matrices; the only difference that I have spotted is that where there aren't any names, rownames will print "NULL" (as does colnames), but row.names returns it invisibly. Since there isn't much to choose between the two functions, rownames wins on the grounds of aesthetics, since it pairs more prettily with colnames. (Also, for the lazy programmer, you save a character of typing.)

And another expansion:
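The equivalence for named objects is easy to confirm. A minimal sketch, with DF and M as hypothetical objects carrying the row labels "a", "b" and "c":

```r
DF <- data.frame(foo = 1:3, row.names = c("a", "b", "c"))
M  <- matrix(1:6, nrow = 3, dimnames = list(c("a", "b", "c"), NULL))

# Both accessors return the same character vector for each object
identical(rownames(DF), row.names(DF))   # TRUE
identical(rownames(M),  row.names(M))    # TRUE
```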

# create dummy matrix
set.seed(10)
m <- matrix(round(runif(25, 1, 5)), 5)
d <- as.data.frame(m)

If you want to assign new column names, you can do the following on a data.frame:

# an identical effect can be achieved with colnames()   
names(d) <- LETTERS[1:5]
> d
  A B C D E
1 3 2 4 3 4
2 2 2 3 1 3
3 3 2 1 2 4
4 4 3 3 3 2
5 1 3 2 4 3

If, however, you run the previous command on a matrix, you'll mess things up:

names(m) <- LETTERS[1:5]
> m
     [,1] [,2] [,3] [,4] [,5]
[1,]    3    2    4    3    4
[2,]    2    2    3    1    3
[3,]    3    2    1    2    4
[4,]    4    3    3    3    2
[5,]    1    3    2    4    3
attr(,"names")
 [1] "A" "B" "C" "D" "E" NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 
[20] NA  NA  NA  NA  NA  NA 

Since a matrix can be regarded as a two-dimensional vector, you'll assign names only to the first five values (you don't want to do that, do you?). In this case, you should stick with colnames().

So there...
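For comparison, a sketch of the colnames() route on a matrix. The 5 x 5 matrices below use arbitrary values rather than the random ones above, and m2 is a hypothetical copy used only to show the contrast:

```r
# Hypothetical 5 x 5 matrix; the values are arbitrary
m <- matrix(1:25, nrow = 5)

# colnames<- labels the columns and leaves the cells alone
colnames(m) <- LETTERS[1:5]
colnames(m)             # "A" "B" "C" "D" "E"
is.null(names(m))       # TRUE: no per-element names were created

# For contrast, names<- on a second matrix attaches one name per cell
m2 <- matrix(1:25, nrow = 5)
names(m2) <- LETTERS[1:5]
length(names(m2))       # 25: five letters followed by twenty NAs
is.null(colnames(m2))   # TRUE: the columns are still unlabeled
```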
