
Refer to column name and row name within an apply statement in R

I have a data frame in R which looks like the one below.

    a b c d e f
    0 1 1 0 0 0
    1 1 1 1 0 1
    0 0 0 1 0 1
    1 0 0 1 0 1
    1 1 1 0 0 0

The data frame is large, with over 100 columns and 5000 rows, and contains only binary values (0s and 1s). I want to compute the overlap between every pair of columns in R, something like the matrix below. This overlap data frame will be a square matrix whose number of rows and number of columns both equal the number of columns in the first data frame.

      a b c d e f
    a 3 2 2 2 0 2
    b 2 3 3 1 0 1
    c 2 3 3 1 0 1
    d 2 1 1 3 0 3
    e 0 0 0 0 0 0
    f 2 1 1 3 0 3

Each cell of the second data frame holds the number of rows in the first data frame where both the cell's row variable and its column variable are 1.

I'm thinking of constructing an empty matrix like this:

    df <- matrix(ncol = ncol(data), nrow = ncol(data))
    colnames(df) <- names(data)
    rownames(df) <- names(data)

... and iterating over each cell of this matrix using an apply command that reads the corresponding row name (say, x) and column name (say, y) and runs a function like the one below.

    # x and y are column names passed as strings; data$x would look for a column literally named "x"
    summation <- function(x, y) sum(data[[x]] * data[[y]])

The problem is that I can't find out the row name and column name from within an apply function. Any help will be appreciated.

Any approach more efficient than what I'm thinking of is more than welcome.

You are looking for crossprod:

    crossprod(as.matrix(df1))
    #  a b c d e f
    #a 3 2 2 2 0 2
    #b 2 3 3 1 0 1
    #c 2 3 3 1 0 1
    #d 2 1 1 3 0 3
    #e 0 0 0 0 0 0
    #f 2 1 1 3 0 3
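Why this works: for a matrix m, crossprod(m) is equivalent to t(m) %*% m, so entry (i, j) is the sum over rows of column i times column j. With 0/1 data each product is 1 only when both columns are 1, so the sum is the overlap count. The sketch below rebuilds the question's example data (the names m, overlap and by_hand are illustrative only) and compares the two forms.

```r
# Example data copied from the question: 5 rows, 6 binary columns
df1 <- data.frame(
  a = c(0L, 1L, 0L, 1L, 1L),
  b = c(1L, 1L, 0L, 0L, 1L),
  c = c(1L, 1L, 0L, 0L, 1L),
  d = c(0L, 1L, 1L, 1L, 0L),
  e = c(0L, 0L, 0L, 0L, 0L),
  f = c(0L, 1L, 1L, 1L, 0L)
)

m <- as.matrix(df1)

# Cross-product of the columns: entry (i, j) = sum(m[, i] * m[, j])
overlap <- crossprod(m)

# The same computation written out as an explicit matrix product
by_hand <- t(m) %*% m

# Both forms give the same 6 x 6 matrix of co-occurrence counts
all(overlap == by_hand)

# Individual cells can be looked up by row and column name
overlap["b", "d"]  # 1: only the second row has a 1 in both b and d
```

The diagonal holds each column's own count of 1s, and the matrix is symmetric because the overlap of x with y equals the overlap of y with x.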

Data:

    df1 <- structure(list(a = c(0L, 1L, 0L, 1L, 1L), b = c(1L, 1L, 0L, 0L, 
    1L), c = c(1L, 1L, 0L, 0L, 1L), d = c(0L, 1L, 1L, 1L, 0L), e = c(0L, 
    0L, 0L, 0L, 0L), f = c(0L, 1L, 1L, 1L, 0L)), .Names = c("a", 
    "b", "c", "d", "e", "f"), class = "data.frame", row.names = c(NA, 
    -5L))
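For completeness, the name-based iteration described in the question can also be made to work, although it will likely be slower than crossprod on a 5000 x 100 input. This is a minimal sketch, not the answer's method: count_both and overlap_by_name are hypothetical names. It assumes the columns are looked up with [[ ]], because $ does not evaluate a variable holding the column name.

```r
# Example data copied from the question: 5 rows, 6 binary columns
df1 <- data.frame(
  a = c(0L, 1L, 0L, 1L, 1L),
  b = c(1L, 1L, 0L, 0L, 1L),
  c = c(1L, 1L, 0L, 0L, 1L),
  d = c(0L, 1L, 1L, 1L, 0L),
  e = c(0L, 0L, 0L, 0L, 0L),
  f = c(0L, 1L, 1L, 1L, 0L)
)

cols <- names(df1)

# Hypothetical helper: x and y are column names given as strings
count_both <- function(x, y) sum(df1[[x]] * df1[[y]])

# outer() passes whole vectors of names, so the scalar helper is
# wrapped with Vectorize() to be called once per (row, column) pair
overlap_by_name <- outer(cols, cols, Vectorize(count_both))
dimnames(overlap_by_name) <- list(cols, cols)

# Same counts as the crossprod() result
all(overlap_by_name == crossprod(as.matrix(df1)))
```

Passing the names themselves to outer() sidesteps the original problem: the function never has to discover which cell it is in, because the row and column names are its arguments.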
