How to replicate the same function across the rows of a matrix

Question

I am trying to write a loop that determines which cell in each row has the greatest value and returns the matching label ("Low", "Medium" or "High") as a string. Here is some data to try it out.

data <- matrix(c(0.3000003,0.3299896,0.3700101,
                 0.3299896,0.3700101,0.3000003,
                 0.3700101,0.3000003,0.3299896,
                 0.3000003,0.3299896,0.3700101,
                 0.3299896,0.3700101,0.3000003,
                 0.3700101,0.3000003,0.3299896),6,3)
colnames(data) <- c("Low","Medium","High")
rownames(data) <- paste("case",1:6)

> data
             Low    Medium      High
case 1 0.3000003 0.3700101 0.3299896
case 2 0.3299896 0.3000003 0.3700101
case 3 0.3700101 0.3299896 0.3000003
case 4 0.3299896 0.3000003 0.3700101
case 5 0.3700101 0.3299896 0.3000003
case 6 0.3000003 0.3700101 0.3299896

I am using this function, but it seems like it is only calculating the first row.

assign.levels <- function(data) {

  for (i in nrow(data)) {

    scored.thetas.1 <- names(which.max(data[i,1:3])) ## I wrote 1:3 here because I have multiple columns in the original dataset.
    return(scored.thetas.1)

  }
}


> assign.levels(data)
[1] "Medium"

Any thoughts?

Thanks in advance!

Answer 1

Here's a vectorized solution that you may prefer:

colnames(data)[apply(data, 1, which.max)]
# [1] "Medium" "High"   "Low"    "High"   "Low"    "Medium"

That's a concise version of your attempt: apply the function which.max to each row (dimension 1) of data and get the corresponding column name.

In terms of your attempt, here's a corrected version:
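To see what that one-liner does, it can help to split it into its two steps. A minimal sketch using the data from the question (idx and labels are illustrative names, not part of the original answer):

```r
data <- matrix(c(0.3000003,0.3299896,0.3700101,
                 0.3299896,0.3700101,0.3000003,
                 0.3700101,0.3000003,0.3299896,
                 0.3000003,0.3299896,0.3700101,
                 0.3299896,0.3700101,0.3000003,
                 0.3700101,0.3000003,0.3299896),6,3)
colnames(data) <- c("Low","Medium","High")
rownames(data) <- paste("case",1:6)

# Step 1: for each row (MARGIN = 1), find the position of the largest value
idx <- apply(data, 1, which.max)

# Step 2: use those positions to look up the column names
labels <- colnames(data)[idx]
```

Here idx holds one column position per row (named by the row names), and labels is the plain character vector of column names.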

assign.levels <- function(data) {
  scored.thetas.1 <- rep(NA, nrow(data))
  for (i in 1:nrow(data))
    scored.thetas.1[i] <- names(which.max(data[i, ]))
  scored.thetas.1
}
assign.levels(data)
# [1] "Medium" "High"   "Low"    "High"   "Low"    "Medium"

Several things to mention about your attempt: 1) you were iterating with i in nrow(data), but nrow(data) is just a single number, so the loop ran once and you were only looking at the last row; 2) you kept redefining the same variable scored.thetas.1 in every iteration (in this case there was only one iteration, but it is a bad habit); 3) a loop is not a function: you don't return anything from it (calling return() inside the loop exits the whole function on the first pass), and instead you most likely want to store your newly obtained values somewhere.
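Points 1) and 3) are easy to check on a tiny example. The sketch below uses a made-up 2 x 2 matrix m and a hypothetical helper first_only, neither of which comes from the question:

```r
# Made-up matrix: row 1 is (1, 5), row 2 is (9, 2)
m <- matrix(c(1, 9, 5, 2), nrow = 2, dimnames = list(NULL, c("a", "b")))

# Point 1: nrow(m) is the single number 2, so this loop body runs once, with i = 2
visited <- integer(0)
for (i in nrow(m)) visited <- c(visited, i)

# Point 3: return() leaves the enclosing function during the first iteration,
# so even with a correct index sequence only row 1 is ever examined
first_only <- function(m) {
  for (i in seq_len(nrow(m))) {
    return(names(which.max(m[i, ])))
  }
}
result <- first_only(m)
```

After running this, visited contains only the last row index, and result is the label for row 1 alone.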

In comparison, note that I first define a placeholder vector scored.thetas.1 of length nrow(data), filled with NA. Then I iterate over all the rows (1:nrow(data)) and store the value for each row in scored.thetas.1[i].
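One optional refinement, offered as a sketch rather than as part of the original answer: seq_len(nrow(data)) behaves better than 1:nrow(data) when the matrix has zero rows, because 1:0 counts down to c(1, 0) instead of being empty. The function name assign_levels_safe and the example matrix m are hypothetical:

```r
assign_levels_safe <- function(data) {
  out <- character(nrow(data))          # preallocate one slot per row
  for (i in seq_len(nrow(data))) {      # empty sequence when nrow(data) == 0
    out[i] <- colnames(data)[which.max(data[i, ])]
  }
  out
}

# Made-up example matrix
m <- matrix(c(0.1, 0.7, 0.2,
              0.6, 0.3, 0.1), nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("Low", "Medium", "High")))

res <- assign_levels_safe(m)                           # one label per row
res_empty <- assign_levels_safe(m[0, , drop = FALSE])  # zero-row input, no error
```

For matrices that always have at least one row, this gives the same result as the loop above.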

Answer 2

This should be fast:

colnames(data)[max.col(data)]
#[1] "Medium" "High"   "Low"    "High"   "Low"    "Medium"
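One caveat worth knowing: by default max.col breaks ties at random, whereas which.max always returns the first maximum. If rows can contain tied maxima and you want the same answer as the which.max approach, passing ties.method = "first" should do it. A small sketch with a made-up matrix m that contains exact ties:

```r
# Made-up matrix: row 1 ties between Medium and High, row 2 between Low and Medium
m <- matrix(c(0.2, 0.5, 0.5,
              0.4, 0.4, 0.2), nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("Low", "Medium", "High")))

# Deterministic: always pick the first column among tied maxima
first_max <- colnames(m)[max.col(m, ties.method = "first")]

# Reference result from the which.max approach
via_which_max <- colnames(m)[apply(m, 1, which.max)]
```

The data in the question has no ties, so the default gives the same result there.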

Here is a little benchmark.

n <- 1e6
set.seed(1)
data <- matrix(runif(n * 3), ncol = 3)
colnames(data) <- c("Low","Medium","High")

library(microbenchmark)
library(ggplot2) # provides the autoplot() generic used below

benchmark <- microbenchmark(
  OP = assign.levels(data), # as defined in Julius's answer
  Julius = colnames(data)[apply(data, 1, which.max)],
  markus = colnames(data)[max.col(data)], times = 20
)

autoplot(benchmark)

Answer 1
2 2018-10-30 20:33:06

Answer 2
2 Accepted 2018-10-30 20:35:56
