Creating a matrix of multiple counters in R

My goal is to take an input vector and build an output matrix of running counters, one column per distinct value. Every time a value appears in the input, I want to find its counter and increment it by 1. That is hard to explain in words, so a simple version is illustrated below. I also want to make 2 changes, which I list after the example so that they make sense.

nums = c(1,2,3,4,5,1,2,4,3,5)
unis = unique(nums)
counter = matrix(NA, nrow = length(nums), ncol = length(unis))
colnames(counter) = unis
for (i in 1:length(nums)){
  temp = nums[i]
  if (i == 1){
    counter[1,] = 0
    counter[1,temp] = 1
  } else {
    counter[i,] = counter[i-1,]
    counter[i,temp] = counter[i-1,temp]+1
  }
}
counter

which outputs

 > counter
      1 2 3 4 5
 [1,] 1 0 0 0 0
 [2,] 1 1 0 0 0
 [3,] 1 1 1 0 0
 [4,] 1 1 1 1 0
 [5,] 1 1 1 1 1
 [6,] 2 1 1 1 1
 [7,] 2 2 1 1 1
 [8,] 2 2 1 2 1
 [9,] 2 2 2 2 1
[10,] 2 2 2 2 2

The 2 modifications:

1) Since the real data is much larger, I would like to do this with apply, or with whatever approach people who know R better than I do recommend.

2) Here each element of the input vector is a single value. How could this be generalized if an element could be a tuple? For example, if the last element of nums were the tuple (4, 5), both counters would be incremented in that step and the last row of the output would be 2,2,2,3,2.

Thanks. If anything is unclear, please ask and I'll try to clarify.

Using the Matrix package (which ships with a standard installation of R):

nums <- c(1,2,3,4,5,1,2,4,3,5)
apply(Matrix::sparseMatrix(i=seq_along(nums), j=nums), 2, cumsum)
#      [,1] [,2] [,3] [,4] [,5]
#  [1,]    1    0    0    0    0
#  [2,]    1    1    0    0    0
#  [3,]    1    1    1    0    0
#  [4,]    1    1    1    1    0
#  [5,]    1    1    1    1    1
#  [6,]    2    1    1    1    1
#  [7,]    2    2    1    1    1
#  [8,]    2    2    1    2    1
#  [9,]    2    2    2    2    1
# [10,]    2    2    2    2    2

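Note that this behaves a bit differently from thelatemail's suggested solution in a couple of ways: it produces one column for every integer from 1 to max(nums), ordered by value, whereas thelatemail's solution produces one column per distinct value, in order of first appearance. Which behavior you prefer will depend on what you are using this for.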

Here's a small example that illustrates the differences:

nums <- c(5,2,1,1)

# My suggestion
apply(Matrix::sparseMatrix(i=seq_along(nums), j=nums), 2, cumsum)
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    0    0    0    0    1
# [2,]    0    1    0    0    1
# [3,]    1    1    0    0    1
# [4,]    2    1    0    0    1

# @thelatemail's suggestion
sapply(unique(nums), function(x) cumsum(nums==x) )
#      [,1] [,2] [,3]
# [1,]    1    0    0
# [2,]    1    1    0
# [3,]    1    1    1
# [4,]    1    1    2

For your second question, you could do something like this:

nums <- list(1,2,3,4,5,1,2,4,3,c(4,5))

ii <- rep(seq_along(nums), times=lengths(nums)) ## lengths() is in R>=3.2.0
jj <- unlist(nums)
apply(Matrix::sparseMatrix(i=ii, j=jj), 2, cumsum)
#       [,1] [,2] [,3] [,4] [,5]
#  [1,]    1    0    0    0    0
#  [2,]    1    1    0    0    0
#  [3,]    1    1    1    0    0
#  [4,]    1    1    1    1    0
#  [5,]    1    1    1    1    1
#  [6,]    2    1    1    1    1
#  [7,]    2    2    1    1    1
#  [8,]    2    2    1    2    1
#  [9,]    2    2    2    2    1
# [10,]    2    2    2    3    2
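
To make the indexing above easier to follow, here is a small sketch of what ii and jj contain for a shorter input, together with a dense base R equivalent that does not use the Matrix package. The dense matrix hits is an illustrative assumption, not part of the solution above, and it may be too large for very big inputs.

```r
nums <- list(1, 2, c(4, 5))

ii <- rep(seq_along(nums), times = lengths(nums))  # row of each value: 1 2 3 3
jj <- unlist(nums)                                 # the values themselves: 1 2 4 5

# Dense indicator matrix: hits[r, c] is 1 when value c occurs in element r.
# A value repeated inside one element is only marked once here.
hits <- matrix(0L, nrow = length(nums), ncol = max(jj))
hits[cbind(ii, jj)] <- 1L

counter <- apply(hits, 2, cumsum)  # running count down each column
counter
```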

For your first query, you can get there with something like:

sapply(unique(nums), function(x) cumsum(nums==x) )

 #      [,1] [,2] [,3] [,4] [,5]
 # [1,]    1    0    0    0    0
 # [2,]    1    1    0    0    0
 # [3,]    1    1    1    0    0
 # [4,]    1    1    1    1    0
 # [5,]    1    1    1    1    1
 # [6,]    2    1    1    1    1
 # [7,]    2    2    1    1    1
 # [8,]    2    2    1    2    1
 # [9,]    2    2    2    2    1
 #[10,]    2    2    2    2    2
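
The comparison nums == x assumes nums is a plain vector. For the tuple case, a rough sketch along the same lines is to test membership with %in% for each list element instead. The names vals and counter are illustrative, and the columns here are sorted by value rather than by first appearance.

```r
nums <- list(1, 2, 3, 4, 5, 1, 2, 4, 3, c(4, 5))

vals <- sort(unique(unlist(nums)))   # every distinct value across all elements

counter <- sapply(vals, function(v) {
  # TRUE for each element of nums that contains v, then a running count
  cumsum(vapply(nums, function(el) v %in% el, logical(1)))
})
colnames(counter) <- vals
counter[nrow(counter), ]   # final counts per value
```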

Another idea:

do.call(rbind, Reduce("+", lapply(nums, tabulate, max(unlist(nums))), accumulate = TRUE))
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    0    0    0    0
# [2,]    1    1    0    0    0
# [3,]    1    1    1    0    0
# [4,]    1    1    1    1    0
# [5,]    1    1    1    1    1
# [6,]    2    1    1    1    1
# [7,]    2    2    1    1    1
# [8,]    2    2    1    2    1
# [9,]    2    2    2    2    1
#[10,]    2    2    2    2    2
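
The one-liner above packs three steps together. As a sketch on a shorter, made-up input, they can be unpacked like this (steps and running are illustrative names):

```r
nums <- c(1, 2, 1)

# Step 1: one indicator vector per element; tabulate() counts each value 1..nbins
steps <- lapply(nums, tabulate, nbins = 2)

# Step 2: running element-wise sums of those vectors
running <- Reduce("+", steps, accumulate = TRUE)

# Step 3: stack the running sums into a matrix, one row per input element
do.call(rbind, running)
```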

And more generally:

x = list(1, 3, 6, c(6, 3), 2, c(4, 6, 1), c(1, 2), 3)
do.call(rbind, Reduce("+", lapply(x, tabulate, max(unlist(x))), accumulate = TRUE))
#     [,1] [,2] [,3] [,4] [,5] [,6]
#[1,]    1    0    0    0    0    0
#[2,]    1    0    1    0    0    0
#[3,]    1    0    1    0    0    1
#[4,]    1    0    2    0    0    2
#[5,]    1    1    2    0    0    2
#[6,]    2    1    2    1    0    3
#[7,]    3    2    2    1    0    3
#[8,]    3    2    3    1    0    3
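
A possible variant of the same idea, offered only as a sketch: build the per-step counts as a matrix with vapply() and then take column-wise cumulative sums. The names nbins, per_step and counter are illustrative. Because tabulate() counts occurrences, a value repeated inside one tuple, such as c(3, 3), would add 2 in that step.

```r
x <- list(1, 3, 6, c(6, 3), 2, c(4, 6, 1), c(1, 2), 3)
nbins <- max(unlist(x))

# One row per element of x, one column per value 1..nbins
per_step <- t(vapply(x, tabulate, integer(nbins), nbins = nbins))

counter <- apply(per_step, 2, cumsum)  # running totals down each column
counter
```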
