
How to sample rowname-colname pairs from a crosstab (or dimname groups from an n-dimensional array) in R?

In R it is quite trivial to "collapse" an n-dimensional array into a one-dimensional vector and sample from it, e.g. with the sample() function in base R.

However, I would like to sample dimnames groups (i.e. rowname-colname pairs in the case of a two-dimensional array) based on the cell frequencies.

As an example, assume we have the following crosstab (the data, n = 70, is randomly generated):

              Man   Woman
Smoking        10      20
Non-smoking    15      25

How do I sample from this so that I get:

  • "Smoking Man" with probability 10 / 70
  • "Non-smoking Man" with probability 15 / 70
  • "Smoking Woman" with probability 20 / 70
  • "Non-smoking Woman" with probability 25 / 70

The easiest way would probably be to group the dimnames (somehow) and use the result as the first argument of the sample function, i.e.:

sample(x = vectorOfGroupedDimnames, size = 1, prob = c(crosstabAsMatrix))

Yes, I know that the variable vectorOfGroupedDimnames can be built using nested for loops, but there has to be a more elegant way of doing this.

So what is the easiest way to do this? Thanks.

Maybe this will help you:

library(dplyr)

data <-
  structure(c(25L, 20L, 15L, 10L), .Dim = c(2L, 2L), .Dimnames = list(
    smoke = c("Non-smoking", "Smoking"), sex = c("Female", "Male"
    )), class = "table")

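# as_tibble() turns the table into one row per cell: columns smoke, sex, n
data %>%
  as_tibble() %>%
  # draw one row, weighting each cell by its count n
  sample_n(size = 1, weight = n, replace = TRUE)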
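
For comparison, here is a minimal base R sketch of the same idea, without dplyr. It is not part of the answer above; the names tab, labels, cells, draw and picked are made up for illustration. It assumes the crosstab is stored as a matrix with dimnames, and relies on sample() treating prob as weights that need not sum to one.

```r
# Crosstab from the question; matrix() fills column by column.
tab <- matrix(c(10, 15, 20, 25), nrow = 2,
              dimnames = list(c("Smoking", "Non-smoking"), c("Man", "Woman")))

# Two-dimensional case: outer() builds every rowname-colname label, and c()
# flattens them in the same column-major order that c(tab) uses for the counts.
labels <- c(outer(rownames(tab), colnames(tab), paste))
draw <- sample(labels, size = 1, prob = c(tab))

# General case (also works for n-dimensional arrays): as.data.frame() on a
# table gives one row per cell, one column per dimension, plus a Freq column.
cells <- as.data.frame(as.table(tab), stringsAsFactors = FALSE)
picked <- cells[sample(nrow(cells), size = 1, prob = cells$Freq), ]

print(draw)
print(picked)
```

The second variant avoids pasting names together, so each dimension stays in its own column, which may be more convenient when the array has more than two dimensions.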
