Extract all possible pairs within group in data.table

I am struggling to obtain all possible permutations within each group using data.table.

To explain my problem, here is an example.

x <- c(1, 1, 1, 2, 2)
y <- c('red', 'blue', 'black', 'orange', 'red')
dt1 <- as.data.table(cbind(x,y))
dt1
   x      y
1: 1    red
2: 1   blue
3: 1  black
4: 2 orange
5: 2    red

Now I want to see every possible ordered pair of colors (y) within each group (x). My ideal result would be:

x   y1      y2
1   black   blue
1   black   red
1   blue    black
1   blue    red
1   red     black
1   red     blue
2   orange  red
2   red     orange

While searching for a solution, I found a function, permutations, which does what I am looking for, but I find it hard to fit it into the data.table framework.

library(gtools)  # provides permutations()
y <- c('red', 'blue', 'black')
permutations(n=3, r=2, v=y, repeats.allowed=FALSE)

     [,1]    [,2]   
[1,] "black" "blue" 
[2,] "black" "red"  
[3,] "blue"  "black"
[4,] "blue"  "red"  
[5,] "red"   "black"
[6,] "red"   "blue" 

So I tried the following, but it produces errors:

dt1[, .(j = lapply(.SD, permutations, n=.N, r=2, v=y, repeats.allowed=F)), by=x]

Any suggestions? I would really appreciate it.

First off, don't use as.data.table(cbind(...)) to create the data table. Because cbind coerces its arguments to a matrix, you will get unexpected column classes. Use

dt1 <- data.table(x, y)
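To make the coercion concrete, here is a small sketch that compares the two constructions using the x and y vectors from the question. The names m, dt_bad and dt_good are only illustrative and do not appear in the original post.

```r
library(data.table)

x <- c(1, 1, 1, 2, 2)
y <- c('red', 'blue', 'black', 'orange', 'red')

# cbind() builds a matrix, and a matrix holds a single type,
# so the numeric x is coerced to character alongside y.
m <- cbind(x, y)
typeof(m)

dt_bad  <- as.data.table(m)   # x is no longer numeric here
dt_good <- data.table(x, y)   # x stays numeric, y stays character

# Compare the column classes of the two tables.
sapply(dt_bad, class)
sapply(dt_good, class)
```

Building the table directly from the vectors keeps each column's own type, which matters as soon as x is used for arithmetic or numeric comparisons.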

That said, you can do

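dt1[, {
        # all ordered pairs of the colours in the current group
        p <- gtools::permutations(n = .N, r = 2, v = y, repeats.allowed = FALSE)
        # p is a matrix: column 1 is the first colour, column 2 the second
        .(y1 = p[, 1], y2 = p[, 2])
    }, by = x]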

which gives

   x     y1     y2
1: 1  black   blue
2: 1  black    red
3: 1   blue  black
4: 1   blue    red
5: 1    red  black
6: 1    red   blue
7: 2 orange    red
8: 2    red orange

There is no need to loop, since we are operating on groups. permutations returns a matrix, so we build the desired result columns from the columns of that matrix.
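As a rough illustration of that last point, the sketch below runs permutations on a single group's colours outside of data.table and pulls the two matrix columns apart. The name pairs is only illustrative; inside the grouped call the same list is what becomes the y1 and y2 columns.

```r
library(gtools)

y <- c('red', 'blue', 'black')

# All ordered pairs (r = 2) drawn from the 3 distinct values in y.
p <- permutations(n = length(y), r = 2, v = y, repeats.allowed = FALSE)

is.matrix(p)   # one row per pair, one column per position in the pair
dim(p)

# Each matrix column becomes one output column.
pairs <- list(y1 = p[, 1], y2 = p[, 2])
pairs
```

Returning a named list from j is what lets data.table stack the per-group results into the y1 and y2 columns next to the grouping column x.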
