R all possible combinations

I have the following data frame:

> my.df
          x         y
1 0.4597406 0.8439140
2 0.4579697 0.7461805
3 0.5593259 0.6646701
4 0.3607346 0.7792931
5 0.8377520 1.0445919
6 0.5597406 1.0445919

I want to create all possible combinations of x and y:

> my.df
          x         y
1 0.4597406 0.8439140
2 0.4597406 0.7461805
3 0.4597406 0.6646701
4 0.4597406 0.7792931
5 0.4597406 1.0445919
6 0.4597406 1.0445919
7 0.4579697 0.8439140
8 0.4579697 0.7461805
9 0.4579697 0.6646701
... 
(Not all combinations are shown here; this only illustrates the format I would like the resulting data frame to have.)

Using the following function didn't give exactly the combinations I expected:

expand.grid(my.df)

What's the best way to generate all possible combinations?

Maybe we can use expand.grid in the following way:

expand.grid(x = my.df$x, y = my.df$y)

We can just use expand.grid:

res <- expand.grid(my.df)
dim(res)
#[1] 36  2
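
One detail worth noting: expand.grid() varies its first argument fastest, whereas the layout in the question holds x fixed while y cycles through its values. A minimal sketch, assuming the my.df from the question, that passes y first and then restores the column order:

```r
# Data from the question
my.df <- data.frame(
  x = c(0.4597406, 0.4579697, 0.5593259, 0.3607346, 0.8377520, 0.5597406),
  y = c(0.8439140, 0.7461805, 0.6646701, 0.7792931, 1.0445919, 1.0445919)
)

# expand.grid() varies the first factor fastest, so pass y first ...
res <- expand.grid(y = my.df$y, x = my.df$x)
# ... then put the columns back in x, y order
res <- res[, c("x", "y")]

nrow(res)      # 6 * 6 = 36 rows
head(res, 6)   # x stays at 0.4597406 while y runs through all six values
```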

Or with data.table:

library(data.table)
setDT(my.df)[,CJ(x,y)]
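
Two things are easy to miss with the data.table call: setDT() converts my.df in place, and CJ() sorts its inputs by default. A hedged sketch, assuming data.table is installed, that calls CJ() on the columns directly and uses its sorted and unique arguments (the result names are only illustrative):

```r
# Data from the question
my.df <- data.frame(
  x = c(0.4597406, 0.4579697, 0.5593259, 0.3607346, 0.8377520, 0.5597406),
  y = c(0.8439140, 0.7461805, 0.6646701, 0.7792931, 1.0445919, 1.0445919)
)

if (requireNamespace("data.table", quietly = TRUE)) {
  # Calling CJ() on the columns leaves my.df an ordinary data.frame
  all_sorted <- data.table::CJ(x = my.df$x, y = my.df$y)                  # sorted by x, then y
  all_as_is  <- data.table::CJ(x = my.df$x, y = my.df$y, sorted = FALSE)  # input order kept
  uniq       <- data.table::CJ(x = my.df$x, y = my.df$y, unique = TRUE)   # duplicates dropped first
  print(dim(all_sorted))
  print(dim(uniq))
}
```

With sorted = FALSE the last column varies fastest, so x is held fixed while y cycles, which is the layout shown in the question.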

A cross join is helpful in this situation. Since you didn't provide a reproducible example, I have created my own dataset.

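# Build a small example data set
df <- data.frame(x = runif(5), y = runif(5))
# One single-column table per variable, so the join returns columns x and y
xx <- data.frame(x = df$x)
yy <- data.frame(y = df$y)
library(sqldf)
sqldf("SELECT * FROM xx CROSS JOIN yy")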

expand.grid() will give you all the possible combinations but not the unique combinations. If you need the latter, you can use a function like this:

unique_comb <- function(data){
   # Unique values of each column
   x.cur <- unique(data$x)
   y.cur <- unique(data$y)
   n.x <- length(x.cur)
   n.y <- length(y.cur)
   # One row per (x, y) pair
   matrix.com <- matrix(0, ncol=2, nrow=n.x*n.y)
   ind <- 1
   # seq_len() is safe even when a column is empty
   for(i in seq_len(n.x)){
       for(j in seq_len(n.y)){
          matrix.com[ind,] <- c(x.cur[i], y.cur[j])
          ind <- ind+1
       }
   }
   return(matrix.com)
}

Or, as JTT points out, this can be done in one line with

expand.grid(unique(data$x),unique(data$y)) 
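
To make the distinction concrete: the question's y column contains 1.0445919 twice, so the full grid has 6 * 6 = 36 rows while the grid of unique values has 6 * 5 = 30. A small sketch using the question's data (the names all_comb and uniq_comb are just illustrative):

```r
# Data from the question
my.df <- data.frame(
  x = c(0.4597406, 0.4579697, 0.5593259, 0.3607346, 0.8377520, 0.5597406),
  y = c(0.8439140, 0.7461805, 0.6646701, 0.7792931, 1.0445919, 1.0445919)
)

all_comb  <- expand.grid(x = my.df$x, y = my.df$y)                  # keeps the repeated y value
uniq_comb <- expand.grid(x = unique(my.df$x), y = unique(my.df$y))  # drops it first

nrow(all_comb)              # 36
nrow(uniq_comb)             # 30
sum(duplicated(all_comb))   # 6 repeated rows, one per x value
```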

You could use the merge function this way:

dat <- cars[1:6,1:2]
dat
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
6     9   10

merge(dat$speed,dat$dist,by=NULL)
   x  y
1  4  2
2  4  2
3  7  2
4  7  2
5  8  2
6  9  2
7  4 10
8  4 10
9  7 10
10 7 10
11 8 10
12 9 10
13 4  4
14 4  4
15 7  4
16 7  4
17 8  4
18 9  4
19 4 22
20 4 22
21 7 22
22 7 22
23 8 22
24 9 22
25 4 16
26 4 16
27 7 16
28 7 16
29 8 16
30 9 16
31 4 10
32 4 10
33 7 10
34 7 10
35 8 10
36 9 10
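
When bare vectors are passed, merge() labels the result columns x and y after its own argument names. A sketch, under the assumption that you would rather keep the original column names, wraps each vector in a one-column data frame first:

```r
dat <- cars[1:6, 1:2]   # built-in cars data, as above

# by = NULL means there is no key column, so merge() returns the Cartesian product
res <- merge(data.frame(speed = dat$speed), data.frame(dist = dat$dist), by = NULL)

dim(res)     # 36 rows, 2 columns
names(res)   # "speed" "dist"
```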

I know everyone's throwing expand.grid() at you, so here's another option...

my.df <- structure(list(x = c(0.4597406, 0.4579697, 0.5593259, 0.3607346, 0.837752, 0.5597406), 
                        y = c(0.843914, 0.7461805, 0.6646701, 0.7792931, 1.0445919, 1.0445919)), 
                   .Names = c("x", "y"), row.names = c(NA, -6L), class = "data.frame")

my.df
#>           x         y
#> 1 0.4597406 0.8439140
#> 2 0.4579697 0.7461805
#> 3 0.5593259 0.6646701
#> 4 0.3607346 0.7792931
#> 5 0.8377520 1.0445919
#> 6 0.5597406 1.0445919

tidyr has a complete() function that "completes" your data combinations, which I believe is what you're after.

tidyr::complete(my.df, x, y)
#> # A tibble: 30 x 2
#>            x         y
#>        <dbl>     <dbl>
#> 1  0.3607346 0.6646701
#> 2  0.3607346 0.7461805
#> 3  0.3607346 0.7792931
#> 4  0.3607346 0.8439140
#> 5  0.3607346 1.0445919
#> 6  0.4579697 0.6646701
#> 7  0.4579697 0.7461805
#> 8  0.4579697 0.7792931
#> 9  0.4579697 0.8439140
#> 10 0.4579697 1.0445919
#> # ... with 20 more rows

Note: this produces the unique combinations; rows 5 and 6 of your expected output are identical.
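
tidyr also has helpers that work on vectors rather than on an existing data frame. A hedged sketch, assuming tidyr 1.0.0 or later is installed: expand_grid() keeps duplicates, while crossing() de-duplicates and sorts its inputs (the result names are only illustrative).

```r
# Data from the question
my.df <- data.frame(
  x = c(0.4597406, 0.4579697, 0.5593259, 0.3607346, 0.8377520, 0.5597406),
  y = c(0.8439140, 0.7461805, 0.6646701, 0.7792931, 1.0445919, 1.0445919)
)

if (requireNamespace("tidyr", quietly = TRUE)) {
  all_comb  <- tidyr::expand_grid(x = my.df$x, y = my.df$y)  # every pairing, duplicates kept
  uniq_comb <- tidyr::crossing(x = my.df$x, y = my.df$y)     # unique pairings, sorted
  print(nrow(all_comb))
  print(nrow(uniq_comb))
}
```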
