R all possible combinations
I have the following data frame:
> my.df
x y
1 0.4597406 0.8439140
2 0.4579697 0.7461805
3 0.5593259 0.6646701
4 0.3607346 0.7792931
5 0.8377520 1.0445919
6 0.5597406 1.0445919
I want to create all possible combinations of x and y:
> my.df
x y
1 0.4597406 0.8439140
2 0.4597406 0.7461805
3 0.4597406 0.6646701
4 0.4597406 0.7792931
5 0.4597406 1.0445919
6 0.4597406 1.0445919
7 0.4579697 0.8439140
8 0.4579697 0.7461805
9 0.4579697 0.6646701
...
(Not all the combinations are shown here. This is only to show the format I would like the resulting data frame to have.)
Using the following function didn't give the combinations I expected:
expand.grid(my.df)
What's the best way to generate all possible combinations?
Maybe we can use expand.grid in the following way:
expand.grid(x = my.df$x, y = my.df$y)
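One thing to watch is row order: expand.grid varies its first argument fastest, whereas the desired output in the question holds x fixed while y cycles. A minimal sketch, assuming the my.df from the question (rebuilt here so the snippet runs on its own; `res` is just an illustrative name), passes y first and then puts the columns back in x, y order:

```r
# Rebuild the question's data frame so the snippet is self-contained.
my.df <- data.frame(
  x = c(0.4597406, 0.4579697, 0.5593259, 0.3607346, 0.8377520, 0.5597406),
  y = c(0.8439140, 0.7461805, 0.6646701, 0.7792931, 1.0445919, 1.0445919)
)

# expand.grid varies its first argument fastest, so pass y first ...
res <- expand.grid(y = my.df$y, x = my.df$x)
# ... then restore the x, y column order.
res <- res[, c("x", "y")]

nrow(res)     # 6 * 6 = 36 rows
head(res, 7)  # x stays at its first value for rows 1-6, then moves on
```

If the row order does not matter, the plain expand.grid(x = my.df$x, y = my.df$y) call contains the same 36 pairs.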
We can just use expand.grid:
res <- expand.grid(my.df)
dim(res)
#[1] 36 2
Or with data.table:
library(data.table)
setDT(my.df)[,CJ(x,y)]
A cross join is helpful in this situation. Since you didn't provide a reproducible example, I have created my own dataset.
df=data.frame(x=runif(5), y=runif(5))
xx=data.frame(df$x)
yy=data.frame(df$y)
library(sqldf)
sqldf("SELECT * FROM xx CROSS JOIN yy")
expand.grid() will give you all the possible combinations, but not only the unique ones: values that are duplicated in the input produce duplicated rows. If you need the unique combinations, you can use a function like this:
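unique_comb <- function(data) {
  # Distinct values of each column
  x.cur <- unique(data$x)
  y.cur <- unique(data$y)
  n.x <- length(x.cur)
  n.y <- length(y.cur)
  # One row per (x, y) pair
  matrix.com <- matrix(0, ncol = 2, nrow = n.x * n.y)
  ind <- 1
  # seq_len() avoids the 1:0 trap when a column is empty
  for (i in seq_len(n.x)) {
    for (j in seq_len(n.y)) {
      matrix.com[ind, ] <- c(x.cur[i], y.cur[j])
      ind <- ind + 1
    }
  }
  return(matrix.com)
}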
Or, as JTT points out, this can be done in one line with:
expand.grid(unique(data$x),unique(data$y))
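To make the difference between "all" and "unique" combinations concrete, here is a small sketch using the question's data (rebuilt so the snippet runs on its own; `all_comb` and `uniq_comb` are illustrative names). The y column contains 1.0445919 twice, so the full grid has six duplicated rows:

```r
# The question's data: x has 6 distinct values, y has only 5.
my.df <- data.frame(
  x = c(0.4597406, 0.4579697, 0.5593259, 0.3607346, 0.8377520, 0.5597406),
  y = c(0.8439140, 0.7461805, 0.6646701, 0.7792931, 1.0445919, 1.0445919)
)

# Every pairing of the 6 x values with the 6 y values.
all_comb <- expand.grid(x = my.df$x, y = my.df$y)

# Pairings of distinct values only.
uniq_comb <- expand.grid(x = unique(my.df$x), y = unique(my.df$y))

nrow(all_comb)          # 36
nrow(uniq_comb)         # 30
nrow(unique(all_comb))  # 30: dropping duplicated rows gives the same count
```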
You could use the merge function this way:
dat <- cars[1:6,1:2]
dat
speed dist
1 4 2
2 4 10
3 7 4
4 7 22
5 8 16
6 9 10
merge(dat$speed,dat$dist,by=NULL)
x y
1 4 2
2 4 2
3 7 2
4 7 2
5 8 2
6 9 2
7 4 10
8 4 10
9 7 10
10 7 10
11 8 10
12 9 10
13 4 4
14 4 4
15 7 4
16 7 4
17 8 4
18 9 4
19 4 22
20 4 22
21 7 22
22 7 22
23 8 22
24 9 22
25 4 16
26 4 16
27 7 16
28 7 16
29 8 16
30 9 16
31 4 10
32 4 10
33 7 10
34 7 10
35 8 10
36 9 10
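The result columns are called x and y because merge() was handed bare vectors. A small variation, assuming you would rather keep the original column names, is to pass one-column data frames instead; by = NULL still requests the Cartesian product (`res` is an illustrative name):

```r
# Same six rows of the built-in cars data set as above.
dat <- cars[1:6, 1:2]

# dat["speed"] and dat["dist"] are one-column data frames, so their
# column names survive the cross join.
res <- merge(dat["speed"], dat["dist"], by = NULL)

names(res)  # "speed" "dist"
nrow(res)   # 36
```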
I know everyone's throwing expand.grid() at you, so here's another option...
my.df <- structure(list(x = c(0.4597406, 0.4579697, 0.5593259, 0.3607346, 0.837752, 0.5597406),
y = c(0.843914, 0.7461805, 0.6646701, 0.7792931, 1.0445919, 1.0445919)),
.Names = c("x", "y"), row.names = c(NA, -6L), class = "data.frame")
my.df
#> x y
#> 1 0.4597406 0.8439140
#> 2 0.4579697 0.7461805
#> 3 0.5593259 0.6646701
#> 4 0.3607346 0.7792931
#> 5 0.8377520 1.0445919
#> 6 0.5597406 1.0445919
tidyr has a complete() function which "completes" your data combinations, which I believe is what you're after.
tidyr::complete(my.df, x, y)
#> # A tibble: 30 x 2
#> x y
#> <dbl> <dbl>
#> 1 0.3607346 0.6646701
#> 2 0.3607346 0.7461805
#> 3 0.3607346 0.7792931
#> 4 0.3607346 0.8439140
#> 5 0.3607346 1.0445919
#> 6 0.4579697 0.6646701
#> 7 0.4579697 0.7461805
#> 8 0.4579697 0.7792931
#> 9 0.4579697 0.8439140
#> 10 0.4579697 1.0445919
#> # ... with 20 more rows
Note: this produces the unique combinations, which is why there are 30 rows rather than 36. Rows 5 and 6 of your expected output are identical.