combining join with “or” in data.table package

Question

dt <- data.table(X=rnorm(10),a=rep(0:1,length=10),b=rep(0:1,each=5))
dt
             X a b
1:  0.08848742 0 0
2: -1.36578648 1 0
3: -1.01563937 0 0
4:  0.36562936 1 0
5:  2.04250239 0 0
6:  1.33698124 1 1
7: -1.38358719 0 1
8: -0.14395236 1 1
9: -1.36277622 0 1
10:  0.40818281 1 1    

setkey(dt,a,b)
dt[J(1,1),]

This gets all rows where both a and b are 1. Is there a way to pick the rows where either a or b is 1? In other words, to get all rows in dt except rows 1, 3 and 5?

Answer 1

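I don't think there's a direct way to do an OR operation in a keyed join. However, you can use De Morgan's law, (A OR B) == !(!A AND !B). Since a and b only take the values 0 and 1 here, "not a" means a == 0 and "not b" means b == 0, so what you need is the not-join !J(0, 0).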

Just do:

dt[!J(0, 0)]

            X a b
1:  0.7768113 0 1
2:  0.2439950 0 1
3: -0.2095353 1 0
4:  2.9267934 1 0
5: -0.1437019 1 1
6:  1.5120883 1 1
7: -0.4462240 1 1
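As a quick sanity check, the not-join can be compared against a plain logical filter. This is only a sketch: the set.seed() call and the names not_join and filtered are additions for illustration, and it relies on a and b being restricted to 0 and 1, as in the question.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the run is reproducible
dt <- data.table(X = rnorm(10),
                 a = rep(0:1, length.out = 10),
                 b = rep(0:1, each = 5))
setkey(dt, a, b)

# Not-join: drop every row whose key is (a = 0, b = 0)
not_join <- dt[!J(0, 0)]

# Ordinary logical filter, used here only as a reference result
filtered <- dt[a == 1 | b == 1]

nrow(not_join)                          # 7 rows remain
identical(not_join$X, filtered$X)       # same rows, same order
```

If a or b could take other values, !J(0, 0) would also keep rows such as a = 2, b = 0, so the equivalence would no longer hold.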

Answer 2

I've been doing this sort of thing lately:

kvals = CJ(a=0:1,b=0:1)
dt[kvals[a|b]]

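"kvals" stores all possible values of the key. As far as I can tell, CJ is much like expand.grid: it takes all combinations of the vectors passed to it, but returns them as a keyed (sorted) data.table. kvals[a|b] then keeps the combinations where a or b is nonzero, and dt is joined to those.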
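To make the intermediate steps visible, here is a small sketch of the same idea split into pieces. The set.seed() call and the names wanted and res are assumptions added for illustration.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the run is reproducible
dt <- data.table(X = rnorm(10),
                 a = rep(0:1, length.out = 10),
                 b = rep(0:1, each = 5))
setkey(dt, a, b)

# All combinations of the key values, returned as a keyed data.table
kvals <- CJ(a = 0:1, b = 0:1)
key(kvals)

# Keep only the combinations where a or b is nonzero:
# (0, 1), (1, 0) and (1, 1)
wanted <- kvals[a | b]

# Keyed join of dt to the wanted combinations
res <- dt[wanted]
nrow(res)
```

For keys with more than two values, the filter inside kvals would need to be spelled out, for example kvals[a == 1 | b == 1], since a | b only tests for nonzero.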

Answer 3

Why not just do this as an ordinary logical selection in i?

> dt[a==1&b==1,]
            X a b
1: -0.1186037 1 1
2: -0.1166594 1 1
3:  0.2622407 1 1
> dt[a==1|b==1,]
             X a b
1: -0.69037968 0 1
2:  1.63492922 0 1
3: -0.09240386 1 0
4:  0.55300691 1 0
5: -0.11860370 1 1
6: -0.11665936 1 1
7:  0.26224070 1 1
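One thing worth noting about this approach is that it does not need a key at all. The sketch below, with set.seed() and the names either and rows added as assumptions for illustration, runs the same filter on an unkeyed table and recovers the original row numbers the question asks about.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the run is reproducible
dt <- data.table(X = rnorm(10),
                 a = rep(0:1, length.out = 10),
                 b = rep(0:1, each = 5))

# No setkey() here: the logical expression in i is evaluated
# against the columns directly, and row order is preserved
either <- dt[a == 1 | b == 1]

# Row numbers of the matches in the original, unsorted table
rows <- dt[, which(a == 1 | b == 1)]
rows  # 2 4 6 7 8 9 10, i.e. everything except rows 1, 3 and 5
```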

3 answers

solution1
3 ACCEPTED 2013-08-09 14:27:29

solution2
3 2013-08-09 15:18:05

solution3
1 2013-08-09 15:50:59
