R SQL create a pairwise dataset without repetitions

Question

I am working in R with SQL (package sqldf) on a dataset like the following one:

View(dataset)

key1    key2    id    ...
01/01   XXX     A     ...
01/01   XXX     B     ...
01/01   YYY     C     ...
01/01   YYY     D     ...
02/01   XXX     A     ...
02/01   XXX     B     ...
02/01   XXX     C     ...

I would like to create a pairwise dataset with one row per pair of ids within each group identified by key1 and key2, as follows:

key1    key2    id_1    id_2    
01/01   XXX     A       B  
01/01   YYY     C       D
02/01   XXX     A       B
02/01   XXX     A       C
02/01   XXX     C       B

I have used

sqldf(c('select a.key1, a.key2, a.id as id_1, 
                  b.id as id_2 
                  from dataset a
                  inner join dataset b on a.key1=b.key1 and a.key2=b.key2  and a.id!=b.id'))

The problem is that with this query I obtain

key1    key2    id_1    id_2    
01/01   XXX     A       B
01/01   XXX     B       A    
01/01   YYY     C       D
01/01   YYY     D       C
02/01   XXX     A       B
02/01   XXX     B       A
02/01   XXX     A       C
02/01   XXX     C       A
02/01   XXX     C       B
02/01   XXX     B       C

I would like to avoid repetitions, since I want to make some comparisons and it doesn't matter which id is put in the column id_1 and which in id_2.

Thank you very much!

Answer 1

Change the join condition from a.id != b.id to a.id < b.id.
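
A minimal sketch of that change applied to the question's example, assuming the join is meant to match key1 to key1 and key2 to key2, and that ids are unique within each group. The toy `dataset` below only reproduces the three columns shown in the question; the `order by` clause is an addition to make the output order predictable.

```r
library(sqldf)

# Toy data mirroring the question's example (the "..." columns are omitted)
dataset <- data.frame(
  key1 = c("01/01", "01/01", "01/01", "01/01", "02/01", "02/01", "02/01"),
  key2 = c("XXX", "XXX", "YYY", "YYY", "XXX", "XXX", "XXX"),
  id   = c("A", "B", "C", "D", "A", "B", "C"),
  stringsAsFactors = FALSE
)

# a.id < b.id keeps each unordered pair once and also excludes self-pairs
pairs <- sqldf("
  select a.key1, a.key2, a.id as id_1, b.id as id_2
  from dataset a
  inner join dataset b
    on a.key1 = b.key1 and a.key2 = b.key2 and a.id < b.id
  order by a.key1, a.key2, id_1, id_2
")

print(pairs)
```

With this condition the smaller id always lands in id_1, so the last pair of the 02/01 / XXX group comes out as B, C rather than C, B. Since the order within a pair does not matter for the comparisons, this is the same set of five pairs.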

solution1
3 ACCEPTED 2017-02-14 15:44:38
