简体   繁体   中英

Creating new rows in another table for foreign keys

I have a structure similar to

CREATE TABLE b (b_id SERIAL PRIMARY KEY,
                data INTEGER NULL);

CREATE TABLE a (a_id INTEGER PRIMARY KEY,
                b_id INTEGER NULL REFERENCES b(b_id),
                attrib INTEGER);                 
                
INSERT INTO b (data) values(2);
INSERT INTO b (data) values(3);
INSERT INTO b (data) values(4);
INSERT INTO b (data) values(5);

INSERT INTO a VALUES(1, 1, 13);
INSERT INTO a VALUES(2, 2, 12);
INSERT INTO a VALUES(3, 3, 13);
INSERT INTO a VALUES(4, NULL, 14);
INSERT INTO a VALUES(5, NULL, 15);
INSERT INTO a VALUES(6, NULL, 16);

where I'd want to create new distinct rows and in b and link those so every row in a links to a . Several rows in a can link to the same row in b but each row with a NULL b_id should result in and link to a new row in b .

There's no link between the tables except b_id . b_id can be assumed to be any automatic type (the only additional requirement is that it should be suitable as a primary key).

It's nice if it can be done in a single statement, but that's not a requirement.

I can easily create new rows in b corresponding to the rows with missing b_id in a :

INSERT INTO b (data) 
SELECT NULL 
FROM a WHERE a.b_id IS NULL;

but have so far not managed to construct an update that does what I want (I tend to end up with all entries in a linking to the same newly created row, as eg

UPDATE a set b_id = bx.b_id 
FROM b bx 
WHERE NOT EXISTS (SELECT 1 
                  FROM a ax 
                  WHERE ax.b_id = bx.b_id) 
  AND a.b_id IS NULL;

I've done versions without subselect as well, but with the same result.

There are many possible result after a correct solution (for this example), one would be

a_id b_id attrib
1 1 13
2 2 12
3 3 13
4 4 14
5 5 15
6 6 16

but equally valid would be

a_id b_id attrib
1 1 13
2 2 12
3 3 13
4 6 14
5 4 15
6 5 16

or

a_id b_id attrib
1 1 13
2 2 12
3 3 13
4 100 14
5 200 15
6 300 16

A test I believe should work for this example is that

WITH c AS (
  SELECT b_id, COUNT(*) OVER (PARTITION BY b_id) 
  FROM a
) 
SELECT MAX(count) FROM c;

should return 1 and

SELECT COUNT(*) FROM a WHERE b_id is NULL;

should return 0 .

The primary environment for this is Postgres, but I'd like to keep this as portable as possible. I'd also like to avoid using WHILE if possible.

Playground set up: http://sqlfiddle.com/#!17/2262c/1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM