简体   繁体   中英

PostgreSQL get serial value from primary key to foreign key

I'm writing a python script that accesses my postgres database. In addition, I'm importing a .csv file to take the data and transfer it to the postgres database.

I have something like this and I apologize if I can't seem to format correctly as I'm new here.

Table foo:

A SERIAL NOT NULL PRIMARY KEY
B
C

Table bar:

D INT NOT NULL PRIMARY KEY
E
F INT REFERENCES foo(A) 

My code looks something like this:

csvfile = open('something.csv')
conn = psycopg2.connect("dbname=something user=something password=something")
cur = conn.cursor()
datareader = csv.DictReader(csvfile, delimiter=',')

for row in datareader:

cur.execute("INSERT INTO bar (D, E, F) VALUES (%s, %s, %s)", (row['some_int'], row['some_char']), [not_sure_what_goes_here])

I do not know how to pull the serial value from table foo .

Basic solution

Basically (but keep reading!), you want to INSERT a row into foo first and use the automatically generated A for the next INSERT into bar .
You could do that with two round trips to the DB, but that would be unreasonably expensive and much more susceptible to concurrency issues. You want to get this done in a single trip to the DB .

In Postgres 9.1+ you can use a data-modifying CTE to execute two inserts in one statement:

cur.execute("""WITH ins AS (
   INSERT INTO foo (B, C)
   VALUES (%s, %s)
   RETURNING A
   )
INSERT INTO bar (D, E, F)
SELECT %s, %s, A
FROM ins""", (val_for_b, val_for_c, val_for_d, val_for_e)

However , the above code implies that you can insert (B, C) into foo unconditionally. That's rarely the case. Typically you have some kind of unique constraint on foo and want to insert (B, C) only if it's not there, yet. You have a " SELECT-or-INSERT " problem, which is further complicated by possible race conditions with concurrent write operations (if possible). The best solution depends on your exact requirements.

Mostly-safe solution

To solve the "SELECT-or-INSERT" problem and also minimize chances for a race condition:

cur.execute("""WITH param(_B, _C) AS (SELECT %s::text, %s::text)
, sel AS (SELECT A FROM foo, param WHERE B = _B AND C = _C FOR SHARE)
, ins AS (
   INSERT INTO foo (B, C)
   SELECT _B, _C FROM param
   WHERE  NOT EXISTS (SELECT 1 FROM sel)
   RETURNING A
   )
INSERT INTO bar (D, E, F)
SELECT %s, %s, COALESCE(sel.A, ins.A)
FROM   sel FULL JOIN ins""", (val_for_b, val_for_c, val_for_d, val_for_e)

Replace the type cast ::text with your actual data types (that you did not provide).
This should be good enough for most use cases. It still leaves a tiny chance for a race condition, when two transaction try to insert the same row in foo concurrently.

Safe solution

To be perfectly sure, consider a server-side function like discussed in this related answer. You'll find detailed explanation for the above SQL code as well:

Then your call can be:

cur.execute("""INSERT INTO bar (D, E, F)
VALUES(%s, %s, f_foo_id(%s, %s))""", (val_for_d, val_for_e, val_for_b, val_for_c)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM