Does the Postgres COPY command append to or replace the table I import into?

The code I am using is this:

import psycopg2
import pandas as pd
import sys

def pg_load_table(file_path, table_name, dbname, host, port, user, pwd):
    '''
    Upload a CSV file (with a header row) to a target table.
    '''
    try:
        conn = psycopg2.connect(dbname=dbname, host=host, port=port,
                                user=user, password=pwd)
        print("Connected to database")
        cur = conn.cursor()
        f = open(file_path, "r")
        # Truncate the table first
        cur.execute("Truncate {} Cascade;".format(table_name))
        print("Truncated {}".format(table_name))
        # Load table from the file with header
        cur.copy_expert("copy {} from STDIN CSV HEADER QUOTE '\"'".format(table_name), f)
        conn.commit()
        print("Loaded data into {}".format(table_name))
        conn.close()
        print("DB connection closed.")

    except Exception as e:
        print("Error: {}".format(str(e)))
        sys.exit(1)

# Execution Example
file_path = '/tmp/restaurants.csv'
table_name = 'usermanaged.restaurants'
dbname = 'db name'
host = 'host url'
port = '5432'
user = 'username'
pwd = 'password'
pg_load_table(file_path, table_name, dbname, host, port, user, pwd)

I expected it to append to my existing data, but the contents of the input file ended up replacing the table. How can I edit this line:

cur.copy_expert("copy {} from STDIN CSV HEADER QUOTE '\"'".format(table_name), f)

(or more of the code, if necessary) to make the command append instead of replace? Alternatively, could this support the syntax of an UPDATE SQL command based on a WHERE clause?

As Mike Organek notes in the comments, this line removes all data from your table:

cur.execute("Truncate {} Cascade;".format(table_name))

Remove that, and you'll find your data will be appended by the COPY operation.
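
With the TRUNCATE call gone, the loader reduces to something like the sketch below. It is not a drop-in replacement: `pg_append_csv` is a hypothetical name, `conn` is assumed to be an already-open psycopg2 connection, and the table name is still interpolated with `format` as in the question, so it must come from trusted code rather than user input.

```python
def pg_append_csv(conn, file_path, table_name):
    """Append the rows of a CSV file (with header) to table_name.

    Assumes `conn` is an open psycopg2 connection. Unlike the original
    function, nothing is truncated, so existing rows are kept.
    """
    copy_sql = "COPY {} FROM STDIN WITH (FORMAT csv, HEADER true)".format(table_name)
    # The with-blocks close the file and the cursor even if COPY fails.
    with open(file_path, "r") as f:
        with conn.cursor() as cur:
            cur.copy_expert(copy_sql, f)
    # Make the appended rows permanent.
    conn.commit()
```

Called as `pg_append_csv(conn, '/tmp/restaurants.csv', 'usermanaged.restaurants')`, this should add the file's rows after whatever is already in the table.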

Note: COPY is all-or-nothing. If the CSV data, combined with the data already in the table, violates any constraint (a unique key, for example), the entire transaction fails and you get NO new data in your table. If you need to perform an "Upsert", see: How to UPSERT (MERGE, INSERT... ON DUPLICATE UPDATE) in PostgreSQL?
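
One common way to combine COPY with an upsert is to load the file into a temporary staging table and then merge it with `INSERT ... ON CONFLICT`. The sketch below assumes PostgreSQL 9.5 or later, a unique constraint or index on the key columns, no duplicate keys inside the CSV itself, and a psycopg2 connection with autocommit off (the default), so that the `ON COMMIT DROP` temp table survives until the final commit. The function names, the `csv_staging` table name, and the `id`/`name`/`city` columns in the usage note are hypothetical.

```python
def build_upsert_sql(table_name, staging_name, key_columns, update_columns):
    """Build an INSERT ... ON CONFLICT statement merging staging into the target."""
    keys = ", ".join(key_columns)
    if update_columns:
        assignments = ", ".join(
            "{0} = EXCLUDED.{0}".format(col) for col in update_columns
        )
        action = "DO UPDATE SET {}".format(assignments)
    else:
        # Nothing to overwrite: keep the existing row and skip the incoming one.
        action = "DO NOTHING"
    return "INSERT INTO {} SELECT * FROM {} ON CONFLICT ({}) {}".format(
        table_name, staging_name, keys, action
    )


def pg_upsert_csv(conn, file_path, table_name, key_columns, update_columns,
                  staging_name="csv_staging"):
    """COPY a CSV (with header) into a temp table, then upsert it into table_name."""
    try:
        with open(file_path, "r") as f, conn.cursor() as cur:
            # Temp table with the target's columns; dropped automatically at commit.
            cur.execute(
                "CREATE TEMP TABLE {} (LIKE {} INCLUDING DEFAULTS) ON COMMIT DROP"
                .format(staging_name, table_name)
            )
            cur.copy_expert(
                "COPY {} FROM STDIN WITH (FORMAT csv, HEADER true)"
                .format(staging_name), f
            )
            cur.execute(
                build_upsert_sql(table_name, staging_name, key_columns, update_columns)
            )
        conn.commit()
    except Exception:
        # Leave the target table untouched if any step fails.
        conn.rollback()
        raise
```

For example, `pg_upsert_csv(conn, '/tmp/restaurants.csv', 'usermanaged.restaurants', ['id'], ['name', 'city'])` would insert new rows and overwrite `name` and `city` on rows whose `id` already exists. As with the original code, identifiers are interpolated with `format`, so they should only come from trusted code.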

(FYI: your question was flagged as potentially invalid due to a typo, but given the explicit nature of the truncate call combined with your last paragraph, I suspect this logic was constructed to avoid just such a constraint violation; COPY is best left to bulk loads, with more flexible approaches used for updates.)
