
How to check whether a row already exists before copying data from CSV to Postgres using Python

I am loading CSV data from multiple files into Postgres using Python, and that part works. However, I want to check whether a particular row already exists before copying it into the database. Here is my code:

import psycopg2

SQL_STATEMENT = """
    COPY %s FROM STDIN WITH
        CSV
        HEADER
        DELIMITER AS ','
    """

def process_file(conn, table_name, file_object):
    cursor = conn.cursor()
    # Stream the CSV file into the table; COPY appends every row as-is.
    cursor.copy_expert(sql=SQL_STATEMENT % table_name, file=file_object)
    conn.commit()
    cursor.close()


connection = psycopg2.connect("dbname=dataflow user=postgres host=localhost password=root")
try:
    # f is an already-open CSV file object (one of the input files)
    process_file(connection, 'mytable', f)
finally:
    connection.close()

Please suggest how I can do this.

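COPY just loads properly formatted data into a table; it does no preprocessing. Note that EXCEPT compares entire rows, so a row is skipped only when every column matches an existing row. You can therefore copy the CSV into a temporary table and then insert its rows into your table, skipping the ones that already exist: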

    CREATE TABLE temp_t AS SELECT * FROM table_name WHERE false;

    COPY temp_t FROM STDIN WITH
        CSV
        HEADER
        DELIMITER AS ',';

    INSERT INTO table_name
        SELECT *
        FROM temp_t
        EXCEPT
        SELECT *
        FROM table_name;
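
In case it helps, here is a rough sketch of how the three statements above might be wired into the question's process_file. It is not tested against a live server. The names build_statements and load_new_rows are made up for illustration, and the staging table is created as TEMP with ON COMMIT DROP (my assumption, not part of the SQL above) so that repeated loads do not fail on an already existing temp_t. The table name is interpolated with % exactly as in the question, so it must come from trusted code, never from user input.

```python
def build_statements(table_name, staging="temp_t"):
    """Return the (create, copy, insert) SQL strings for one load.

    Hypothetical helper. table_name and staging are interpolated directly,
    so they must be trusted identifiers.
    """
    # Empty staging table with the same columns as the target.
    # TEMP + ON COMMIT DROP is an assumption added for repeatability.
    create_sql = (
        "CREATE TEMP TABLE %s ON COMMIT DROP AS "
        "SELECT * FROM %s WHERE false" % (staging, table_name)
    )
    copy_sql = "COPY %s FROM STDIN WITH CSV HEADER DELIMITER AS ','" % staging
    # EXCEPT keeps only staged rows that have no identical row in the target.
    insert_sql = (
        "INSERT INTO %s SELECT * FROM %s EXCEPT SELECT * FROM %s"
        % (table_name, staging, table_name)
    )
    return create_sql, copy_sql, insert_sql


def load_new_rows(conn, table_name, file_object, staging="temp_t"):
    """Load a CSV through a staging table; return the number of new rows.

    conn is expected to be a psycopg2 connection, as in the question.
    """
    create_sql, copy_sql, insert_sql = build_statements(table_name, staging)
    cursor = conn.cursor()
    try:
        cursor.execute(create_sql)
        cursor.copy_expert(sql=copy_sql, file=file_object)
        cursor.execute(insert_sql)
        inserted = cursor.rowcount  # rows added by the INSERT
        conn.commit()               # also drops the staging table
    except Exception:
        conn.rollback()
        raise
    finally:
        cursor.close()
    return inserted
```

With the question's connection this would be called as load_new_rows(connection, 'mytable', f) in place of process_file(connection, 'mytable', f).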

https://www.postgresql.org/docs/current/static/sql-copy.html

From the documentation: "COPY FROM copies data from a file to a table (appending the data to whatever is in the table already)."
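
Because COPY only appends, loading the same file twice would duplicate every row, which is what the staging table plus EXCEPT avoids. If you want to try that behaviour without a Postgres server, the sketch below uses Python's built-in sqlite3 module as a stand-in. This is purely an assumption for illustration: SQLite has no COPY, so the csv module fills the staging table instead, and the people table and the load_csv helper are invented names.

```python
import csv
import io
import sqlite3


def load_csv(conn, text):
    """Stage the CSV rows, insert only the unseen ones, return how many were added."""
    rows = list(csv.reader(io.StringIO(text)))[1:]  # drop the header line
    before = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
    # Empty staging table with the same columns as the target.
    conn.execute("CREATE TEMP TABLE temp_t AS SELECT * FROM people WHERE 0")
    conn.executemany("INSERT INTO temp_t VALUES (?, ?)", rows)
    # Same INSERT ... EXCEPT statement as in the answer above.
    conn.execute(
        "INSERT INTO people SELECT * FROM temp_t EXCEPT SELECT * FROM people"
    )
    conn.execute("DROP TABLE temp_t")
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
    return after - before


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT)")

first = "id,name\n1,alice\n2,bob\n3,carol\n"
second = "id,name\n3,carol\n4,dave\n"

print(load_csv(conn, first))   # 3: the table was empty
print(load_csv(conn, first))   # 0: every row already exists
print(load_csv(conn, second))  # 1: only the row for dave is new
```

A row counts as existing only if all of its columns match, so a changed name for an existing id would be inserted as a new row.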

