
Zero-length delimiter error using python psycopg2 to insert csv data into a postgres database

I am attempting to load the contents of a few csv files into my postgres database using psycopg2. When I run the script, I get the following error:

psycopg2.errors.SyntaxError: zero-length delimited identifier at or near """"

(photo of traceback here)

I understand that the error is most likely due to the quotes around the empty string value of 'examples', but I don't know why this would cause an issue.

        df = pandas.read_csv(cip_location, header=0, encoding='ISO-8859-1', dtype=str)
        number_loaded_rows += len(df.index)
        for index, row in df.iterrows():
            row = row.squeeze()

            cip_code = row['CIPCode']
            cip_code = cip_code[cip_code.find('"') + 1:cip_code.rfind('"')]
            if cip_code.startswith('0'):
                cip_code = cip_code[1:]
            cip_title = row['CIPTitle']
            cip_def = row['CIPDefinition']

            exam_string = row['Examples']
            exam_string = exam_string.replace('Examples:', '').replace(' - ', '').replace(' -', '')
            examples = exam_string

            cip_codes[cip_code] = {
                'code': cip_code,
                'title': cip_title,
                'definition': cip_def,
                'examples': examples
            }

        with gzip.GzipFile(ending_location, 'r') as f:
            bytes = f.read()
            string = bytes.decode('utf-8')
            loaded_unis = jsonpickle.decode(string)
        print('Finished loading in ' + str(time.time() - start_load))

        import psycopg2

        cnx = psycopg2.connect('host=localhost dbname=postgres user=postgres password=password')
        count = 0
        cursor = cnx.cursor()
        for d in cip_codes.values():
            print('Inserted: %s' % count)
            print('Trying to insert (%s, "%s", "%s", "%s");' % (d['code'], d['title'], d['definition'], d['examples']))
            cursor.execute('CALL InsertCIP(%s, "%s", "%s", "%s");' % (str(d['code']), d['title'].replace('"', "'"),
                                                                      d['definition'].replace('"', "'"),
                                                                      d['examples'].replace('"', "'")))
            count = count + 1
        cnx.commit()
        cursor.close()
        cnx.close()

The gzip code does not seem to be doing anything SQL-related here.

In this case it appears that the 'Examples' value for the first row in your data is empty, and the manual quoting turns the 4th argument into "", which PostgreSQL parses as a zero-length delimited identifier.
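
For illustration, here is a minimal sketch (with hypothetical sample values) of the statement that the string formatting in your loop builds when the 'Examples' field is empty:

    # Hypothetical sample row; only the empty 'examples' value matters here.
    d = {'code': '1.00', 'title': 'Some Title',
         'definition': 'Some definition text.', 'examples': ''}

    sql = 'CALL InsertCIP(%s, "%s", "%s", "%s");' % (
        d['code'], d['title'], d['definition'], d['examples'])

    print(sql)
    # CALL InsertCIP(1.00, "Some Title", "Some definition text.", "");
    #
    # PostgreSQL treats double quotes as identifier delimiters, not string
    # quotes, so the trailing "" is parsed as a zero-length delimited
    # identifier and the statement fails with the error above.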

Try letting psycopg2 do the escaping for you by passing the values as query parameters. If you have a lot of data, execute_batch is faster than calling execute once per row.

    from psycopg2.extras import execute_batch

    # Build parameter tuples; psycopg2 quotes and escapes each value itself.
    data = [(d['code'], d['title'], d['definition'], d['examples'])
            for d in cip_codes.values()]

    cursor = cnx.cursor()

    insert_sql = """CALL InsertCIP(%s, %s, %s, %s)"""

    execute_batch(cursor, insert_sql, data, page_size=1000)

    cnx.commit()  # commit on the connection, not the cursor
    cursor.close()
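
The same parameter style works for a single row with a plain cursor.execute call, for example:

    # Pass the values as a tuple of bound parameters instead of formatting
    # them into the SQL string.
    cursor.execute("CALL InsertCIP(%s, %s, %s, %s)",
                   (d['code'], d['title'], d['definition'], d['examples']))

Because the values are sent as bound parameters rather than pasted into the SQL text, empty strings and embedded quotes are handled by the driver, and the manual replace('"', "'") calls are no longer needed.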

Hope that helps; I'm not sure what InsertCIP looks like.
