
How to ignore a csv delimiter on specific scenarios in python?

I am trying to insert data into a DB using a CSV file.

import psycopg2  # PostgreSQL driver

# Connect to the database
conn = psycopg2.connect(host='1.11.11.111',
                        dbname='postgres',
                        user='postgres',
                        password='myPassword',
                        port='1234')
# The cursor object is used to interact with the database
cur = conn.cursor()
# Open the CSV file with standard file I/O and copy it into the table just created
with open("C:/Users/Harshal/Desktop/tar.csv", 'r') as f:
    next(f)  # skip the header row
    cur.copy_from(f, 'geotargets_india', sep=',')
conn.commit()
conn.close()

My table is as follows:

create table public.geotargets_india(
Criteria_ID integer not null,
Name character varying(50) COLLATE pg_catalog."default" NOT NULL,
Canonical_Name character varying(100) COLLATE pg_catalog."default" NOT NULL,
Parent_ID NUMERIC(10,2),
Country_Code character varying(10) COLLATE pg_catalog."default" NOT NULL,
Target_Type character varying(50) COLLATE pg_catalog."default" NOT NULL,
Status character varying(50) COLLATE pg_catalog."default" NOT NULL
)

My CSV looks like:

[image: screenshot of the CSV file]

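The error I am getting is: [image: err]

If you look closely at my CSV rows, for example 1007740,Hyderabad,"Hyderabad,Telangana,India",9061642.0,IN,City,Active, the Canonical_Name value is itself a comma-separated string. This causes the error, because the loader assumes the CSV has more columns than the table. How can I solve this? Note: I am assuming the error is due only to this. (CSV link)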

foo.csv:

It is header which will be ignored------------------------------------
1007740,Hyderabad,"Hyderabad,Telangana,India",9061642.0,IN,City,Active

Python:

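import psycopg2

conn = psycopg2.connect('')  # empty DSN: libpq defaults and PG* environment variables are used
cur = conn.cursor()
with open('foo.csv', 'r') as f:
    # format csv makes the server honour the quote character, so quoted commas stay in one field
    cur.copy_expert("""copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')""", f)
conn.commit()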

psql:

table geotargets_india;
┌─────────────┬───────────┬───────────────────────────┬────────────┬──────────────┬─────────────┬────────┐
│ criteria_id │   name    │      canonical_name       │ parent_id  │ country_code │ target_type │ status │
├─────────────┼───────────┼───────────────────────────┼────────────┼──────────────┼─────────────┼────────┤
│     1007740 │ Hyderabad │ Hyderabad,Telangana,India │ 9061642.00 │ IN           │ City        │ Active │
└─────────────┴───────────┴───────────────────────────┴────────────┴──────────────┴─────────────┴────────┘
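copy_expert reads from any file-like object, so the data does not have to live on disk. The following is a minimal sketch under that assumption; the names COPY_SQL and copy_csv_text are hypothetical helpers introduced here for illustration, not part of psycopg2.

```python
import io

# Same COPY statement as above; FORMAT csv makes the server parse quoted fields.
COPY_SQL = (
    "COPY geotargets_india FROM STDIN "
    "WITH (FORMAT csv, HEADER, DELIMITER ',', QUOTE '\"')"
)


def copy_csv_text(cur, csv_text):
    """Send CSV text held in memory to the server through cursor.copy_expert.

    `cur` is assumed to be a psycopg2 cursor (or anything exposing the same
    copy_expert(sql, file) method).
    """
    cur.copy_expert(COPY_SQL, io.StringIO(csv_text))
```

With a real connection this would be called as copy_csv_text(cur, text) followed by conn.commit().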

You should probably read and parse the CSV file yourself in Python, then load the data into the database using INSERT statements.

import csv
import psycopg2

conn = psycopg2.connect(
    host='1.11.11.111',
    dbname='postgres',
    user='postgres',
    password='myPassword',
    port='1234'
)  
cur = conn.cursor()

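with open("tar.csv", newline="") as fd:
    # DictReader maps each row to the header names used in the placeholders below
    rdr = csv.DictReader(fd)
    cur.executemany("""
        INSERT INTO geotargets_india
        VALUES (%(Criteria_ID)s, %(Name)s, %(Canonical_Name)s, %(Parent_ID)s, %(Country_Code)s, %(Target_Type)s, %(Status)s);
        """,
        rdr
    )

conn.commit()  # without a commit, the inserts are discarded when the connection closes
cur.close()
conn.close()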

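A few comments on the above. The csv.DictReader class returns each row of the CSV as a dictionary keyed by the header names. The returned DictReader object, rdr, is iterable, so it can be passed directly to psycopg2's cursor.executemany function, which is more concise than iterating through the DictReader object yourself. Note that the psycopg2 documentation states that executemany is currently not faster than calling execute() in a loop, so the gain is in readability rather than speed.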
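To make the shape of those dictionaries concrete, here is a small sketch that parses one sample row from memory. The header line is an assumption: it is taken to match the placeholder names in the INSERT statement, since the real file's header is not shown here.

```python
import csv
import io

# Sample data; the header names are assumed to match the %(...)s placeholders.
sample = (
    "Criteria_ID,Name,Canonical_Name,Parent_ID,Country_Code,Target_Type,Status\n"
    '1007740,Hyderabad,"Hyderabad,Telangana,India",9061642.0,IN,City,Active\n'
)

# Each row becomes a dict keyed by header name; the quoted commas stay in one value.
rows = list(csv.DictReader(io.StringIO(sample)))

print(rows[0]["Canonical_Name"])
```

Every value arrives as a string, and an empty cell arrives as an empty string rather than None, so nullable numeric columns such as Parent_ID may need converting before the insert.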

You are right about the problem with Canonical_Name. I successfully imported the row 1007740,Hyderabad,"Hyderabad",9061642.0,IN,City,Active into a table with your structure.

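Unfortunately, the copy_from method has no option for CSV quoting. It accepts a sep argument, but it treats every occurrence of that separator as a column boundary, including the commas inside a quoted field. Here is the documentation: https://www.psycopg.org/docs/cursor.html#cursor.copy_from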
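The difference between splitting on every comma and parsing the quotes can be seen without a database. This is only an illustration using the row from the question; it does not reproduce what the server does internally.

```python
import csv

line = '1007740,Hyderabad,"Hyderabad,Telangana,India",9061642.0,IN,City,Active'

# Splitting on every comma ignores the quotes, so the row appears to have 9 fields.
naive_fields = line.split(",")

# The csv module honours the double quotes and yields the 7 fields the table expects.
parsed_fields = next(csv.reader([line]))

print(len(naive_fields), len(parsed_fields))
print(parsed_fields[2])
```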

So you can rewrite the CSV file with a tab delimiter and then use copy_from:

import csv
import psycopg2  # PostgreSQL driver

# Connect to the database
conn = psycopg2.connect(host='1.11.11.111',
                        dbname='postgres',
                        user='postgres',
                        password='myPassword',
                        port='1234')
# The cursor object is used to interact with the database
cur = conn.cursor()

# Rewrite the comma-separated file as tab-separated; csv.reader handles the quoted commas
with open("C:/Users/Harshal/Desktop/tar.csv", 'r', newline='') as f:
    reader = csv.reader(f, delimiter=",")
    with open("C:/Users/Harshal/Desktop/tar.tsv", 'w', newline='') as tsv:
        writer = csv.writer(tsv, delimiter='\t')
        writer.writerows(reader)

# Copy the tab-separated file into the table just created
with open("C:/Users/Harshal/Desktop/tar.tsv", 'r') as f:
    next(f)  # skip the header row
    cur.copy_from(f, 'geotargets_india', sep='\t')
conn.commit()
conn.close()
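The intermediate .tsv file is not strictly required. As a sketch, the same conversion can be done in memory with io.StringIO; the sample header line below is an assumption, and the names source, buffer and tsv_text are chosen here for illustration.

```python
import csv
import io

# Stand-in for the real CSV file; the header names are assumed.
source = io.StringIO(
    "Criteria_ID,Name,Canonical_Name,Parent_ID,Country_Code,Target_Type,Status\n"
    '1007740,Hyderabad,"Hyderabad,Telangana,India",9061642.0,IN,City,Active\n'
)

buffer = io.StringIO()
reader = csv.reader(source)
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")

next(reader)              # skip the header row
writer.writerows(reader)  # commas inside a field no longer need quoting
buffer.seek(0)            # rewind so a consumer reads from the start

tsv_text = buffer.getvalue()
print(repr(tsv_text))
```

The rewound buffer could then be passed as cur.copy_from(buffer, 'geotargets_india', sep='\t'). This assumes no field contains a tab, backslash, or newline, which copy_from's text format would interpret specially.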

