Copy data from CSV to PostgreSQL using Python

I am on Windows 7 64-bit. I have a CSV file, 'data.csv'. I want to import its data into a PostgreSQL table, 'temp_unicommerce_status', via a Python script.

My script is:

import psycopg2
conn = psycopg2.connect("host='localhost' port='5432' dbname='Ekodev' user='bn_openerp' password='fa05844d'")
cur = conn.cursor()
cur.execute("""truncate table "meta".temp_unicommerce_status;""")
cur.execute("""Copy temp_unicommerce_status from 'C:\Users\n\Desktop\data.csv';""")
conn.commit()
conn.close()

I am getting this error:

Traceback (most recent call last):
  File "C:\Users\n\Documents\NetBeansProjects\Unicommerce_Status_Update\src\unicommerce_status_update.py", line 5, in <module>
    cur.execute("""Copy temp_unicommerce_status from     'C:\\Users\\n\\Desktop\\data.csv';""")
psycopg2.ProgrammingError: must be superuser to COPY to or from a file
HINT:  Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.

Use the copy_from cursor method:

with open(r'C:\Users\n\Desktop\data.csv', 'r') as f:
    # the table name must be passed as a string
    cur.copy_from(f, 'temp_unicommerce_status', sep=',')

The file must be passed as a file object, not as a path.

Since you are copying from a CSV file, you must specify the separator, because the default is a tab character.
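A minimal sketch of the whole round trip, assuming an already open psycopg2 connection. The helper name `load_csv` and its `skip_header` flag are made up for illustration; only `cursor()`, `copy_from()`, `commit()` and `rollback()` are psycopg2 calls.

```python
def load_csv(conn, path, table, sep=",", skip_header=False):
    """Stream a client-side CSV file into `table` via COPY ... FROM STDIN.

    `load_csv` is a hypothetical helper; `conn` is assumed to be an open
    psycopg2 connection.
    """
    with open(path, "r") as f:
        if skip_header:
            next(f, None)  # drop the first line if the file has a header row
        try:
            with conn.cursor() as cur:
                # copy_from needs a file-like object and the table name as a string
                cur.copy_from(f, table, sep=sep)
            conn.commit()    # make the loaded rows permanent
        except Exception:
            conn.rollback()  # leave the table untouched if the load fails
            raise


# Usage (needs a reachable database, so it is left commented out):
# import psycopg2
# conn = psycopg2.connect("dbname=... user=...")
# load_csv(conn, r"C:\Users\n\Desktop\data.csv", "temp_unicommerce_status")
```

Note that copy_from only splits on the separator; it does not understand quoted CSV fields, so a file with embedded commas may need COPY in CSV mode instead.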

I solved this problem with the psycopg2 cursor method copy_expert (docs: http://initd.org/psycopg/docs/cursor.html). copy_expert lets you use STDIN, which removes the need to grant superuser privileges to the Postgres user. Access to the file then depends on the client (Linux/Windows/Mac) user's permissions on the file.

From the Postgres COPY docs (https://www.postgresql.org/docs/current/static/sql-copy.html):

Do not confuse COPY with the psql instruction \copy. \copy invokes COPY FROM STDIN or COPY TO STDOUT, and then fetches/stores the data in a file accessible to the psql client. Thus, file accessibility and access rights depend on the client rather than the server when \copy is used.

You can also leave the permissions set strictly for access to the development_user home folder and the App folder.

csv_file_name = '/home/user/some_file.csv'
sql = "COPY table_name FROM STDIN DELIMITER '|' CSV HEADER"
# open the file in a with block so it is closed after the copy
with open(csv_file_name, "r") as f:
    cursor.copy_expert(sql, f)
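The same call could be wrapped in a small function. This is only a sketch: `build_copy_sql`, `copy_csv_with_header` and `_IDENT` are hypothetical names, the table name is interpolated into the SQL text (so it is checked against a plain identifier pattern first), and the statement uses the parenthesized option syntax of COPY rather than the legacy spelling above.

```python
import re

# Plain, optionally schema-qualified identifiers only: the table name is
# interpolated into the SQL text, so anything else is rejected.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def build_copy_sql(table, delimiter=","):
    """Return a COPY ... FROM STDIN statement for a CSV file with a header row."""
    if not _IDENT.match(table):
        raise ValueError("unexpected table name: %r" % (table,))
    if len(delimiter) != 1 or delimiter in "'\\\r\n":
        raise ValueError("delimiter must be one plain character")
    return "COPY {} FROM STDIN WITH (FORMAT csv, HEADER true, DELIMITER '{}')".format(
        table, delimiter
    )


def copy_csv_with_header(cur, path, table, delimiter=","):
    """Send the client-side file at `path` through `cur.copy_expert`."""
    with open(path, "r") as f:
        cur.copy_expert(build_copy_sql(table, delimiter), f)
```

Because the statement runs in CSV mode, the server skips the header row and handles quoted fields itself.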

Sample of code that worked for me:

import psycopg2  # import the Postgres library

# connect to the database
conn = psycopg2.connect(host='localhost',
                        dbname='database1',
                        user='postgres',
                        password='****',
                        port='****')
# create a cursor object
# the cursor object is used to interact with the database
cur = conn.cursor()

# create a table with the same headers as the csv file
cur.execute("CREATE TABLE IF NOT EXISTS test(**** text, **** float, **** float, **** text)")

# open the csv file using Python standard file I/O
# and copy the file into the table just created
with open('******.csv', 'r') as f:
    next(f)  # skip the header row
    # arguments: file object, table name, separator (comma-separated)
    cur.copy_from(f, '****', sep=',')

# commit the changes
conn.commit()
# close the connection
conn.close()

Here is an extract from the relevant PostgreSQL documentation: COPY with a file name instructs the PostgreSQL server to directly read from or write to a file. The file must be accessible to the server and the name must be specified from the viewpoint of the server. When STDIN or STDOUT is specified, data is transmitted via the connection between the client and the server.

That is why the COPY command to or from a file is restricted to a PostgreSQL superuser: the file must be present on the server and is loaded directly by the server process.

You should instead use:

with open(r'C:\Users\n\Desktop\data.csv', 'r') as f:
    # copy_from expects a file object and the table name as a string
    cur.copy_from(f, 'temp_unicommerce_status', sep=',')

as suggested by this other answer, because internally it uses COPY from stdin.
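Since COPY FROM STDIN reads from the client connection, the data does not have to come from a file on disk at all. A rough sketch, assuming the rows are already in Python memory and the table name is trusted; `rows_to_csv_buffer` and `copy_rows` are hypothetical helpers, while `copy_expert` is the psycopg2 cursor method.

```python
import csv
import io


def rows_to_csv_buffer(rows):
    """Serialise an iterable of row tuples into an in-memory CSV file object."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    buf.seek(0)  # rewind so the reader starts at the first row
    return buf


def copy_rows(cur, table, rows):
    """Stream `rows` to the server over the client connection.

    `cur` is assumed to be a psycopg2 cursor and `table` a trusted name.
    """
    sql = "COPY {} FROM STDIN WITH (FORMAT csv)".format(table)
    cur.copy_expert(sql, rows_to_csv_buffer(rows))
```

Nothing here touches the server's filesystem, which is why no superuser role is needed.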

You can use d6tstack, which makes this simple:

import d6tstack
import glob

c = d6tstack.combine_csv.CombinerCSV([r'C:\Users\n\Desktop\data.csv']) # single-file
c = d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv')) # multi-file
c.to_psql_combine('postgresql+psycopg2://psqlusr:psqlpwdpsqlpwd@localhost/psqltest', 'tablename')

It also deals with data schema changes, create/append/replace table, and lets you preprocess data with pandas.

I know this question has been answered, but here are my two cents. I am adding a little more description:

You can use the cursor.copy_from method:

First you have to create a table with the same number of columns as your CSV file.

Example:

My CSV looks like this:

Name      , age , college , id_no , country , state   , phone_no
demo_name , 22  , bdsu    , 1456  , demo_co , demo_da , 9894321_

First create a table:

import psycopg2
from psycopg2 import Error

connection = psycopg2.connect(user = "demo_user",
                                  password = "demo_pass",
                                  host = "127.0.0.1",
                                  port = "5432",
                                  database = "postgres")
cursor = connection.cursor()


create_table_query = '''CREATE TABLE data_set
(Name  TEXT NOT NULL ,
age  TEXT NOT NULL ,
college  TEXT NOT NULL ,
id_no TEXT NOT NULL ,
country TEXT NOT NULL ,
state TEXT NOT NULL ,
phone_no TEXT NOT NULL);'''

cursor.execute(create_table_query)
connection.commit()
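For a wide file, the statement could be derived from the header row instead of being typed by hand. This is a sketch under assumptions: `create_table_sql` is a hypothetical helper, every column is declared as plain TEXT (without the NOT NULL constraints used above), and only simple identifier-like header names are accepted.

```python
import csv
import io
import re


def create_table_sql(table, csv_file):
    """Build a CREATE TABLE statement with one TEXT column per CSV header field."""
    header = next(csv.reader(csv_file))
    columns = []
    for raw in header:
        name = raw.strip().lower()
        # accept only plain identifiers, since names go straight into the SQL
        if not re.match(r"^[a-z_][a-z0-9_]*$", name):
            raise ValueError("unsupported column name: %r" % (raw,))
        columns.append("{} TEXT".format(name))
    return "CREATE TABLE {} ({});".format(table, ", ".join(columns))


# Header row taken from the sample CSV in this answer
sample = io.StringIO("Name, age , college , id_no , country , state , phone_no\n")
print(create_table_sql("data_set", sample))
```

The resulting string can be passed to cursor.execute in place of create_table_query.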

Now you can simply use cursor.copy_from, which needs three parameters: first the file object, second the table name, third the separator.

You can copy now:

with open(r'final_data.csv', 'r') as f:
    cursor.copy_from(f, 'data_set', sep=',')
connection.commit()  # persist the copied rows

Done.

I am going to post some of the errors I ran into while trying to copy a CSV file to a database on a Linux-based system.

Here is an example CSV file:

Name Age Height
bob  23   59
tom  56   67

  1. You must install the psycopg2 library (e.g. pip install psycopg2 or sudo apt install python3-psycopg2).

  2. You must have Postgres installed on your system before you can use psycopg2 (sudo apt install postgresql postgresql-contrib).

  3. Now you must create a database to store the CSV data, unless you already have Postgres set up with a pre-existing database.

COPY CSV USING POSTGRES COMMANDS

  • After installing Postgres, it creates a default user account (postgres) which gives you access to Postgres commands.

  • To switch to the postgres account, issue: sudo -u postgres psql

  • Access the prompt by issuing: psql

    -- create a database
    create database mytestdb;
    -- connect to the database to create a table
    \connect mytestdb
    -- create a table with the same column names as the csv
    create table test(name char(50), age char(50), height char(50));
    -- copy the csv file into the table
    copy test from 'path/to/csv' with csv header;

COPY CSV USING PYTHON

The main issue I ran into when copying the CSV file to a database was that I had not created the database yet; however, this can still be done with Python.
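Creating the database itself from Python might look like the sketch below. It assumes a psycopg2 connection to an existing maintenance database such as postgres; `create_database` is a hypothetical helper, and the name is restricted to a plain lowercase identifier because it is placed directly in the SQL text.

```python
import re


def create_database(conn, name):
    """Issue CREATE DATABASE on an open psycopg2 connection.

    PostgreSQL refuses to run CREATE DATABASE inside a transaction block, so
    the connection is switched to autocommit first.
    """
    if not re.match(r"^[a-z_][a-z0-9_]*$", name):
        raise ValueError("unsupported database name: %r" % (name,))
    conn.autocommit = True  # psycopg2 otherwise opens a transaction implicitly
    with conn.cursor() as cur:
        cur.execute("CREATE DATABASE {};".format(name))


# Usage (commented out because it needs a running server):
# import psycopg2
# admin = psycopg2.connect(host='localhost', dbname='postgres', user='postgres', password='')
# create_database(admin, 'mytestdb')
# admin.close()
```

After that, a second connection with dbname='mytestdb' can create the table and load the file.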

import psycopg2  # import the Postgres library

# connect to the database
conn = psycopg2.connect(host='localhost',
                        dbname='mytestdb',
                        user='postgres',
                        password='')
# create a cursor object
# the cursor object is used to interact with the database
cur = conn.cursor()

# create a table with the same headers as the csv file
cur.execute('''create table test(name char(50), age char(50), height char(50));''')

# open the csv file using Python standard file I/O
# and copy the file into the table just created
with open('file.csv', 'r') as f:
    cur.copy_from(f, 'test', sep=',')

# commit so the table and its rows are persisted
conn.commit()
conn.close()
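
import os

filepath = 'path/to/csv'  # placeholder: path to the CSV file to load

if __name__ == '__main__':
    # pipe the file to psql, which runs COPY ... FROM STDIN on the client side
    cmd = "PGPASSWORD=pwd psql -h host -p port -U username -d dbname -c 'copy tablename from stdin' < '{}'".format(filepath)
    os.system(cmd)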
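A variant of the same idea without a shell string, as a sketch only: `build_psql_copy` and `run_psql_copy` are hypothetical helpers, the connection values are placeholders, and `with (format csv)` is an assumption added here so that comma-separated input is parsed as CSV rather than as tab-delimited text.

```python
import os
import subprocess


def build_psql_copy(host, port, user, dbname, table):
    """Return the argument list for a psql call that runs COPY ... FROM STDIN."""
    return [
        "psql",
        "-h", host,
        "-p", str(port),
        "-U", user,
        "-d", dbname,
        "-c", "copy {} from stdin with (format csv)".format(table),
    ]


def run_psql_copy(args, filepath, password):
    """Run psql with the CSV file attached to its standard input."""
    env = dict(os.environ, PGPASSWORD=password)  # password only in the child's env
    with open(filepath, "rb") as f:
        subprocess.run(args, stdin=f, env=env, check=True)
```

Passing a list avoids quoting problems with paths that contain spaces, and check=True raises an exception if psql exits with an error.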

Try doing the same as the root user, postgres. On a Linux system, you could change the file's permissions or move the file to /tmp. The problem results from missing credentials to read from the filesystem.
