Fetching data from postgres database in batch (python)
I have the following Postgres query, which fetches about 25 million rows from table1, and I would like to write its output to multiple files.
query = """
WITH sequence AS (
    SELECT
        a,
        b,
        c
    FROM table1
)
SELECT * FROM sequence;
"""
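Below is the Python script that fetches the complete dataset. How can I modify it to write the result to multiple files (for example, 10000 rows per file)?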
#IMPORT LIBRARIES ########################
import psycopg2
from pandas import DataFrame
#CREATE DATABASE CONNECTION ########################
connect_str = "dbname='x' user='x' host='x' " "password='x' port = x"
conn = psycopg2.connect(connect_str)
cur = conn.cursor()
conn.autocommit = True
cur.execute(query)
df = DataFrame(cur.fetchall())
Thanks
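Here are 3 approaches that may help.
with conn.cursor(name='fetch_large_result') as cursor:
    # itersize is the number of rows pulled from the server per round trip
    cursor.itersize = 20000
    query = "SELECT * FROM ..."
    cursor.execute(query)
    for row in cursor:
        ...  # process the row here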
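Since `itersize` only controls how many rows travel per network round trip, the loop above still hands you one row at a time. A minimal sketch of regrouping those rows into fixed-size batches follows; `batched_rows` is a hypothetical helper name, not part of psycopg2, and it only assumes that the cursor is iterable.

```python
from itertools import islice


def batched_rows(rows, batch_size):
    """Yield lists of up to batch_size items from any iterable of rows."""
    iterator = iter(rows)
    while True:
        # islice pulls at most batch_size rows without loading the rest
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

With the named cursor from the snippet above, something like `for batch in batched_rows(cursor, 10000):` should give lists of 10000 rows (the last one possibly shorter) that can each be written to its own file.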
conn = psycopg2.connect(conn_url)
cursor = conn.cursor(name='fetch_large_result')
cursor.execute('SELECT * FROM <large_table>')
while True:
    # consume result over a series of iterations
    # with each iteration fetching 2000 records
    records = cursor.fetchmany(size=2000)
    if not records:
        break
    for r in records:
        ...  # process the record here
cursor.close()  # cleanup
conn.close()
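To get from the `fetchmany` loop to the "one file per 10000 rows" goal in the question, each batch can be written to its own numbered file. The following is only a sketch: `write_batches`, the `part_00000.csv` naming scheme, and the choice of CSV output are assumptions for illustration. It relies on nothing beyond the standard DB-API `fetchmany` call, so it takes an already-executed cursor as input.

```python
import csv
from pathlib import Path


def write_batches(cursor, out_dir, batch_size=10000, prefix="part"):
    """Drain an already-executed DB-API cursor into numbered CSV files.

    Returns the list of file paths that were written.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    file_index = 0
    while True:
        # Only batch_size rows are held in memory at any one time
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        target = out_path / f"{prefix}_{file_index:05d}.csv"
        with target.open("w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        written.append(target)
        file_index += 1
    return written


# Hypothetical usage with the named cursor from the snippet above:
#     cursor = conn.cursor(name='fetch_large_result')
#     cursor.execute(query)
#     write_batches(cursor, "output_dir", batch_size=10000)
```

One caveat, as far as I know: psycopg2 rejects named cursors when the connection is in autocommit mode (unless the cursor is created with `withhold=True`), so the `conn.autocommit = True` line from the question would probably have to be dropped for this to work.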
Finally, you can define a SCROLL CURSOR
BEGIN WORK;
-- Set up a cursor:
DECLARE scroll_cursor_bd SCROLL CURSOR FOR SELECT * FROM My_Table;
-- Fetch the first 5 rows in the cursor scroll_cursor_bd:
FETCH FORWARD 5 FROM scroll_cursor_bd;
CLOSE scroll_cursor_bd;
COMMIT WORK;
Note that if you do not name the cursor in psycopg2, it will be a client-side cursor rather than a server-side one.