Fetching data from postgres database in batch (python)

I have the following Postgres query, which fetches about 25 million rows from table1, and I want to write its output to multiple files.

query = """ WITH sequence AS (
                SELECT 
                        a,
                        b,
                        c
                FROM table1 )                    

select * from sequence;"""

Below is the Python script that fetches the complete dataset. How can I modify it to write the data to multiple files (for example, 10000 rows per file)?

#IMPORT LIBRARIES ########################
import psycopg2
from pandas import DataFrame

#CREATE DATABASE CONNECTION ########################
connect_str = "dbname='x' user='x' host='x' " "password='x' port = x"
conn = psycopg2.connect(connect_str)
cur = conn.cursor()
conn.autocommit = True

cur.execute(query)
df = DataFrame(cur.fetchall())

Thanks

Here are 3 approaches that may help.

  1. Use a psycopg2 named cursor and set cursor.itersize (the snippet uses 20000; psycopg2's default is 2000)

Snippet

with conn.cursor(name='fetch_large_result') as cursor:

    # number of rows pulled from the server per network round trip
    cursor.itersize = 20000

    query = "SELECT * FROM ..."
    cursor.execute(query)

    for row in cursor:
        ...
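
To get from the row-by-row loop above to fixed-size groups (for example 10000 rows per file), one option is a small helper built on itertools.islice. This is only a sketch: `chunked` is a hypothetical helper name, not part of psycopg2, and it accepts any iterable of rows, which includes a named cursor.

```python
from itertools import islice


def chunked(rows, size):
    """Yield lists of at most `size` items taken from any iterable of rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            # the source is exhausted
            return
        yield chunk


# Assumed usage with the named cursor from the snippet above:
#     for file_no, chunk in enumerate(chunked(cursor, 10000)):
#         ...  # write `chunk` to the file numbered file_no
```

Because the helper pulls rows lazily, only one chunk is held in memory at a time, while itersize still controls how many rows each round trip to the server brings back.
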
  2. Use a psycopg2 named cursor with fetchmany(size=2000)

Snippet

import psycopg2

conn = psycopg2.connect(conn_url)  # conn_url: your connection string
# passing a name makes this a server-side cursor
cursor = conn.cursor(name='fetch_large_result')
cursor.execute('SELECT * FROM <large_table>')

while True:
    # consume result over a series of iterations
    # with each iteration fetching 2000 records
    records = cursor.fetchmany(size=2000)

    if not records:
        break

    for r in records:
        ...

cursor.close()  # cleanup
conn.close()
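
The fetchmany loop above maps directly onto the "one file per 10000 rows" requirement in the question. The following is a minimal sketch under some assumptions: `write_batches`, the `part_00000.csv` naming scheme, and the CSV output format are all illustrative choices, not something the question specifies. The function only relies on the cursor having a `fetchmany` method, so it takes the named cursor from the snippet as an argument.

```python
import csv
import os


def write_batches(cursor, out_dir, batch_size=10000, prefix="part"):
    """Write each fetchmany() batch from `cursor` to its own CSV file.

    Returns the list of file paths written, in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    file_no = 0
    while True:
        records = cursor.fetchmany(batch_size)
        if not records:
            break
        # hypothetical naming scheme: part_00000.csv, part_00001.csv, ...
        path = os.path.join(out_dir, f"{prefix}_{file_no:05d}.csv")
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(records)
        paths.append(path)
        file_no += 1
    return paths


# Assumed usage, after cursor.execute(...) on a named cursor:
#     write_batches(cursor, "output", batch_size=10000)
```

With roughly 25 million rows and 10000 rows per file this produces about 2500 files, so a larger batch size may be more practical.
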

Finally, you can define a SCROLL CURSOR.

  3. Define a scroll cursor

Snippet

BEGIN WORK;
-- Set up a cursor:
DECLARE scroll_cursor_bd SCROLL CURSOR FOR SELECT * FROM My_Table;

-- Fetch the first 5 rows in the cursor scroll_cursor_bd:
FETCH FORWARD 5 FROM scroll_cursor_bd;

-- Close the cursor and end the transaction:
CLOSE scroll_cursor_bd;
COMMIT WORK;
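
The SQL above can also be driven from Python by sending the DECLARE, FETCH FORWARD, and CLOSE statements through an ordinary (unnamed) cursor. This is a rough sketch rather than a recommended implementation: `fetch_forward` is a hypothetical helper, and it assumes the connection is not in autocommit mode, since a cursor declared without WITH HOLD only exists inside a transaction block.

```python
def fetch_forward(cur, select_sql, batch_size=10000, name="scroll_cursor_bd"):
    """Yield batches of rows using DECLARE / FETCH FORWARD / CLOSE.

    `cur` is an ordinary DB-API cursor on a connection that is NOT in
    autocommit mode. `select_sql` and `name` are pasted into the SQL text
    as-is, so pass only trusted, hard-coded strings.
    """
    cur.execute(f"DECLARE {name} SCROLL CURSOR FOR {select_sql}")
    while True:
        cur.execute(f"FETCH FORWARD {int(batch_size)} FROM {name}")
        rows = cur.fetchall()
        if not rows:
            break
        yield rows
    # reached only when the result set has been read to the end
    cur.execute(f"CLOSE {name}")


# Assumed usage:
#     cur = conn.cursor()
#     for rows in fetch_forward(cur, "SELECT * FROM My_Table"):
#         ...  # write `rows` to the next file
#     conn.commit()
```

In practice a psycopg2 named cursor takes care of these statements for you, so this form is mainly useful for seeing what happens on the server.
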

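Note that if you do not name the cursor in psycopg2, it is a client-side cursor rather than a server-side one, so execute() transfers the whole result set to the client at once.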

