
Python psycopg2 cursors

From the psycopg2 documentation:

When a database query is executed, the Psycopg cursor usually fetches all the records returned by the backend, transferring them to the client process. If the query returned an huge amount of data, a proportionally large amount of memory will be allocated by the client. If the dataset is too large to be practically handled on the client side, it is possible to create a server side cursor.

I would like to query a table with possibly thousands of rows and perform some action for each one. Will normal cursors actually bring the entire data set to the client? That doesn't sound very reasonable. The code is something along the lines of:

import psycopg2

conn = psycopg2.connect(url)
cursor = conn.cursor()
cursor.execute(sql)
for row in cursor:
    pass  # do some stuff with row
cursor.close()

I would expect this to be a streaming operation. A second question concerns the scope of cursors. Inside my loop I would like to update another table. Do I need to open and close a new cursor every time? Each item's update should be in its own transaction, as I might need to roll it back.

for row in cursor:
    anotherCursor = anotherConn.cursor()
    anotherCursor.execute(update)
    if somecondition:
        anotherConn.commit()
    else:
        anotherConn.rollback()  # parentheses are required to actually call rollback
cursor.close()

======== EDIT: MY ANSWER TO FIRST PART ========

Ok, I will try to answer the first part of my question. Normal cursors actually bring over the entire data set as soon as you call execute, before you even start iterating over the result set. You can verify that by checking the memory footprint of the process at each step. But the need for a server side cursor is actually due to the postgres server and not the client, and is documented here: http://www.postgresql.org/docs/9.3/static/sql-declare.html

Now, this is not immediately apparent from the documentation, but such cursors can actually be created temporarily for the duration of the transaction. There is no need to explicitly create a function in the database that returns a refcursor, with the specific SQL statement, etc. With psycopg2 you only need to give a name while obtaining the cursor, and a temporary cursor will be created for that transaction. So instead of:

 cursor = conn.cursor()

you just need to do:

 cursor = conn.cursor('mycursor')

That's it and it works. I assume the same thing is done under the covers when using JDBC and setting fetchSize. It's just a bit more transparent. See docs here: https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor
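As a rough illustration of the named-cursor approach, here is a minimal sketch of a generator that streams rows. The function name `stream_rows`, the cursor name `stream_cursor`, and the parameter names are made up for this example; `itersize` is the psycopg2 cursor attribute that controls how many rows are fetched from the server per round trip while iterating (roughly the counterpart of JDBC's fetchSize).

```python
def stream_rows(conn, sql, params=None, batch_size=2000, cursor_name="stream_cursor"):
    """Yield rows one at a time from a named (server-side) cursor.

    `conn` is assumed to be an open psycopg2 connection; the function name
    and the default cursor name are hypothetical, chosen for this sketch.
    """
    # Passing a name to conn.cursor() makes psycopg2 use a server-side cursor.
    with conn.cursor(name=cursor_name) as cursor:
        # Rows are pulled from the server in chunks of `itersize` during iteration.
        cursor.itersize = batch_size
        cursor.execute(sql, params)
        for row in cursor:
            yield row
```

It would be called with something like `for row in stream_rows(psycopg2.connect(url), sql): ...`. Only about `batch_size` rows should be held in client memory at a time, instead of the whole result set.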

You can test that this works by querying the pg_cursors view inside the same transaction. The server side cursor appears after obtaining the client side cursor and disappears after closing the client side cursor. So bottom line: I'm happy to make that change to my code, but I must say this was a big gotcha for someone not that experienced with postgres.
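The pg_cursors check could be sketched roughly as follows. The helper name `open_server_cursors` is invented for this example, and `conn` is assumed to be the same psycopg2 connection (and therefore the same transaction) that owns the named cursor. As far as I can tell, psycopg2 sends the DECLARE statement when execute() is called on the named cursor, so the entry would be expected to show up in the view after that call.

```python
def open_server_cursors(conn):
    """Return the names of the cursors currently open in this session.

    Hypothetical helper: it uses an ordinary (client-side) cursor on the
    same connection to read the pg_cursors system view.
    """
    with conn.cursor() as cur:
        cur.execute("SELECT name FROM pg_cursors")
        return [row[0] for row in cur.fetchall()]
```

Calling it before and after closing the named cursor should show the entry (for instance 'mycursor') appearing and then disappearing.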

Actually, you have already answered the question ;).

  1. Yes, you should use a server side cursor to get records streamed: http://initd.org/psycopg/docs/usage.html#server-side-cursors

From the docs:

CREATE FUNCTION reffunc(refcursor) RETURNS refcursor AS $$
BEGIN
    OPEN $1 FOR SELECT col FROM test;
    RETURN $1;
END;
$$ LANGUAGE plpgsql;

And in code:

cur1 = conn.cursor()
cur1.callproc('reffunc', ['curname'])

cur2 = conn.cursor('curname')
for record in cur2:     # or cur2.fetchone, fetchmany...
    # do something with record
    pass
  2. Yes, you should open a new cursor if you want to fetch rows with a server side cursor.
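Putting both parts together, here is a minimal sketch of reading with a named cursor on one connection while updating through a second connection, one transaction per row. The function name `update_each_row`, the cursor name `reader`, and the `should_commit` callback (standing in for `somecondition` from the question) are assumptions made for this example. A second connection is used because a named cursor declared without `withhold` lives only as long as its transaction, so committing on the reading connection would close it. The update cursor itself can be reused across iterations; it is the commit or rollback on the connection that ends each transaction.

```python
def update_each_row(read_conn, write_conn, select_sql, update_sql, should_commit):
    """Stream rows from read_conn and apply one update per row on write_conn.

    Every row's update is committed or rolled back on its own.
    Returns the number of committed updates.
    """
    committed = 0
    # Named (server-side) cursor: rows are streamed rather than loaded at once.
    with read_conn.cursor(name="reader") as reader:
        reader.execute(select_sql)
        # One ordinary cursor on the second connection, reused for every update.
        with write_conn.cursor() as writer:
            for row in reader:
                # The selected row is passed as the parameters of the update.
                writer.execute(update_sql, row)
                if should_commit(row):
                    write_conn.commit()
                    committed += 1
                else:
                    write_conn.rollback()
    return committed
```

Here the columns selected by `select_sql` are assumed to match the placeholders in `update_sql`; in real code the parameters would more likely be built explicitly from each row.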
