
Executing a large query with psycopg2

I'm trying to execute a large SELECT query (about 50 000 000 of 200 000 000 rows, 15 columns) and fetch all of the data into a pandas DataFrame using psycopg2. In the pgAdmin server status tool I can see that my query is active for about half an hour and then becomes idle. I read that this means the server is waiting for a new command. On the other hand, my Python script still doesn't have the data and is waiting for it too (there are no errors; it looks like the data is downloading).

To sum up: the database is waiting, Python is waiting — should I keep waiting? Is there a chance of a happy ending? Or is Python simply unable to process that much data?

Holy smokes, Batman! If your query takes more than a few minutes to execute, you ought to think of a different way to process your data! If you are returning 200 000 000 rows of 15 single-byte columns, that is already 3 gigabytes of raw data, assuming not a single byte of overhead, which is very unlikely. If those columns are 64-bit integers instead, that is already 24 gigabytes. This is a lot of data for Python to hold in memory.
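The back-of-envelope arithmetic above can be checked in a couple of lines (the 200 000 000 × 15 figures are the full-table size mentioned in the question):

```python
rows, cols = 200_000_000, 15

one_byte_total = rows * cols        # 1 byte per column value
int64_total = rows * cols * 8       # 8 bytes per 64-bit integer

print(one_byte_total)   # 3_000_000_000 bytes, i.e. ~3 GB
print(int64_total)      # 24_000_000_000 bytes, i.e. ~24 GB
```

And that is raw payload only — per-row and per-object overhead on the client side makes the real footprint considerably larger.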

Have you considered what happens if your process fails partway through, or if the connection is interrupted? Your program will benefit from processing the rows in chunks, if your workflow allows it. If it really is not possible, consider approaches that operate on the database itself, such as using PL/pgSQL.
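A minimal sketch of the chunked approach: a generator that drains any DB-API cursor in fixed-size batches, so each batch can be processed (aggregated, written to disk) and discarded instead of accumulating all 50 million rows at once. The connection string, query, and column names below are hypothetical placeholders. Note that with psycopg2 a plain cursor still buffers the whole result set client-side; a *named* cursor (`conn.cursor(name=...)`) declares a server-side cursor so rows stay on the server until fetched.

```python
def iter_chunks(cursor, chunk_size=100_000):
    """Yield lists of at most chunk_size rows from an open DB-API cursor."""
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield rows

# Hypothetical usage with psycopg2 + pandas (replace DSN/query/columns):
#
#   import pandas as pd
#   import psycopg2
#
#   conn = psycopg2.connect("dbname=mydb")         # your connection string
#   cur = conn.cursor(name="big_query")            # named => server-side cursor
#   cur.itersize = 100_000                         # rows per network round trip
#   cur.execute("SELECT a, b FROM big_table")      # your query
#
#   for chunk in iter_chunks(cur):
#       frame = pd.DataFrame(chunk, columns=["a", "b"])
#       ...  # aggregate or persist this chunk, then let it go
#
#   cur.close()
#   conn.close()
```

If you truly need one final DataFrame, `pandas.read_sql(..., chunksize=...)` offers the same streaming pattern, but at 50 million rows you should question whether the whole result must ever be in memory at once.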
