
How to speed up my Python code working with a DB?

I have a question about working with a large amount of data. I am working with Google BigQuery (I don't think the problem is specific to this DB) and need to SELECT data from one table, change it (using Python), and then INSERT it into another table. Could you tell me how I can speed up these operations? I use a for loop over each row returned by my SELECT, and processing even 15k rows takes a very long time. Maybe multithreading or some library could help me apply exactly the same function to all of my >15k rows in the DB. Thanks.
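
Roughly, the pattern looks like this (a simplified sketch using the google-cloud-bigquery client; the project/dataset/table names and the per-row change are just placeholders):

    from google.cloud import bigquery

    client = bigquery.Client()

    # placeholder table names
    rows = client.query(
        "SELECT id, value FROM `my_project.my_dataset.source_table`"
    ).result()

    for row in rows:
        # apply the same change to every row in Python
        changed = {"id": row["id"], "value": row["value"] * 2}
        # one insert call per row -- this is the slow part with 15k+ rows
        client.insert_rows_json("my_project.my_dataset.target_table", [changed])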

Some details about the process are missing (which DB server is it?).

Anyway, the best approach would be:

  • Fetch in buffered chunks: dbChunk = db_cursor.fetchmany(buffer_size)
  • Change the data in plain Python data structures (a list of tuples) ==> dbChunk2
  • Load it into the second table using db_cursor.executemany(InsertString, dbChunk2); see the sketch below
    • dbChunk2 is the updated list holding the fetched (and transformed) rows ([ (...), (...), ... ])

You can tune buffer_size to get the best results (start with 1000, I think).

Note: InsertString should include the columns, values, and bind variables, matching the SELECT statement.
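
Putting it together, here is a minimal sketch of that chunked fetch/transform/insert loop using the standard DB-API cursor methods. sqlite3 is only a stand-in for whatever driver your database provides, and the table names, columns, and the transformation are placeholders; adapt them to your schema. The bind-variable style (? here) also varies by driver.

    import sqlite3

    BUFFER_SIZE = 1000  # tune this; 1000 is a reasonable starting point

    conn = sqlite3.connect("example.db")  # stand-in for your real connection
    src_cursor = conn.cursor()
    dst_cursor = conn.cursor()

    src_cursor.execute("SELECT id, value FROM source_table")

    # InsertString: columns, values and bind variables matching the SELECT
    InsertString = "INSERT INTO target_table (id, value) VALUES (?, ?)"

    while True:
        dbChunk = src_cursor.fetchmany(BUFFER_SIZE)  # fetch a whole chunk, not one row
        if not dbChunk:
            break
        # change the data in plain Python structures (list of tuples) ==> dbChunk2
        dbChunk2 = [(row_id, value * 2) for (row_id, value) in dbChunk]
        dst_cursor.executemany(InsertString, dbChunk2)  # bulk insert the whole chunk

    conn.commit()
    conn.close()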
