
Pyodbc execute takes a long time on a lot of rows

I have the following code, and I'm trying to read a very big table with over 100M rows on MariaDB. In theory, execute should just set up the cursor, and each row should then be fetched as I iterate over it, or at least that is what the docs say.

import pyodbc

cnxn  = pyodbc.connect('DRIVER=/usr/lib/libmaodbc.so;socket=/var/run/mysqld/mysqld.sock;Database=101m;User=root;Password=123;Option=3;')
cursor = cnxn.cursor()
cursor.execute("select * from vat")
for row in cursor:
  print(row)

I tried the following versions of the code, but without success.

import pyodbc

cnxn  = pyodbc.connect('DRIVER=/usr/lib/libmaodbc.so;socket=/var/run/mysqld/mysqld.sock;Database=101m;User=root;Password=123;Option=3;')
with cnxn.cursor() as cursor:
  cursor.execute("select * from vat")
  for row in cursor:
    print(row)

import pyodbc

cnxn  = pyodbc.connect('DRIVER=/usr/lib/libmaodbc.so;Server=127.0.0.1;Database=101m;User=root;Password=123;Option=3;') # tcp instead of unix socket
with cnxn.cursor() as cursor:
  cursor.execute("select * from 101m") # another big table
  for row in cursor:
    print(row)

Update: Even without the for loop, the execute itself takes a long time. What I'm trying to do is copy data from a MariaDB server to an SQLite database.

According to your ODBC DSN, server and client are running on the same machine.

In your comment you mentioned that you want to move 100 GB of data with a single select. This will require a lot of memory:

  • 100 GB for the client network buffer
  • 100 GB for the client row buffer

Additionally, many GB will be used on the server side to prepare and send the result (note that the server is running on the same machine). This will lead to memory problems (swapping) and slow down your machine.

I would recommend fetching the data in portions and using the mariadb, pymysql, or mysqldb module instead.

Example (without any exception handling):

import mariadb

conn = mariadb.connect(database="101m", user="root", password="123", host="localhost")

cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM vat")
total = cursor.fetchone()[0]

# This value depends on your system configuration
step = 2048

for i in range(0, total, step):
    cursor.execute("SELECT * FROM vat LIMIT ?, ?", (i, step))
    rows = cursor.fetchall()
    # process result
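
Since the update mentions that the final goal is to copy the data into an SQLite database, the "# process result" step could, for example, write each chunk out with executemany. The following is only a rough, untested sketch under assumptions: the target file target.db and a pre-created SQLite table vat with the same columns are hypothetical, and the connection setup from above is repeated for completeness.

import sqlite3
import mariadb

src = mariadb.connect(database="101m", user="root", password="123", host="localhost")
dst = sqlite3.connect("target.db")  # hypothetical target file

src_cursor = src.cursor()
src_cursor.execute("SELECT COUNT(*) FROM vat")
total = src_cursor.fetchone()[0]

step = 2048  # chunk size, depends on your system configuration

for offset in range(0, total, step):
    src_cursor.execute("SELECT * FROM vat LIMIT ?, ?", (offset, step))
    rows = src_cursor.fetchall()
    if not rows:
        break
    # one "?" placeholder per column, so the INSERT matches the table layout
    placeholders = ", ".join("?" * len(rows[0]))
    dst.executemany("INSERT INTO vat VALUES (" + placeholders + ")", rows)
    dst.commit()

dst.close()
src.close()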

Another solution would be to use server-side cursors; however, this is only supported by the mariadb module:

from mariadb.constants import CURSOR

cursor = conn.cursor(cursor_type=CURSOR.READ_ONLY, prefetch_size=4096)
cursor.execute("SELECT * FROM vat")
for row in cursor:
    print(row)  # process row
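
With this approach the client only holds the prefetched rows at a time instead of the whole result set. As a rough, untested sketch of the same MariaDB-to-SQLite copy (again assuming a hypothetical target.db with a matching vat table), the rows could be flushed to SQLite in batches while iterating:

import sqlite3
import mariadb
from mariadb.constants import CURSOR

conn = mariadb.connect(database="101m", user="root", password="123", host="localhost")
dst = sqlite3.connect("target.db")  # hypothetical target file

cursor = conn.cursor(cursor_type=CURSOR.READ_ONLY, prefetch_size=4096)
cursor.execute("SELECT * FROM vat")

batch = []
for row in cursor:
    batch.append(row)
    if len(batch) >= 4096:  # flush to SQLite every 4096 rows
        placeholders = ", ".join("?" * len(row))
        dst.executemany("INSERT INTO vat VALUES (" + placeholders + ")", batch)
        dst.commit()
        batch = []

if batch:  # flush any remaining rows
    placeholders = ", ".join("?" * len(batch[0]))
    dst.executemany("INSERT INTO vat VALUES (" + placeholders + ")", batch)
    dst.commit()

dst.close()
conn.close()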
