
Optimizing the Python code for better performance

I have the following code, which looks up an ID in one table and inserts it into the other table. GENRETB contains about 2 million records and MOVIETB contains about 800,000 records. The code works, but it is very slow. I need help optimizing and improving the performance of this piece of code.

import sqlite3

conn = sqlite3.connect('movieDB.db')
print ("Opened database successfully");

cursor = conn.execute("SELECT MOVIENAME FROM GENRETB")

# For every movie name in GENRETB, look up its ID in MOVIETB
# and copy that ID back into GENRETB.
for row in cursor:
    mname = row[0]
    print(mname)
    cursor2 = conn.execute("SELECT ID FROM MOVIETB WHERE MOVIENAME = ?", (mname,))
    for row2 in cursor2:
        mid = row2[0]
        print(mid)
        conn.execute("UPDATE GENRETB SET ID = ? WHERE MOVIENAME = ?", (mid, mname))

conn.commit()
conn.close()

Thanks in advance

It is slow because the inner query runs once for every row of the outer one: you are essentially doing the join in Python. It is better to do the join in SQL.

Eg:

cursor = conn.execute("SELECT GENRETB.MOVIENAME, MOVIETB.MOVIENAME, 
MOVIETB.MID FROM GENRETB JOIN MOVIETB ON MMOVIETB.OVIENAME = GENRETB.MOVIENAME")

You can even do the whole update with a single statement instead of many separate updates. E.g.:

UPDATE GENRETB SET ID = (SELECT ID FROM MOVIETB WHERE MOVIENAME = GENRETB.MOVIENAME)

You may need to adjust the column names, because I don't know your exact schema.
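As a minimal sketch of that single-statement approach (assuming the MOVIETB column is really called ID, as in the question, and that the MOVIENAME values match exactly; the index name is my own choice), the whole job could look like this:

import sqlite3

conn = sqlite3.connect('movieDB.db')

# Optional but recommended: an index on the lookup column keeps the
# correlated subquery fast over ~800,000 MOVIETB rows.
conn.execute("CREATE INDEX IF NOT EXISTS idx_movietb_moviename ON MOVIETB (MOVIENAME)")

# One statement does the whole update; WHERE EXISTS leaves GENRETB rows
# without a matching movie untouched instead of setting their ID to NULL.
conn.execute("""
    UPDATE GENRETB
       SET ID = (SELECT ID FROM MOVIETB WHERE MOVIETB.MOVIENAME = GENRETB.MOVIENAME)
     WHERE EXISTS (SELECT 1 FROM MOVIETB WHERE MOVIETB.MOVIENAME = GENRETB.MOVIENAME)
""")

conn.commit()
conn.close()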

I don't know whether printing mid is needed. If it is, you can print from the single join query above, which will already be much faster. If not, you don't need a loop at all, only the one UPDATE statement.
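If the printed output really is needed, one option (again assuming the column names from the question) is to print from one join query instead of doing a per-row lookup:

# Print every matched (movie name, id) pair in a single query.
for mname, mid in conn.execute(
        "SELECT GENRETB.MOVIENAME, MOVIETB.ID "
        "FROM GENRETB JOIN MOVIETB ON MOVIETB.MOVIENAME = GENRETB.MOVIENAME"):
    print(mname, mid)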

Another question is whether what you are trying to do is a good idea at all; that depends on the relationship between your tables.
