
Script is taking too long

Here is my script:

import glob
from sqlalchemy import *

count = 0
served_imsi = []
served_imei = []
served_msisdn = []
sgsn_address = []
ggsn_address = []
charging_id = []
apn_network = []
location_area_code = []
routing_area = []
cell_identity = []
service_area_code = []
s_charging_characteristics = []
plmn_id = []

path = '/home/cneps/cdr/*.cdr'
for file in glob.glob(path):
    with open(file) as f:
        # Fixed-width parsing: each field lives at a known offset in the line.
        for line in f:
            served_imsi.append(line[17:17+16])
            served_imei.append(line[47:47+16])
            served_msisdn.append(line[65:65+18])
            sgsn_address.append(line[83:83+32])
            ggsn_address.append(line[115:115+32])
            charging_id.append(line[147:147+10])
            apn_network.append(line[157:157+63])
            location_area_code.append(line[296:296+4])
            routing_area.append(line[300:300+2])
            cell_identity.append(line[302:302+4])
            service_area_code.append(line[306:306+4])
            s_charging_characteristics.append(line[325:325+2])
            plmn_id.append(line[327:327+6])

db = create_engine('sqlite:///TIM_CDR.db', echo=True)
metadata = MetaData(db)
CDR1 = Table('CDR1', metadata, autoload=True)  # reflect the existing table's columns
i = CDR1.insert()

while count < len(served_imei):
    i.execute(Served_IMSI=served_imsi[count],
              Served_IMEI=served_imei[count],
              Served_MSISDN=served_msisdn[count],
              SGSN_Address=sgsn_address[count],
              GGSN_Address=ggsn_address[count],
              Charging_ID=charging_id[count],
              APN_Network=apn_network[count],
              LAC=location_area_code[count],
              RAC=routing_area[count],
              Cell_Identity=cell_identity[count],
              Service_Area_Code=service_area_code[count],
              S_Charging_Characteristics=s_charging_characteristics[count],
              PLMN_ID=plmn_id[count])
    count += 1

It takes quite a while to finish, since the data I'm inserting into my database is around 100k lines.

It's taking 30 minutes to complete.

I've already read about this, and I know I should probably use transactions, but I don't really know how to do it.

Can anyone show me, within my code, how to use a transaction so that everything is committed at the end?

That would be great, thanks.

See the test_alchemy_core function in this answer: Why is SQLAlchemy insert with sqlite 25 times slower than using sqlite3 directly? It shows you how to execute multiple inserts in one batch. Your problem is that you're executing one insert after another, which is always very slow.
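A minimal sketch of that batched approach: collect every row as a dict first, then hand the whole list to a single `execute()` call inside one transaction (SQLAlchemy turns this into an executemany). The table and column names below follow your code, but I've only kept two columns and made up the sample values for illustration; add the rest of your columns the same way.

```python
from sqlalchemy import create_engine, MetaData, Table, Column, String

# In-memory database for the sketch; use 'sqlite:///TIM_CDR.db' for real.
db = create_engine('sqlite:///:memory:')
metadata = MetaData()
CDR1 = Table('CDR1', metadata,
             Column('Served_IMSI', String),
             Column('Served_IMEI', String))
metadata.create_all(db)

# Build all rows up front instead of executing one insert per line.
# (These values are fabricated; in your script they would come from
# the served_imsi / served_imei lists parsed out of the CDR files.)
rows = [{'Served_IMSI': 'imsi%05d' % n, 'Served_IMEI': 'imei%05d' % n}
        for n in range(100000)]

# One transaction, one executemany: all 100k rows in a single batch.
with db.begin() as conn:
    conn.execute(CDR1.insert(), rows)
```

The `with db.begin()` block opens a connection, starts a transaction, and commits once on exit, so SQLite does a single fsync instead of one per row; that, plus the batched executemany, is where the speedup comes from.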
