Understanding Python cProfile output

My Python script parses files sequentially, does some simple data cleaning, and writes the result to a new CSV file. I'm using the csv module. The script takes an awfully long time to run.

The cProfile output is as follows: [image]

I have done a lot of googling before posting the question here.

Link to the image: [image link]

Here is the code for the function that is called:

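import csv
import os
from datetime import datetime
from time import time

# FILE_PATH (directory holding the CSV files) and db (the MongoDB database
# handle) are assumed to be defined elsewhere in the script.

def db_insert(coCode, bse):
    start = time()
    q = []
    clean_path = os.path.join(FILE_PATH, str(bse) + "_clean.csv")
    print(clean_path)
    f1 = open(clean_path)
    reader = csv.reader(f1)
    next(reader)  # skip the header row
    end = time()
    # print(end - start)
    for idx, row in enumerate(reader):
        ohlc = {}
        # Parsed only to validate the format; the result is overwritten
        # by the raw string on the next line.
        date = datetime.strptime(row[0], '%Y-%m-%d')
        date = row[0]
        row = row[1:6]
        (op, high, low, close, volume) = row
        ohlc[date] = {}
        ohlc[date]['open'] = op
        ohlc[date]['high'] = high
        ohlc[date]['low'] = low
        ohlc[date]['close'] = close
        ohlc[date]['volume'] = volume
        q.append(ohlc)
    end1 = time()
    # print(end1 - end)

    # Store all cleaned quotes for this code as a single document.
    db.quotes.insert({'bse': str(bse), 'quotes': q})
    # print(time() - end1)
    f1.close()

    q = []
    # The original opened str(bse) + "_clean.csv" again here, although it
    # printed the path below; reading the printed file is assumed to be
    # the intent.
    raw_path = os.path.join(FILE_PATH, str(coCode) + ".csv")
    print(raw_path)
    f2 = open(raw_path)
    reader = csv.reader(f2)
    next(reader)  # skip the header row
    for idx, row in enumerate(reader):
        ohlc = {}
        date = datetime.strptime(row[0], '%Y-%m-%d')
        date = row[0]
        # Columns 7 to 9 are optional; fall back to column 7 alone,
        # then to an empty string.
        try:
            extra = row[7] + row[8] + row[9]
        except IndexError:
            try:
                extra = row[7]
            except IndexError:
                extra = ''
        row = row[1:6]
        (op, high, low, close, volume) = row
        ohlc[date] = {}
        ohlc[date]['open'] = op
        ohlc[date]['high'] = high
        ohlc[date]['low'] = low
        ohlc[date]['close'] = close
        ohlc[date]['volume'] = volume
        ohlc[date]['extra'] = extra
        q.append(ohlc)
    db.quotes_unadjusted.insert({'bse': str(bse), 'quotes': q})
    f2.close()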

I found this explanation in an answer by John Machin:

ncalls is relevant only to the extent that comparing the numbers against other counts such as number of chars/fields/lines in a file may highlight anomalies; tottime and cumtime are what really matter. cumtime is the time spent in the function/method including the time spent in the functions/methods that it calls; tottime is the time spent in the function/method excluding the time spent in the functions/methods that it calls.
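To make the tottime/cumtime distinction concrete, here is a minimal sketch that is not taken from the script above. The functions slow_leaf, parent and find_row are hypothetical names made up for illustration. The sketch also reads the stats dictionary on the pstats.Stats object directly, which is an implementation detail rather than a documented interface, so treat that part as an assumption about how the standard library stores its data.

```python
import cProfile
import pstats


def slow_leaf():
    # Hypothetical worker: the time is spent in this function's own body.
    total = 0
    for i in range(200000):
        total += i * i
    return total


def parent():
    # Hypothetical caller: does almost nothing itself, so its tottime is
    # small, but its cumtime includes every call to slow_leaf().
    results = []
    for _ in range(5):
        results.append(slow_leaf())
    return results


def find_row(stats, name):
    # Assumption: stats.stats maps (filename, lineno, funcname) to
    # (primitive calls, total calls, tottime, cumtime, callers).
    for (_filename, _lineno, funcname), row in stats.stats.items():
        if funcname == name:
            _cc, ncalls, tottime, cumtime, _callers = row
            return ncalls, tottime, cumtime
    raise KeyError(name)


profiler = cProfile.Profile()
profiler.enable()
parent()
profiler.disable()

stats = pstats.Stats(profiler)
# Show the most expensive entries first, ordered by cumulative time.
stats.sort_stats('cumulative').print_stats(5)

leaf_ncalls, leaf_tottime, leaf_cumtime = find_row(stats, 'slow_leaf')
parent_ncalls, parent_tottime, parent_cumtime = find_row(stats, 'parent')
```

In the printed table, parent should show a large cumtime but a small tottime, while slow_leaf shows ncalls of 5 and a tottime close to its cumtime. Read that way, a function with a high tottime is where the work actually happens, and a function with a high cumtime but low tottime is only the route by which that work is reached.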
