Script stops during execution

Question

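I am investigating failed data migrations in my database. My Python script works on small amounts of data, but it currently stops partway through execution. The cmd window still shows the script as executing, but at some point it appears to stop doing anything, and I have to abort it manually with Ctrl+C.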

Code and comments below:

import collections
import csv

a=[]

with open('FailedIds.txt') as my_file:
    for line in my_file:
        a.append(line) #builds array of unique row IDs that failed in migration. Contains 680k rows.

with open("OldDbAll.txt", 'r') as f:
    l = list(csv.reader(f))
    dict = {i[0]:[(x) for x in i[1:]] for i in zip(*l)} #builds dictionary containing all rows and columns from our old DB, key = column header, values = arrays of values. Contains 3 million rows and 9 columns, 200MB in file size.

string=''
print("Done building dictionary")

with open('Fix.txt', 'w') as f:
  print(",".join(dict.keys()),file=f)
  for i in range(len(dict['UNIQUEID'])):
    for j in range(len(a)):
      if a[j].strip()==dict['UNIQUEID'][i]: #matching failure row ID to the dictionary unique ID array
        for key in dict:
          string+=dict[key][i]+"," #prints the data to be re-migrated
        print(string,file=f)
        string=''

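When I first ran this script overnight, I got about 50k rows after manually aborting the Python script. I assumed that was fine because my computer might have gone to sleep. However, this morning, after running the script all of yesterday and late into the night, I got only 1k rows. I plan to restart my computer and set it not to sleep next time, but I want all 600k+ rows as output, and I am currently nowhere near that number.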

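I have searched around, and Python's array size limit should be well above the sizes I am using, so something else must be causing the program to hang. Any ideas would be greatly appreciated!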

Answer 1

I believe this loop is the reason your code takes so long to run:

for key in dict:
  string+=dict[key][i]+"," #prints the data to be re-migrated
print(string,file=f)
string=''

String concatenation is slow, and this loop does a lot of it.

I don't think you need to concatenate at all. Just write to the file as you go:

for key in dict:
  f.write(dict[key][i]+",")
f.write("\n") #ends the row, as print(string,file=f) did in the original
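
Separately, and as an assumption about where the time goes rather than something measured here: the nested loop in the question compares each of the roughly 3 million rows against all 680k failed IDs, which is on the order of two trillion string comparisons. That alone could make the script look hung. A minimal sketch that loads the IDs into a set and streams the file with the csv module follows. The function name extract_failed_rows and its parameters are made up for illustration; the file names and the UNIQUEID column come from the question.

```python
import csv


def extract_failed_rows(failed_ids_path, old_db_path, out_path, id_column="UNIQUEID"):
    """Copy rows whose ID appears in failed_ids_path from old_db_path to out_path.

    Returns the number of data rows written. Hypothetical helper, not the
    asker's or answerer's code.
    """
    # A set gives average constant-time membership tests, so each row of the
    # old DB is checked once instead of being compared with every failed ID.
    with open(failed_ids_path) as ids_file:
        failed_ids = {line.strip() for line in ids_file if line.strip()}

    written = 0
    with open(old_db_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst, lineterminator="\n")

        header = next(reader)               # first line holds the column names
        id_index = header.index(id_column)  # raises ValueError if the column is missing
        writer.writerow(header)

        # Rows are streamed one at a time, so the 200MB file is never held
        # in memory as a whole.
        for row in reader:
            if len(row) > id_index and row[id_index] in failed_ids:
                writer.writerow(row)
                written += 1
    return written
```

With the files from the question this would be called as extract_failed_rows("FailedIds.txt", "OldDbAll.txt", "Fix.txt"). Unlike the original loop, the rows written this way have no trailing comma, and csv.writer takes care of quoting any field that itself contains a comma.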

Answer 1
1 accepted 2020-06-16 15:33:17

腳本在執行期間停止

問題描述

1 個解決方案

解決方案1 1 已采納 2020-06-16 15:33:17

解決方案1
1 已采納 2020-06-16 15:33:17