Script stops during execution

Question

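I am investigating data migration failures in my database. My Python script works on small amounts of data, but on the full data set it currently stalls partway through. The cmd window still shows the script as executing, but at some point it seems to stop making progress, and I have to abort it manually with Ctrl+C.

Code and comments below: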

import collections
import csv

a=[]

with open('FailedIds.txt') as my_file:
    for line in my_file:
        a.append(line) #builds array of unique row IDs that failed in migration. Contains 680k rows.

with open("OldDbAll.txt", 'r') as f:
    l = list(csv.reader(f))
    dict = {i[0]:[(x) for x in i[1:]] for i in zip(*l)} #builds dictionary containing all rows and columns from our old DB, key = column header, values = arrays of values. Contains 3 million rows and 9 columns, 200MB in file size.

string=''
print("Done building dictionary")

with open('Fix.txt', 'w') as f:
  print(",".join(dict.keys()),file=f)
  for i in range(len(dict['UNIQUEID'])):
    for j in range(len(a)):
      if a[j].strip()==dict['UNIQUEID'][i]: #matching failure row ID to the dictionary unique ID array
        for key in dict:
          string+=dict[key][i]+"," #prints the data to be re-migrated
        print(string,file=f)
        string=''

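When I first ran this script overnight, I got about 50k rows after manually aborting the Python script. I assumed that was fine because my computer might have gone to sleep. This morning, however, after running the script all of yesterday and late into the night, I got 1k rows. I plan to restart my computer and set it not to sleep next time, but I want all 600k+ rows as output, and right now I am nowhere near that number.

I searched around, and Python's array size limit should be far higher than what I am using, so something else must be causing the program to hang. Any ideas would be appreciated!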

Answer 1

I believe this loop is the reason your code takes so long to run:

for key in dict:
  string+=dict[key][i]+"," #prints the data to be re-migrated
print(string,file=f)
string=''

String concatenation is slow, and this loop does a lot of it.

I don't think you need to concatenate at all. Just write to the file as you go:

for key in dict:
  f.write(dict[key][i]+",")
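A minimal sketch of the same "write as you go" idea, under some assumptions of mine. It swaps the bare f.write calls for csv.writer, which handles the commas and line endings, so no row string is built by hand. It also goes one step beyond the suggestion above: the failed IDs are kept in a set, because the question's inner loop over all 680k IDs for each of the 3 million rows is very likely a much larger cost than the concatenation. The function name write_failed_rows is hypothetical, and I am assuming OldDbAll.txt is comma-separated with a header row that contains UNIQUEID, as the question's code implies.

```python
import csv

def write_failed_rows(failed_ids_path, old_db_path, out_path):
    """Copy every row whose UNIQUEID appears in failed_ids_path to out_path.

    Returns the number of data rows written (header not counted).
    """
    # A set gives constant-time membership tests, so each row of the
    # old DB is checked once instead of being compared to every failed ID.
    with open(failed_ids_path) as ids_file:
        failed_ids = {line.strip() for line in ids_file if line.strip()}

    written = 0
    with open(old_db_path, newline='') as src, \
         open(out_path, 'w', newline='') as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        # Assumption: the first row is the header and names the ID column.
        header = next(reader)
        id_col = header.index('UNIQUEID')
        writer.writerow(header)

        # Stream the file row by row; nothing is held in memory except
        # the set of failed IDs.
        for row in reader:
            if len(row) > id_col and row[id_col] in failed_ids:
                writer.writerow(row)  # written immediately, no string building
                written += 1
    return written
```

With the file names from the question, the call would be write_failed_rows('FailedIds.txt', 'OldDbAll.txt', 'Fix.txt'). Unlike the original loop, this does not leave a trailing comma on each line, and csv.writer ends lines with \r\n by default.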

Solution 1
1 accepted 2020-06-16 15:33:17
