Python readline() and Counter causes MemoryError on very long line
I'm running into a memory error.
from collections import Counter

pifile = 'pibillion.txt'
with open(pifile, "r+") as a:
    data = str(a.readline())
    c = Counter(data)
All my code does is read one very, very long line containing the digits of pi. The txt file is only 953 MB, and I have 8 GB of RAM. I'm guessing the error is that it runs into a string size limitation, but I'm not sure. The rest of the code inserts a line break after every two characters. Any help on how to proceed would be greatly appreciated.
The exact error I'm getting is this:
data = str(a.readline())
MemoryError
Python is not inherently lazy the way Haskell is, so readline() puts the entire line in memory at once. Add some string conversions on top of that and you run out of memory. Instead, process the file iteratively, a small piece at a time, as in the code that follows.
Note that I write to a new file rather than modifying the original: files are usually stored contiguously, so inserting into the middle of one is very expensive.
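with open('pibillion.txt', 'r') as old_file, open('pibillion_.txt', 'w') as new_file:
    while True:
        # Read two characters at a time instead of the whole line.
        c = old_file.read(2)
        if not c:
            # An empty string means the end of the file was reached.
            break
        new_file.write(c + '\n')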
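The same iterative idea can be applied to the Counter from the question. The following is only a minimal sketch, not part of the original answer: the function name count_chars and the 1 MiB default chunk size are assumptions chosen for illustration. It reads the file in fixed-size chunks and feeds each chunk to a single Counter, so at any moment only about one chunk of text needs to be held in memory.

```python
from collections import Counter


def count_chars(path, chunk_size=1 << 20):
    """Count characters in a file without loading it all at once.

    count_chars and chunk_size are hypothetical names for this sketch.
    """
    counts = Counter()
    with open(path, "r") as f:
        while True:
            chunk = f.read(chunk_size)  # at most chunk_size characters
            if not chunk:  # empty string means end of file
                break
            counts.update(chunk)  # add this chunk's character counts
    return counts
```

Calling count_chars('pibillion.txt') should then give the per-character counts that Counter(data) was meant to produce. A larger chunk size means fewer read calls at the cost of more memory per chunk.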