How to split a very long line into several lines in Python?
I need to add "\n" before each tag in my XML so that I can browse the file normally: about 300,000 lines have been merged into one (EmEditor opens the file and displays 16 lines).
However, when I try to read the file and replace the tags, Python raises a MemoryError.
for line in open('file.xml', encoding='UTF-8'):
    main_line = line.replace('<root>', '\n<root>')
    with open('the_file.xml', 'a', encoding='UTF-8') as x:
        x.write(main_line)
There is no other copy of the data, and doing the replacement by hand 300,000 times makes no sense.
Can I edit the file in Python without running into the memory error?
I did some searching and found this answer to a similar question: How to solve the memory error in Python.
TL;DR: You are probably running out of RAM. Install 64-bit Python or use a database like sqlite3, as the user ShadowRanger suggested.
I hope this is somewhat helpful.
You're re-opening your output file once per loop iteration. That's unnecessary, and it may be contributing to your out-of-memory issue. Consider opening the file only once instead:
with open('input_file.xml', 'r', encoding='UTF-8') as input_file, open(
        'output_file.xml', 'w', encoding='UTF-8') as output_file:
    for line in input_file:
        output_file.write(line.replace('<root>', '\n<root>'))
Or just use sed:
sed 's/<root>/\n<root>/g' input_file > output_file
a = "file.xml.xml"
b = "the_file.xml"
# Open both files once; "infile" avoids shadowing the built-in input().
with open(a, 'r', encoding='utf-8') as infile, open(b, 'w', encoding='utf-8') as out:
    for line in infile:
        main = line.replace('<root>', '\n<root>')
        out.write(main)
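One caveat about the loops above: if the whole file really is a single line, `for line in ...` still has to load that one line into memory in full, so a MemoryError may remain possible. A way around this is to read fixed-size chunks instead of lines. The following is only a minimal sketch under that assumption; the function name `insert_newlines`, its parameters, and the 1 MiB default chunk size are hypothetical choices for illustration, not something taken from the answers here.

```python
def insert_newlines(src_path, dst_path, tag="<root>", chunk_size=1024 * 1024):
    """Copy src_path to dst_path, writing a newline before each occurrence of tag."""
    # A tag cut off by a chunk boundary leaves at most len(tag) - 1 of its
    # characters at the end of the buffer, so that many are held back each round.
    keep = len(tag) - 1
    tail = ""
    with open(src_path, "r", encoding="UTF-8") as src, \
            open(dst_path, "w", encoding="UTF-8") as dst:
        while True:
            chunk = src.read(chunk_size)  # reads at most chunk_size characters
            if not chunk:
                break
            # Split on every complete tag; "rest" is the text after the last one.
            *complete, rest = (tail + chunk).split(tag)
            for part in complete:
                dst.write(part + "\n" + tag)
            # Write the part of "rest" that cannot be the start of a split tag
            # and carry the remainder over to the next chunk.
            safe = max(len(rest) - keep, 0)
            dst.write(rest[:safe])
            tail = rest[safe:]
        dst.write(tail)  # whatever is left contains no complete tag
```

With the file names from the question, the call would be `insert_newlines('file.xml', 'the_file.xml')`. Memory use then depends on `chunk_size` rather than on the length of the line, and the output should match what `str.replace` would produce on the whole text.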