Redirecting the print output to a .txt file in Python
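I am a beginner in Python. I have tried many of the approaches from Stack Overflow answers to this problem, but none of them work in my script.
I have this small script that works, but I cannot save its huge result to a .txt file so that I can analyze the data. How can I redirect the print output to a txt file on my computer?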
from nltk.util import ngrams
import collections

with open("text.txt", "rU") as f:
    sixgrams = ngrams(f.read().decode('utf8').split(), 2)

result = collections.Counter(sixgrams)
print result
for item, count in sorted(result.iteritems()):
    if count >= 2:
        print " ".join(item).encode('utf8'), count
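Just run it from the command line: python script.py > output.txt
Pick an output name other than text.txt: the script reads text.txt as its input, and the shell truncates the redirect target before the script starts.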
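If changing the command line is not convenient, a similar effect is possible from inside the script. The following is only a sketch of a different technique from the shell redirect above, assuming Python 3.4 or later, where contextlib.redirect_stdout is available. The run_report function and the output path are hypothetical stand-ins for the real script.

```python
import contextlib
import io
import os
import tempfile


def run_report():
    # Stand-in for the script body: anything printed here is captured.
    print("on Thursday", 3)
    print("100 \u20ac", 2)


# Hypothetical output location, chosen only for this demo.
out_path = os.path.join(tempfile.mkdtemp(), "output.txt")

with io.open(out_path, "w", encoding="utf-8") as f:
    # While this inner block runs, sys.stdout points at f.
    with contextlib.redirect_stdout(f):
        run_report()

print("written to", out_path)  # goes to the terminal again
```

Opening the file with an explicit encoding keeps the non-ASCII output independent of the terminal's settings.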
The print statement in Python 2.x supports redirection (>> fileobj):
...
with open('output.txt', 'w') as f:
    print >>f, result
    for item, count in sorted(result.iteritems()):
        if count >= 2:
            print >>f, " ".join(item).encode('utf8'), count
In Python 3.x, the print function accepts an optional keyword argument, file:
print("....", file=f)
If you add from __future__ import print_function in Python 2.6+, the approach above also works in Python 2.x.
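As a rough illustration of the file keyword, here is a minimal Python 3 sketch. It builds word pairs with zip instead of nltk so that it runs on its own; the sample words and the temporary output path are assumptions made for the demo, not part of the question's script.

```python
import collections
import os
import tempfile

words = "a b a b c".split()
# Adjacent word pairs, built with zip instead of nltk to stay self-contained.
bigrams = collections.Counter(zip(words, words[1:]))

# Hypothetical output location, chosen only for this demo.
out_path = os.path.join(tempfile.mkdtemp(), "output.txt")

with open(out_path, "w", encoding="utf-8") as f:
    for item, count in sorted(bigrams.items()):
        if count >= 2:
            # file=f sends the text to the file instead of the terminal.
            print(" ".join(item), count, file=f)
```

In Python 3 the strings are already text, so no .encode('utf8') call is needed; the file's encoding argument takes care of it.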
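With BufferedWriter you can do it like this (Python 2):
import io

pathOut = "output.txt"  # output file path (example name)
outs = io.BufferedWriter(io.FileIO(pathOut, "wb"))
# result is a Counter, so convert it to a string before writing
outs.write(str(result) + "\n")
for item, count in sorted(result.iteritems()):
    if count >= 2:
        outs.write(" ".join(item).encode('utf8') + " " + str(count) + "\n")
outs.flush()
outs.close()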
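For comparison, a Python 3 sketch of the same idea might look as follows. A binary writer only accepts bytes there, so each line is encoded explicitly. The counts dictionary and the temporary path are made-up demo data, not the result of the original script.

```python
import io
import os
import tempfile

# Made-up counts standing in for the Counter from the script.
counts = {("on", "Thursday"): 3, ("100", "\u20ac"): 2, ("very", "good"): 1}

# Hypothetical output location, chosen only for this demo.
path_out = os.path.join(tempfile.mkdtemp(), "output.txt")

outs = io.BufferedWriter(io.FileIO(path_out, "wb"))
try:
    for item, count in sorted(counts.items()):
        if count >= 2:
            line = "{} {}\n".format(" ".join(item), count)
            outs.write(line.encode("utf-8"))  # binary writers take bytes
finally:
    outs.close()  # close() flushes the buffer first
```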
As Antti mentioned, you should prefer python3 and leave all that annoying python2 junk behind. The following script works with both python2 and python3.
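To read and write files, use the open function from the io module, which is compatible with both python2 and python3. Always use the with statement to open resources such as files. with wraps the execution of a block in a Python context manager. File objects implement the context manager protocol and are closed automatically when the with block is left.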
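The automatic closing can be observed directly. This is a small sketch with a temporary demo path (an assumption for illustration); it records the file's closed attribute inside and after the with block.

```python
import io
import os
import tempfile

# Hypothetical file location, chosen only for this demo.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"hello\n")
    closed_inside = f.closed   # the file is still open inside the block

closed_after = f.closed        # the file is closed once the block is left
print(closed_inside, closed_after)
```

The same closing happens when the block is left because of an exception, which is the main reason to prefer with over a manual close() call.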
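Independent of Python, if you want to read a text file you need to know its encoding in order to read it correctly (if you are unsure, try utf-8 first). Also, the canonical spelling of the encoding name is utf-8 (utf8 is accepted as an alias), and the mode U is deprecated.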
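To see why the encoding matters, the sketch below writes a short string containing the Euro sign and reads it back three ways. The path and variable names are assumptions made for the demo.

```python
import io
import os
import tempfile

# Hypothetical file location, chosen only for this demo.
path = os.path.join(tempfile.mkdtemp(), "euro.txt")
text = u"100 \u20ac"  # "100 " followed by the Euro sign

with io.open(path, "w", encoding="utf-8") as f:
    f.write(text)

with io.open(path, "rb") as f:
    raw = f.read()       # the bytes on disk; the Euro sign takes three

with io.open(path, encoding="utf-8") as f:
    decoded = f.read()   # the right encoding restores the original text

with io.open(path, encoding="latin-1") as f:
    wrong = f.read()     # the wrong encoding silently yields garbage
```

Reading with the wrong encoding does not necessarily raise an error, which is why such bugs often show up only later as garbled characters in the output.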
#!/usr/bin/env python
# -*- coding: utf-8; mode: python -*-

from nltk.util import ngrams
import collections
import io, sys

def main(inFile, outFile):

    with io.open(inFile, encoding="utf-8") as i:
        sixgrams = ngrams(i.read().split(), 2)

    result = collections.Counter(sixgrams)
    templ = "%-10s %s\n"

    with io.open(outFile, "w", encoding="utf-8") as o:

        o.write(templ % (u"count", u"words"))
        o.write(templ % (u"-" * 10, u"-" * 30))

        # Sorting might be expensive. Before sort, filter items you don't want
        # to handle, btw. place *count* in front of the tuple.

        filtered = [ (c, w) for w, c in result.items() if c > 1]
        filtered.sort(reverse=True)

        for count, item in filtered:
            o.write(templ % (count, " ".join(item)))

if __name__ == '__main__':
    sys.exit(main("text.txt", "out_text.txt"))
With the input file text.txt:
At eight o'clock on Thursday morning and Arthur didn't feel very good
he missed 100 € on Thursday morning. The Euro symbol of 100 € is here
to test the encoding of non ASCII characters, because encoding errors
do occur only on Thursday morning.
I get the following output in out_text.txt:
count      words
---------- ------------------------------
3          on Thursday
2          Thursday morning.
2          100 €
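The filter-then-sort step can be tried without nltk. In this sketch zip(words, words[1:]) is used as a stand-in for ngrams(..., 2), which is an assumption made to keep the example self-contained; the text is the sample input shown above.

```python
import collections

text = u"""At eight o'clock on Thursday morning and Arthur didn't feel very good
he missed 100 \u20ac on Thursday morning. The Euro symbol of 100 \u20ac is here
to test the encoding of non ASCII characters, because encoding errors
do occur only on Thursday morning."""

words = text.split()
# Adjacent word pairs; a stand-in for nltk's ngrams(words, 2).
result = collections.Counter(zip(words, words[1:]))

# Filter first so that only the few repeated pairs get sorted,
# and put the count first so that the tuples sort by count.
filtered = [(c, w) for w, c in result.items() if c > 1]
filtered.sort(reverse=True)

for count, item in filtered:
    print("%-10s %s" % (count, " ".join(item)))
```

Because the count comes first in each tuple, pairs with equal counts fall back to comparing the words themselves, which is why "Thursday morning." is listed before "100 €".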