Running into issues opening/encoding a text file in Python
This is the original text:
Issue / Problem Encountered Solution / Lessons
• Sample result on the print out was reported with a
“sample not seen” message indication
• Symbol character (*, ?) next to the sample value
result
• Impact :
– Wrong result / NC generation
– Downtime and delay in Lot disposition
Check for print out errors like:
• If an error is displayed for example:
“sample not seen” refer to SOP- 013499 and repeat sample.
• Sample result should not have an
interrogation mark before the sample value.
• Impact to the area:
– Minimize possible OOS results – Minimize NC
– Reduce cost for OOS
Investigation
Always ensure to verify that the print out report:
Does not has the message “sample not seen” and the symbol
“sample not
seen” message
Sample result should not have an interrogation mark before the sample value
characters on the sample value result
Currently, I am using the following code to process the data:
for ix, f in enumerate(listdir(directory_learning_group)):
    if isfile(join(directory_learning_group, f)):
        if "OPL" in f:
            try:
                dataset_outer_folder_OPL.loc[ix, "ID"] = f.split('_')[0]
                dataset_outer_folder_OPL.loc[ix, "Filename"] = f
                # Open a file
                fd = io.open(directory_learning_group+'{}'.format(f), encoding = 'utf8', errors = 'ignore')
                # Reading text
                ret = fd.read()
                dataset_outer_folder_OPL.loc[ix, "Text"] = ret
            except:
                print(f)
dataset_learning_group_OPL= dataset_learning_group_OPL.reset_index(drop = True)
and I get the following result:
'A\x00M\x00L\x00 \x006\x00 \x00P\x00U\x00R\x00 \x00O\x00n\x00e\x00-\x00P\x00o\x00i\x00n\x00t\x00 \x00L\x00e\x00s\x00s\x00o\x00n\x00:\x00 \x00I\x00n\x00c\x00o\x00r\x00r\x00e\x00c\x00t\x00 \x00E\x00n\x00d\x00o\x00t\x00o\x00x\x00i\x00n\x00 \x00r\x00e\x00s\x00u\x00l\x00t\x00 \x00o\x00n\x00 \x00t\x00h\x00e\x00 \x00p\x00r\x00i\x00n\x00t\x00 \x00o\x00u\x00t\x00 \x00r\x00e\x00p\x00o\x00r\x00t\
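I can't understand what is going on here. The .txt file looks no different from the other files, which I can read without problems.
Even when we try to decode/encode it, it doesn't help at all.
Any help/guidance would be greatly appreciated.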
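You should probably post your entire code in the question. In any case, I tested the original text file you posted, and it works with the following code on Python 3.x: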
with open('10020_OPL Endotoxin testing.txt', 'rb') as f:
    file = f.readlines()
    print(file)
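One possible explanation, offered as a guess from the output in the question rather than something confirmed against the actual file: a `\x00` after every letter is what UTF-16 text typically looks like when it is decoded as UTF-8 with `errors='ignore'`. If that is the case here, reading the raw bytes and picking the codec from the byte-order mark should give clean text. The helper name `read_text_guessing_bom` and the sample file name below are made up for illustration, and the sketch only covers files that actually start with a BOM.

```python
import codecs
import os
import tempfile


def read_text_guessing_bom(path, fallback="utf-8"):
    """Read a text file, choosing the codec from a leading byte-order mark.

    Hypothetical helper: files without a BOM are decoded with `fallback`.
    """
    with open(path, "rb") as fh:
        raw = fh.read()
    # UTF-32 must be checked first: its little-endian BOM begins with the
    # same two bytes as the UTF-16 little-endian BOM.
    if raw.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return raw.decode("utf-32")
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")
    if raw.startswith(codecs.BOM_UTF8):
        return raw.decode("utf-8-sig")
    return raw.decode(fallback, errors="replace")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Made-up sample file, written as UTF-16 to mimic the suspected input.
        path = os.path.join(tmp, "sample_OPL.txt")
        with open(path, "w", encoding="utf-16") as fh:
            fh.write("AML 6 PUR")

        # Same call style as in the question: NUL bytes survive in the result.
        with open(path, encoding="utf8", errors="ignore") as fh:
            print(repr(fh.read()))

        # BOM-aware read: the text comes back without NUL bytes.
        print(repr(read_text_guessing_bom(path)))
```

If the files in the folder are known to all be UTF-16, passing `encoding='utf-16'` directly to `io.open` would be simpler. The BOM check is only useful when the folder mixes encodings, which the question seems to suggest since the other files read fine.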