Python script writing to Excel file using pandas
I'm a beginner. I extracted specific data from a .dat file and wrote it to an Excel file, but the data is incomplete.
Here are the contents of the .dat file:
03:53:56.172 Total: 683305, OK: 641643, NG: 39245, Retest: 2417 (Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 0)
04:37:09.070 Total: 751831, OK: 703329, NG: 45895, Retest: 2607 (Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 0)
05:07:19.632 Total: 751985, OK: 716798, NG: 35020, Retest: 167 (Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 1)
05:30:00.804 Total: 751946, OK: 720708, NG: 31115, Retest: 123 (Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 2)
This is the script I use to extract what I want:
import pandas as pd

file = open('C:/Users/user/OneDrive/Desktop/python/Event/Event_220816.dat')
for line in file:
    lst_time = [line.split()]
    time = [x[0] for x in lst_time]
    lst_input = [line.split()]
    input = [x[2] for x in lst_input]
    lst_output = [line.split()]
    output = [x[4] for x in lst_output]
    lst_lot = [line.split()]
    lot = [x[12] for x in lst_lot]
    lst_model = [line.split()]
    model = [x[10] for x in lst_model]
    lst_defects = [line.split()]
    defects = [x[6] for x in lst_defects]
    lst_retest = [line.split()]
    retest = [x[8] for x in lst_defects]
    #print('Time:', time, 'Lot No:', lot, 'Model:', model, 'Input:', input,
    #      'Output:', output, 'Defects:', defects, 'Retest:', retest)
    data_as_dictionary = {'Time': time, 'Lot No': lot, 'Model': model, 'Input': input,
                          'Output': output, 'Defects': defects, 'Retest': retest}
    df = pd.DataFrame(data_as_dictionary)
    df.to_excel('data.xlsx', index=None)
This is what the console output looks like:
Time: ['03:53:56.172'] Lot No: ['asas,'] Model: ['sdsd,'] Input: ['683305,'] Output: ['641643,'] Defects: ['39245,'] Retest: ['2417']
Time: ['04:37:09.070'] Lot No: ['asas,'] Model: ['sdsd,'] Input: ['751831,'] Output: ['703329,'] Defects: ['45895,'] Retest: ['2607']
Time: ['05:07:19.632'] Lot No: ['asas,'] Model: ['sdsd,'] Input: ['751985,'] Output: ['716798,'] Defects: ['35020,'] Retest: ['167']
Time: ['05:30:00.804'] Lot No: ['asas,'] Model: ['sdsd,'] Input: ['751946,'] Output: ['720708,'] Defects: ['31115,'] Retest: ['123']
But in the Excel file, only the last row is written:
Time Lot No Model Input Output Defects Retest
05:30:00.804 asas, sdsd, 751946, 720708, 31115, 123
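I would like all of the extracted data shown in the console output to be written to the Excel file. Can anyone tell me what is missing? Thank you very much.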
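As @T_C Molenaar mentioned, you create a new file on every iteration of the for loop, so each write overwrites the previous one. You may want to append instead, like this: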
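# mode='a' opens an existing workbook for appending (the file must already
# exist, and this mode needs the openpyxl engine); the data goes to a new sheet.
with pd.ExcelWriter('output.xlsx', mode='a') as writer:
    df.to_excel(writer, sheet_name='Sheet_name_3')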
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_excel.html
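If you would rather keep building one small DataFrame per line, as the question does, another option is to collect those frames in a list and write the file once after the loop. The following is only a minimal sketch under a few assumptions: the sample lines are pasted inline instead of being read from the .dat file, `line_to_frame` is a hypothetical helper name, and the final `to_excel` call is left commented out because it needs an Excel engine such as openpyxl installed.

```python
import pandas as pd

# Sample lines copied from the .dat file in the question.
SAMPLE_LINES = [
    "03:53:56.172 Total: 683305, OK: 641643, NG: 39245, Retest: 2417 "
    "(Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 0)",
    "04:37:09.070 Total: 751831, OK: 703329, NG: 45895, Retest: 2607 "
    "(Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 0)",
    "05:07:19.632 Total: 751985, OK: 716798, NG: 35020, Retest: 167 "
    "(Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 1)",
    "05:30:00.804 Total: 751946, OK: 720708, NG: 31115, Retest: 123 "
    "(Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 2)",
]


def line_to_frame(line):
    """Build a one-row DataFrame from one log line (hypothetical helper)."""
    # Split on whitespace and drop the trailing commas from each token.
    parts = [token.rstrip(",") for token in line.split()]
    return pd.DataFrame({
        "Time": [parts[0]],
        "Lot No": [parts[12]],
        "Model": [parts[10]],
        "Input": [parts[2]],
        "Output": [parts[4]],
        "Defects": [parts[6]],
        "Retest": [parts[8]],
    })


# Collect one frame per line instead of writing inside the loop.
frames = [line_to_frame(line) for line in SAMPLE_LINES]

# Stack the one-row frames into a single table with a fresh 0..n-1 index.
combined = pd.concat(frames, ignore_index=True)

# Write once, after the loop. Uncomment to produce the file
# (needs an Excel engine such as openpyxl installed).
# combined.to_excel("data.xlsx", index=False)
```

Because the file is written only once, nothing is overwritten, and `combined` holds one row per input line.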
This question has already been answered, but IMHO your line handling is odd.
I would do it like this:
import pandas as pd

with open("somefile.dat") as infile:
    result = []
    for line in infile.read().splitlines():
        splitline = line.replace(",", "").split()  # remove comma here
        result.append(
            {
                "Time": splitline[0],
                "Lot No": splitline[12],
                "Model": splitline[10],
                "Input": splitline[2],
                "Output": splitline[4],
                "Defects": splitline[6],
                "Retest": splitline[8],
            }
        )

df = pd.DataFrame(result)
df.to_excel("data.xlsx", index=None)
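Positional indices such as `splitline[12]` depend on every line having exactly the same number of tokens. As a possible alternative, here is a rough sketch that pulls the fields out by label with a regular expression and converts the counts to integers, so Excel receives numbers instead of text. The pattern assumes every line follows the layout shown in the question; `LINE_PATTERN`, `parse_line`, and `parse_lines` are hypothetical names, not part of pandas.

```python
import re

import pandas as pd

# Assumed layout, taken from the sample lines in the question:
# "<time> Total: N, OK: N, NG: N, Retest: N (Model: X, LOT: Y, ..."
LINE_PATTERN = re.compile(
    r"(?P<time>\S+) Total: (?P<total>\d+), OK: (?P<ok>\d+), "
    r"NG: (?P<ng>\d+), Retest: (?P<retest>\d+) "
    r"\(Model: (?P<model>[^,]+), LOT: (?P<lot>[^,]+),"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    return {
        "Time": match["time"],
        "Lot No": match["lot"],
        "Model": match["model"],
        "Input": int(match["total"]),
        "Output": int(match["ok"]),
        "Defects": int(match["ng"]),
        "Retest": int(match["retest"]),
    }


def parse_lines(lines):
    """Parse an iterable of lines into a DataFrame, skipping non-matching lines."""
    rows = [parse_line(line) for line in lines]
    return pd.DataFrame([row for row in rows if row is not None])


# Usage with one line from the question plus one line that should be skipped.
sample = (
    "05:07:19.632 Total: 751985, OK: 716798, NG: 35020, Retest: 167 "
    "(Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 1)"
)
df = parse_lines([sample, "not a log line"])
```

With a real file, you could pass the open file object straight to `parse_lines` and then call `df.to_excel("data.xlsx", index=False)` once, as in the answer above.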