
Python script writing to Excel file using pandas

I'm a beginner. I extracted specific data from a .dat file and wrote it to an Excel file, but the data in the Excel file is incomplete.

This is the content of the .dat file:

03:53:56.172 Total: 683305, OK: 641643, NG: 39245, Retest: 2417 (Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 0)
04:37:09.070 Total: 751831, OK: 703329, NG: 45895, Retest: 2607 (Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 0)
05:07:19.632 Total: 751985, OK: 716798, NG: 35020, Retest: 167 (Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 1)
05:30:00.804 Total: 751946, OK: 720708, NG: 31115, Retest: 123 (Model: sdsd, LOT: asas, Recipe: qwqw, Worker ID: VID, Rework: 2) 

This is the script I use to extract what I want:

import pandas as pd
file = open('C:/Users/user/OneDrive/Desktop/python/Event/Event_220816.dat')
for line in file:
    lst_time = [line.split()]
    time = [x[0] for x in lst_time]

    lst_input = [line.split()]
    input = [x[2] for x in lst_input]

    lst_output = [line.split()]
    output = [x[4] for x in lst_output]

    lst_lot = [line.split()]
    lot = [x[12] for x in lst_lot]

    lst_model = [line.split()]
    model = [x[10] for x in lst_model]

    lst_defects = [line.split()]
    defects = [x[6] for x in lst_defects]

    lst_retest = [line.split()]
    retest = [x[8] for x in lst_defects]

    print('Time:', time, 'Lot No:', lot, 'Model:', model, 'Input:', input,
          'Output:', output, 'Defects:', defects, 'Retest:', retest)

    data_as_dictionary = {'Time': time, 'Lot No': lot, 'Model': model, 'Input': input,
                          'Output': output, 'Defects': defects, 'Retest': retest}
    df = pd.DataFrame(data_as_dictionary)
    df.to_excel('data.xlsx', index=None)

This is what the console output looks like:

Time: ['03:53:56.172'] Lot No: ['asas,'] Model: ['sdsd,'] Input: ['683305,'] Output: ['641643,'] Defects: ['39245,'] Retest: ['2417']
Time: ['04:37:09.070'] Lot No: ['asas,'] Model: ['sdsd,'] Input: ['751831,'] Output: ['703329,'] Defects: ['45895,'] Retest: ['2607']
Time: ['05:07:19.632'] Lot No: ['asas,'] Model: ['sdsd,'] Input: ['751985,'] Output: ['716798,'] Defects: ['35020,'] Retest: ['167']
Time: ['05:30:00.804'] Lot No: ['asas,'] Model: ['sdsd,'] Input: ['751946,'] Output: ['720708,'] Defects: ['31115,'] Retest: ['123']

But in the Excel file, only the last row is written:

Time           Lot No   Model   Input   Output Defects Retest
05:30:00.804    asas,   sdsd,   751946, 720708, 31115,  123

I would like all of the extracted data shown in the console output to be written to the Excel file. Does anyone know what is missing? Thank you very much.

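As @T_C Molenaar mentioned, you create a new file on every iteration of the for loop, so each row overwrites the previous one. You probably want to append instead, like this: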

with pd.ExcelWriter('output.xlsx',
                    mode='a') as writer:  
    df.to_excel(writer, sheet_name='Sheet_name_3')

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_excel.html
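
One caveat with the snippet above: as far as I know, `mode='a'` on its own appends to the workbook (it adds sheets) and the default `if_sheet_exists='error'` raises if the sheet is already there, so it does not add rows to an existing sheet by itself. Below is a minimal sketch of appending rows instead, assuming pandas 1.4 or later with the openpyxl engine installed. The helper name `append_rows` and its default arguments are made up for illustration.

```python
from pathlib import Path


def append_rows(df, path="data.xlsx", sheet_name="Sheet1"):
    """Append the rows of `df` below any rows already in `sheet_name`.

    Hypothetical helper; assumes pandas >= 1.4 and openpyxl.
    """
    import pandas as pd  # imported lazily; only needed when writing

    path = Path(path)
    if not path.exists():
        # First call: create the workbook and write the header row once.
        df.to_excel(path, sheet_name=sheet_name, index=False)
        return

    # Later calls: reopen the workbook and write below the last used row.
    with pd.ExcelWriter(path, mode="a", engine="openpyxl",
                        if_sheet_exists="overlay") as writer:
        sheet = writer.sheets.get(sheet_name)
        # max_row is 1-based, startrow is 0-based, so max_row is the
        # index of the first free row on an existing, non-empty sheet.
        start = sheet.max_row if sheet is not None else 0
        df.to_excel(writer, sheet_name=sheet_name, startrow=start,
                    header=sheet is None, index=False)
```

Inside the original loop you would call `append_rows(df)` in place of `df.to_excel('data.xlsx', index=None)`. Reopening the workbook for every line is slow on large files, so collecting the rows first and writing once is usually the simpler choice.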

This question has already been answered, but IMHO your line handling is rather odd.

I would do it like this:

import pandas as pd

with open("somefile.dat") as infile:
    result = []
    for line in infile.read().splitlines():
        splitline = line.replace(",", "").split()  # remove comma here
        result.append(
            {
                "Time":    splitline[0],
                "Lot No":  splitline[12],
                "Model":   splitline[10],
                "Input":   splitline[2],
                "Output":  splitline[4],
                "Defects": splitline[6],
                "Retest":  splitline[8],
            }
        )

df = pd.DataFrame(result)
df.to_excel("data.xlsx", index=None)
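
If relying on fixed token positions feels fragile, a regular expression with named groups is one possible alternative. The sketch below assumes every line follows the layout of the sample in the question. The names `LINE_RE`, `parse_line` and `dat_to_excel` are made up for illustration, and the counts are converted to `int` so that Excel stores them as numbers rather than text.

```python
import re

# One named group per field; the pattern mirrors the sample lines in the question.
LINE_RE = re.compile(
    r"(?P<Time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"Total:\s*(?P<Input>\d+),\s*OK:\s*(?P<Output>\d+),\s*"
    r"NG:\s*(?P<Defects>\d+),\s*Retest:\s*(?P<Retest>\d+)\s*"
    r"\(Model:\s*(?P<Model>[^,]+),\s*LOT:\s*(?P<Lot>[^,]+),"
)

NUMERIC_FIELDS = ("Input", "Output", "Defects", "Retest")
COLUMNS = ["Time", "Lot No", "Model", "Input", "Output", "Defects", "Retest"]


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    row = match.groupdict()
    for key in NUMERIC_FIELDS:
        row[key] = int(row[key])  # store counts as numbers, not text
    row["Lot No"] = row.pop("Lot")  # group names cannot contain spaces
    return row


def dat_to_excel(dat_path, xlsx_path="data.xlsx"):
    """Parse every matching line of `dat_path` and write one Excel sheet."""
    import pandas as pd  # imported lazily; only needed when writing

    with open(dat_path) as infile:
        rows = [row for row in map(parse_line, infile) if row is not None]
    pd.DataFrame(rows, columns=COLUMNS).to_excel(xlsx_path, index=False)
```

Lines that do not match the pattern are skipped silently here; whether that is the right behaviour depends on what else the .dat file can contain.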
