
Reading from multiple txt files - strip data and save to xls

I'm very new to Python. So far I have written the code below, which searches a folder for text files, reads all the lines from each one, opens an Excel file and saves the lines in it. (I'm still unsure whether it does this for all the text files one by one.) Having run it, I only see the text data from one file saved in the Excel file (first column). Or it could be that it is overwriting the data from multiple text files into the same column until it finishes. Could anyone point me in the right direction on how to get it to write the stripped data from each text file to the next available column in Excel?

import os
import glob

list_of_files = glob.glob('./*.txt')

for fileName in list_of_files:
    fin = open( fileName, "r" )
    data_list = fin.readlines()
    fin.close() # closes file

    del data_list[0:17] 
    del data_list[1:27] # [*:*]

    fout = open("stripD.xls", "w")
    fout.writelines(data_list)
    fout.flush()
    fout.close()

This can be condensed to:

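import glob

list_of_files = glob.glob('./*.txt')

# Open the output once; the kept lines of every input file go into it.
with open("stripD.xls", "w") as fout:
    for fileName in list_of_files:
        with open(fileName, "r") as fin:
            data_list = fin.readlines()
        # Keep line 18 and everything from line 45 onwards, as the two
        # del statements do. Slices do not fail on short files.
        fout.writelines(data_list[17:18])
        fout.writelines(data_list[44:])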

Are you aware that writelines() doesn't introduce newlines? readlines() keeps the newlines when reading, so the elements of data_list that writelines() writes to the file already end with a newline, but writelines() does not add any itself.
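A small sketch of that behaviour, using in-memory files (io.StringIO) in place of real ones; the sample text is made up for illustration:

```python
import io

# readlines() keeps the trailing "\n" on each line it returns.
source = io.StringIO("alpha\nbeta\n")
lines = source.readlines()

# writelines() writes the strings exactly as given, with no separator.
kept = io.StringIO()
kept.writelines(lines)

# Once the newlines are stripped, the same call runs everything together.
stripped = io.StringIO()
stripped.writelines(line.rstrip("\n") for line in lines)

# So add the newline back explicitly when writing stripped strings.
fixed = io.StringIO()
fixed.writelines(line.rstrip("\n") + "\n" for line in lines)
```

In other words, the question's code only produces separate lines because it never strips what readlines() returned.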

You might want to check this out; for simple needs you can also use csv.

These lines are "interesting":

del data_list[0:17] 
del data_list[1:27] # [*:*]

You are deleting as many of the first 17 lines of your input file as exist, keeping the 18th (if it exists), deleting another 26 (if they exist), and keeping any following lines. This is a very unusual procedure, and is not mentioned at all in your description of what you are trying to do.
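To see what the two del statements leave behind, here is a sketch on a made-up 50-line list standing in for the readlines() output:

```python
# Stand-in for readlines() output: 50 lines labelled 1..50.
data_list = ["line %d\n" % n for n in range(1, 51)]

del data_list[0:17]   # drops lines 1-17; line 18 is now at index 0
del data_list[1:27]   # drops the next 26 lines, i.e. lines 19-44

# The same selection written as two slices of the untouched list.
original = ["line %d\n" % n for n in range(1, 51)]
same_result = original[17:18] + original[44:]
```

Both forms keep line 18 and lines 45 onwards. Neither raises an error on a file that is too short; they just keep less.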

Secondly, you are writing the output lines (if any) from each input file to the same output file, which is reopened in "w" mode and therefore truncated on every pass through the loop. At the end of the script, the output file will contain data from only the last input file. Don't change your code to use append mode ... opening and closing the same file all the time just to append records is very wasteful, and only justified if you have a real need to make sure that the data is flushed to disk in case of a power or other failure. Open your output file once, before you start reading files, and close it once when you have finished with all the input files.

Thirdly, any old arbitrary text file doesn't become an "Excel file" just because you have named it "something.xls". You should write it with the csv module and name it "something.csv". If you want more control over how Excel will interpret it, write an xls file using xlwt.
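A minimal sketch of the csv route. The question does not say how a line splits into fields, so splitting on whitespace is purely an assumption here, and lines_to_csv and the sample data are made up for illustration:

```python
import csv
import os
import tempfile


def lines_to_csv(lines, out_path):
    """Write each input line as one CSV row (fields split on whitespace, assumed)."""
    # newline="" lets the csv module control line endings itself.
    with open(out_path, "w", newline="") as fout:
        writer = csv.writer(fout)
        for line in lines:
            writer.writerow(line.split())


# Hypothetical kept lines, as readlines() would return them.
sample = ["1.0 2.0 3.0\n", "4.0 5.0 6.0\n"]
out_path = os.path.join(tempfile.mkdtemp(), "stripD.csv")
lines_to_csv(sample, out_path)

# Read the file back to check what Excel would be given.
with open(out_path, newline="") as fin:
    rows = list(csv.reader(fin))
```

Excel opens a .csv file directly and puts each field in its own cell.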

Fourthly, you mention "column" several times, but as you have not given any details about how your input lines are to be split into "columns", it is rather difficult to guess what you mean by "next available column". It is even possible to suspect that you are confusing columns and rows ... assuming each input file has at least 18 but no more than 44 lines, the 18th ROW of the last input file will be all you will see in the output file.
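If "next available column" means one spreadsheet column per input file, one possible reading can be sketched as follows. This is a guess at the intent, not something the question confirms; keep_lines, write_columns and the demo files are hypothetical names and data:

```python
import csv
import glob
import os
import tempfile
from itertools import zip_longest


def keep_lines(path):
    """Return line 18 and lines 45 onwards of a file, newlines stripped."""
    with open(path, "r") as fin:
        data_list = fin.readlines()
    kept = data_list[17:18] + data_list[44:]
    return [line.rstrip("\n") for line in kept]


def write_columns(paths, out_path):
    """Write one CSV column per input file, padding short columns with ''."""
    columns = [keep_lines(p) for p in paths]
    with open(out_path, "w", newline="") as fout:
        writer = csv.writer(fout)
        writer.writerow([os.path.basename(p) for p in paths])  # header row
        # zip_longest turns the list of columns into rows.
        for row in zip_longest(*columns, fillvalue=""):
            writer.writerow(row)


# Demo with two made-up input files of 46 and 45 lines.
work_dir = tempfile.mkdtemp()
for name, count in (("a.txt", 46), ("b.txt", 45)):
    with open(os.path.join(work_dir, name), "w") as f:
        f.writelines("%s-%d\n" % (name[0], n) for n in range(1, count + 1))

paths = sorted(glob.glob(os.path.join(work_dir, "*.txt")))
out_path = os.path.join(work_dir, "stripD.csv")
write_columns(paths, out_path)

with open(out_path, newline="") as fin:
    rows = list(csv.reader(fin))
```

Each whole kept line goes into a single cell here. If the lines themselves contain several fields, that split has to be decided first.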
