reading from multiple txt files - strip data and save to xls

I'm very new to Python. So far I have written the code below, which searches a folder for text files, reads all the lines from each one, opens an Excel file and saves the lines in it. (I'm still unsure whether it does this for all the text files one by one.) Having run it, I only see the text data of one file being read and saved into the Excel file (first column). Or it could be that it is overwriting the data from multiple text files into the same column until it finishes. Could anyone point me in the right direction on how to get it to write the stripped data to the next available column in Excel for each text file?

import os
import glob

list_of_files = glob.glob('./*.txt')

for fileName in list_of_files:
    fin = open( fileName, "r" )
    data_list = fin.readlines()
    fin.close() # closes file

    del data_list[0:17] 
    del data_list[1:27] # [*:*]

    fout = open("stripD.xls", "w")
    fout.writelines(data_list)
    fout.flush()
    fout.close()

This can be condensed to:

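import glob

list_of_files = glob.glob('./*.txt')

# Open the output file once, before looping over the input files
with open("stripD.xls", "w") as fout:
    for fileName in list_of_files:
        with open(fileName, "r") as fin:  # input file is closed automatically
            data_list = fin.readlines()
        # Keep the 18th line and everything from the 45th line onward;
        # slices do not raise IndexError when a file is too short
        fout.writelines(data_list[17:18])
        fout.writelines(data_list[44:])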

Are you aware that writelines() doesn't add newlines? readlines() keeps the newline at the end of each line it reads, so the elements of data_list already end in newlines when writelines() writes them to the file, but writelines() does not add any itself.
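A minimal sketch of that behaviour, using an in-memory io.StringIO buffer in place of a real file. The helper name write_lines is made up for this illustration.

```python
import io

def write_lines(lines):
    """Write lines with writelines() into an in-memory buffer and return the text."""
    buf = io.StringIO()
    buf.writelines(lines)
    return buf.getvalue()

# Elements without trailing newlines run together in the output
joined = write_lines(["a", "b", "c"])

# Elements shaped like readlines() output already end in "\n"
separated = write_lines(["a\n", "b\n", "c\n"])

print(repr(joined))
print(repr(separated))
```

If the lines come from somewhere other than readlines(), the newline has to be appended explicitly before calling writelines().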

You may want to check this out. For simple needs, you can also use csv.

These lines are "interesting":

del data_list[0:17] 
del data_list[1:27] # [*:*]

You are deleting as many of the first 17 lines of your input file as exist, keeping the 18th (if it exists), deleting another 26 (if they exist), and keeping any following lines. This is a very unusual procedure, and is not mentioned at all in your description of what you are trying to do.
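To see which lines survive the two del statements, here is a small sketch on a made-up 50-line input. The function names and the sample data are assumptions for illustration only; the slice version is an equivalent way to express the same selection.

```python
def strip_with_del(lines):
    """Apply the two del statements from the question to a copy of lines."""
    data = list(lines)
    del data[0:17]   # drops original lines 1-17
    del data[1:27]   # keeps original line 18, drops original lines 19-44
    return data

def strip_with_slices(lines):
    """Same selection expressed with slices: line 18 plus line 45 onward."""
    return lines[17:18] + lines[44:]

# Hypothetical input: 50 numbered lines
sample = ["line %d\n" % n for n in range(1, 51)]
kept = strip_with_del(sample)
print(kept[0].strip(), kept[1].strip(), len(kept))
```

Neither version raises an error on short input; a file with fewer than 18 lines simply yields an empty list.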

Secondly, you are writing the output lines (if any) from each input file to the same output file, which is reopened in "w" mode (and therefore truncated) on every pass through the loop. At the end of the script, the output file will contain data from only the last input file. Don't change your code to use append mode ... opening and closing the same file all the time just to append records is very wasteful, and only justified if you have a real need to make sure that the data is flushed to disk in case of a power or other failure. Open your output file once, before you start reading files, and close it once when you have finished with all the input files.
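A rough sketch of the "open once" pattern. The combine function, the file names and the temporary directory are all invented for the demonstration; the output name deliberately does not match the *.txt pattern so it is never read back as input.

```python
import glob
import os
import tempfile

def combine(pattern, out_path):
    """Copy the lines of every file matching pattern into out_path,
    opening the output file only once."""
    with open(out_path, "w") as fout:
        for name in sorted(glob.glob(pattern)):
            with open(name, "r") as fin:
                fout.writelines(fin.readlines())

# Demonstration with two small, made-up input files
with tempfile.TemporaryDirectory() as tmp:
    for name, text in [("a.txt", "from a\n"), ("b.txt", "from b\n")]:
        with open(os.path.join(tmp, name), "w") as f:
            f.write(text)
    out_path = os.path.join(tmp, "combined.out")
    combine(os.path.join(tmp, "*.txt"), out_path)
    with open(out_path) as f:
        combined = f.read()

print(combined)
```

Because the output is opened before the loop, data from every input file ends up in it, not just the last one.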

Thirdly, any old arbitrary text file doesn't become an "excel file" just because you have named it "something.xls". You should write it with the csv module and name it "something.csv". If you want more control over how Excel will interpret it, write an xls file using xlwt.
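As a hedged illustration of the csv route, the sketch below assumes each input line holds whitespace-separated fields, which the question does not actually state. The function name lines_to_csv and the sample lines are made up.

```python
import csv
import io

def lines_to_csv(lines, out):
    """Split each line on whitespace (an assumption) and write it as one CSV row."""
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        writer.writerow(line.split())

# In-memory demonstration; with a real file, use open("something.csv", "w", newline="")
buf = io.StringIO()
lines_to_csv(["1.0 2.0 3.0\n", "4.0 5.0 6.0\n"], buf)
csv_text = buf.getvalue()
print(csv_text)
```

Excel opens a file written this way as a table, with one cell per field.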

Fourthly, you mention "column" several times, but as you have not given any details about how your input lines are to be split into "columns", it is rather difficult to guess what you mean by "next available column". It is even possible to suspect that you are confusing columns and rows ... assuming less than 43 lines in each input file, the 18th ROW of the last input file will be all you will see in the output file.
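If "next available column" means that each input file should occupy its own column, one possible approach is to collect the kept lines per file and transpose them into rows before writing. This is only a guess at the intent; the names and data below are hypothetical.

```python
import csv
import io
from itertools import zip_longest

def columns_to_rows(columns, fill=""):
    """Turn a list of columns (one per input file) into rows,
    padding shorter columns with fill."""
    return [list(row) for row in zip_longest(*columns, fillvalue=fill)]

# Hypothetical stripped data from two input files of different lengths
file_a = ["a1", "a2", "a3"]
file_b = ["b1", "b2"]

rows = columns_to_rows([file_a, file_b])

buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerows(rows)
print(buf.getvalue())
```

CSV is written row by row, so the columns have to be assembled in memory first; zip_longest handles input files that keep different numbers of lines.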
