How do I write sections of a file into separate lists in python 3.0?

I have a text file like this:

line 1
line 2
.
.
END OF SECTION 1, BEGIN SECTION 2
line 100
line 101
.
.
END OF SECTION 2, BEGIN SECTION 3
line 999
line 1000
.
.
END OF SECTION 3, BEGIN SECTION 4
END OF SECTION 4, BEGIN SECTION 5
line 5000
line 5001
.
.
END OF SECTION 5
Q

So, this file has 5 sections with a variable # of records/lines of data per section (the sections don't all have the same # of lines, some sections don't even have any data at all).

My task is to read this file and strip each section into a list (so in my example, I will end up with 5 separate lists), which will then be written out to an excel workbook made up of worksheets containing the lists. Thus, I want to end up with 5 lists that I am calling:

section_01_log
section_02_log
section_03_log
section_04_log
section_05_log

and then, my excel workbook will have these 5 tabs/worksheets in it.


For now, I'm struggling with the first part (i.e., creating the lists) and would like some help. Once I get this, I will work on the second part, which is writing the lists to an Excel workbook.

Here's my code:

#read the file into a list named "input_file" already defined
datafile = open(os.path.join(path,'filename'))
for line in datafile:
    input_file.append(line)
datafile.close()

# parse the "input_file" list and write only section 1
for line in input_file:
    if line.startswith('END OF SECTION 1'):
        exit
    else:
        section_01_log.append(line)

Unfortunately, this does not work. section_01_log keeps getting written with the entire content of input_file. Why? How do I just segregate the first section into section_01_log, and then do the same for all the other sections?

You can read the file into a list like this:

myList = []
with open("test.txt", 'r') as fileopen:
    myList = [line.strip() for line in fileopen]
print (myList)

Output:

['line 1', 'line 2', 'END OF SECTION 1, BEGIN SECTION 2', 'line 100', 'line 101', 'END OF SECTION 2, BEGIN SECTION 3', 'line 999', 'line 1000', 'END OF SECTION 3, BEGIN SECTION 4', 'END OF SECTION 4, BEGIN SECTION 5', 'line 5000', 'line 5001', 'END OF SECTION 5']

If you want to write to an Excel file, I suggest doing it step by step:

  1. Isolate each section in its own list
  2. Create the Excel file
  3. Write your lists to the Excel file

1. An easy way to split the list (it's not very clean, and it doesn't scale if you have many sections):

section1 = myList[0:myList.index("END OF SECTION 1, BEGIN SECTION 2")]
section2 = myList[myList.index("END OF SECTION 1, BEGIN SECTION 2")+1 : myList.index("END OF SECTION 2, BEGIN SECTION 3")]
section3 = myList[myList.index("END OF SECTION 2, BEGIN SECTION 3")+1 : myList.index("END OF SECTION 3, BEGIN SECTION 4")]
section4 = myList[myList.index("END OF SECTION 3, BEGIN SECTION 4")+1 : myList.index("END OF SECTION 4, BEGIN SECTION 5")]
section5 = myList[myList.index("END OF SECTION 4, BEGIN SECTION 5")+1 : myList.index("END OF SECTION 5")]

Basically, you just use the indexes of the marker lines to slice the list. Easy, right?

2. Create the Excel file and its sheets. You'll need to import xlwt:
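
The same slicing idea can be written as a loop, so it doesn't need one line per section. This is only a sketch: split_sections is a hypothetical helper name, and it assumes that every marker line starts with "END OF SECTION" and that a lone "Q" ends the data, as in the question's sample file.

```python
def split_sections(lines):
    """Split a list of stripped lines into one list per section.

    Assumptions taken from the sample file: each section is closed by a
    line starting with "END OF SECTION", and a lone "Q" marks the end.
    Lines that follow the last marker are ignored.
    """
    sections = []
    current = []
    for line in lines:
        if line == "Q":
            break
        if line.startswith("END OF SECTION"):
            sections.append(current)  # close the section, even if it is empty
            current = []
        else:
            current.append(line)
    return sections


myList = [
    "line 1", "line 2",
    "END OF SECTION 1, BEGIN SECTION 2",
    "line 100", "line 101",
    "END OF SECTION 2, BEGIN SECTION 3",
    "END OF SECTION 3",
    "Q",
]
sections = split_sections(myList)
print(sections)
```

Here sections[0] plays the role of section1, sections[1] of section2, and so on; an empty section simply becomes an empty list.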

import xlwt

xl = xlwt.Workbook(encoding="utf-8")

section_01 = xl.add_sheet("section_01_log")
section_02 = xl.add_sheet("section_02_log")
section_03 = xl.add_sheet("section_03_log")
section_04 = xl.add_sheet("section_04_log")
section_05 = xl.add_sheet("section_05_log")

3. Write the lists to the Excel file and save it :)

for i, r in enumerate(section1):
    section_01.write(i, 0, r)
for i, r in enumerate(section2):
    section_02.write(i, 0, r)
for i, r in enumerate(section3):
    section_03.write(i, 0, r)
for i, r in enumerate(section4):
    section_04.write(i, 0, r)
for i, r in enumerate(section5):
    section_05.write(i, 0, r)

xl.save("logs.xls")

Like I said earlier, there are cleaner ways of doing this, but I'm a rookie...
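
One possibly cleaner variant is to create and fill the sheets in a loop. The sketch below is an assumption-laden illustration, not a tested xlwt recipe: write_sections is a hypothetical helper, and it relies only on the workbook having add_sheet(name) and each sheet having write(row, col, value), which are the same two calls used above.

```python
def write_sections(workbook, sections):
    """Write each list in `sections` to its own sheet, one value per row.

    `workbook` is assumed to behave like xlwt.Workbook: add_sheet(name)
    returns a sheet object that has write(row, col, value).
    """
    for number, rows in enumerate(sections, start=1):
        # Builds names such as "section_01_log", "section_02_log", ...
        sheet = workbook.add_sheet("section_{:02d}_log".format(number))
        for row_index, value in enumerate(rows):
            sheet.write(row_index, 0, value)


# Possible usage with xlwt (third-party, must be installed separately):
# import xlwt
# xl = xlwt.Workbook(encoding="utf-8")
# write_sections(xl, [section1, section2, section3, section4, section5])
# xl.save("logs.xls")
```

An empty section still gets its own, empty, worksheet, which matches the requirement of one tab per section.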

The reason your code isn't working is that exit doesn't do what you think it does. Assuming you want to break out of the for-loop, what you want is the break statement. exit is a built-in constant which, when called (like so: exit()), raises SystemExit; it is meant to be a convenient way to leave the interactive interpreter. Since you didn't call it, the bare name is just an expression that evaluates to that object and does nothing, so your for-loop keeps going.

https://docs.python.org/2/library/constants.html#exit
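
Applied to the loop from the question, swapping exit for break looks roughly like this. The input_file contents here are made-up sample data standing in for the lines read from the real file:

```python
# Sample data standing in for the lines read from the file (assumption).
input_file = [
    "line 1\n",
    "line 2\n",
    "END OF SECTION 1, BEGIN SECTION 2\n",
    "line 100\n",
]

section_01_log = []
for line in input_file:
    if line.startswith("END OF SECTION 1"):
        break  # leave the for-loop at the end of section 1
    section_01_log.append(line)

print(section_01_log)  # ['line 1\n', 'line 2\n']
```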

The following approach should work for a file that has more or fewer than 5 sections, as long as it is structured like the example you gave, and it uses pretty basic, imperative Python. I assume 'Q' is being used as a sentinel value to signal the end of the file:

with open('testing.txt') as f:
    log = {1:[]}
    i = 1
    new_section = False # flag to prevent creating sections just for sentinel
    for line in f:
        line = line.strip()

        if line == 'Q': # if we have reached the end of the file
            break
        elif new_section:
            i += 1
            log[i] = []
            new_section = False

        if line.startswith('END OF SECTION'):
            new_section = True
        else:
            log[i].append(line)

log is now a dictionary like this:

{1: ['line 1', 'line 2', 'line 3', 'line4'],
 2: ['line 100', 'line 101', 'line 102', 'line 103'],
 3: ['line 999', 'line 1000', 'line 1001', 'line 1003'],
 4: [],
 5: ['line 5000', 'line 5001', 'line 5002', 'line 5003']}

Which was made from this example text file:

line 1
line 2
line 3
line4
END OF SECTION 1, BEGIN SECTION 2
line 100
line 101
line 102
line 103
END OF SECTION 2, BEGIN SECTION 3
line 999
line 1000
line 1001
line 1003
END OF SECTION 3, BEGIN SECTION 4
END OF SECTION 4, BEGIN SECTION 5
line 5000
line 5001
line 5002
line 5003
END OF SECTION 5
Q
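
If the lists need the names from the question (section_01_log and so on), the dictionary keys can be turned into those names. A minimal sketch, assuming log looks like the dictionary shown above; named_logs is a hypothetical variable name and the data is shortened for illustration:

```python
# Shortened stand-in for the `log` dictionary built by the loop (assumption).
log = {
    1: ["line 1", "line 2"],
    2: ["line 100", "line 101"],
    3: [],
}

# Map 1 -> "section_01_log", 2 -> "section_02_log", ...
named_logs = {
    "section_{:02d}_log".format(number): lines
    for number, lines in log.items()
}

print(sorted(named_logs))
# ['section_01_log', 'section_02_log', 'section_03_log']
```

Keeping the lists in one dictionary rather than in five separate variables means the same code keeps working when the number of sections changes.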
