
Parse text file in Python

I have a txt file and I want to learn how to parse it in Python.

txt file:

April 2011
05.05.2013 8:30 20:50

(here I can have different data)

How can I parse this file and store each value in a separate variable?

Example output:

month = "April 2011"
mydate = "05.05.2013"
time_start = "8:30"
time_stop = "20:50"

The key here is to first analyze your input file format and what you want out of it. Let's consider your input data:

April 2011
05.05.2013 8:30 20:50

What do we have here?

The first line has the month and the year separated by a space. If you want "April 2011" together as a single Python label (variable), you can read the entire file using the readlines() method; the first item of the resulting list will be "April 2011" followed by a newline character, which strip() removes.
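As a rough sketch of that first step: the snippet below writes the two sample lines to a temporary file (the path and the name sample_path are made up for illustration, so that the example runs on its own) and then reads the month back with readlines().

```python
import os
import tempfile

# Hypothetical sample file, created only so the sketch is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, "w") as f:
    f.write("April 2011\n05.05.2013 8:30 20:50\n")

with open(sample_path) as f:
    lines = f.readlines()  # list of lines, each still ending in "\n"

# strip() drops the trailing newline (and any surrounding whitespace)
month = lines[0].strip()
print(month)  # April 2011
```

With your own data, you would skip the part that creates the file and just open your existing txt file.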

On the next line, we have the date and two time "fields", each separated by a space. As per your output requirements, you want each of these in a separate Python label (variable), so just reading the second line is not sufficient: you have to split it into its individual fields. The split() method is useful here. Here is an example:

>>> s = '05.05.2013 8:30 20:50'
>>> s.split()
['05.05.2013', '8:30', '20:50']

As you can see, now you have the fields separated as items of a list. You can now easily assign them to separate labels (or variables).
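One way to do that assignment, sketched here under the assumption that the line always contains exactly three space-separated fields, is sequence unpacking:

```python
line = "05.05.2013 8:30 20:50"

# Unpack the three items returned by split() into three names.
# This raises ValueError if the line does not have exactly three fields.
mydate, time_start, time_stop = line.split()

print(mydate)      # 05.05.2013
print(time_start)  # 8:30
print(time_stop)   # 20:50
```

If some lines may have more or fewer fields, it is safer to check the length of the list before unpacking.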

Depending on what other data you have, take the same approach: first analyze how each line of the file is laid out, then decide how to extract the data you need from it.

with open('file') as f:
    text = f.read()

lines = text.split('\n')          # one item per line of the file
month = lines[0]
fields = lines[1].split(' ')      # the date and the two times
mydate = fields[0]
time_start = fields[1]
time_stop = fields[2]

with open("Input.txt") as inputFile:
    lines = [line for line in inputFile]
    # first line is the month; second line splits into date, start and stop
    month, mydate, time_start, time_stop = [lines[0].strip()] + lines[1].strip().split()
    print(month, mydate, time_start, time_stop)

Output

April 2011 05.05.2013 8:30 20:50

File a.txt:

April 2011
05.05.2013 8:30 20:50
May 2011
08.05.2013 8:32 21:51
June 2011
05.06.2013 9:30 23:50
September 2011
05.09.2013 18:30 20:50

Python code:

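import itertools

my_list = list()
with open('a.txt') as f:
    # Passing the same file iterator twice yields the lines two at a time.
    # In Python 3 the function is zip_longest (izip_longest in Python 2).
    for line1, line2 in itertools.zip_longest(*[f]*2):
        if line2 is None:
            # odd number of lines: a month line with no data line after it
            break
        mydate, time_start, time_stop = line2.split()
        my_list.append({
            'month': line1.strip(),
            'mydate': mydate,
            'time_start': time_start,
            'time_stop': time_stop,
        })

print(my_list)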
