Extracting a specific PIECE of a line from a textfile(Python)

Question

I have a textfile with the format:

3rd Year:

MECN3010 PREREQ MECN2011 COREQ TIMES1 TIMES2 MO3, MO4, FR5, TH1, TH2

MECN3012 PREREQ MECN2012 COREQ TIMES1 TUA, WE3, TH1, TH2 TIMES2

How can I extract just a particular part of a line?

For example, suppose I want to extract just the

PREREQ MECN2011

part from the 2nd line.

I'm able to read in the particular line I want, but I don't know how to split / strip just the info I need.

Answer 1

If all the lines you are interested in contain PREREQ MECNYYYY, where YYYY is a four-digit number, you can use a regular expression like the following:

EDIT: corrected the code

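import re
# example line; in practice, line holds the text line you read from the file
line = "MECN3010 PREREQ MECN2011 COREQ TIMES1 TIMES2 MO3, MO4, FR5, TH1, TH2"
regex = r'PREREQ MECN\d{4}'
matcher = re.search(regex, line)
if matcher:
    match = matcher.group()  # gives the actual match, 'PREREQ MECN2011'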

Answer 2

Try this. You can use split and join.

lines = '''3rd Year:
MECN3010 PREREQ MECN2011 COREQ TIMES1 TIMES2 MO3, MO4, FR5, TH1, TH2
MECN3012 PREREQ MECN2012 COREQ TIMES1 TUA, WE3, TH1, TH2 TIMES2'''

for line in lines.splitlines()[1:]:
    # tokens 1 and 2 of each course line are 'PREREQ' and the prerequisite code
    print(" ".join(line.split()[1:3]))
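
Since the question is about lines read from a text file, here is a minimal sketch of the same split-and-join idea applied to a file-like object. It assumes every course line has PREREQ as its second token; the helper name prereq_fields and the file name in the comment are made up for illustration.

```python
import io

# io.StringIO stands in for a real file here; with an actual file you would
# use something like open("courses.txt") instead (file name is hypothetical).
sample_file = io.StringIO(
    "3rd Year:\n"
    "\n"
    "MECN3010 PREREQ MECN2011 COREQ TIMES1 TIMES2 MO3, MO4, FR5, TH1, TH2\n"
    "\n"
    "MECN3012 PREREQ MECN2012 COREQ TIMES1 TUA, WE3, TH1, TH2 TIMES2\n"
)

def prereq_fields(lines):
    """Collect 'PREREQ <code>' from every line whose second token is PREREQ."""
    results = []
    for line in lines:
        tokens = line.split()
        # Skip headings such as '3rd Year:' and blank lines.
        if len(tokens) >= 3 and tokens[1] == "PREREQ":
            results.append(" ".join(tokens[1:3]))
    return results

found = prereq_fields(sample_file)
print(found)  # ['PREREQ MECN2011', 'PREREQ MECN2012']
```

Checking tokens[1] before slicing means the heading and blank lines are skipped by content rather than by position, so the [1:] on splitlines() is not needed.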

Answer 3

Let's say you've found the line you're interested in:

line = "MECN3010 PREREQ MECN2011 COREQ TIMES1 TIMES2 MO3, MO4, FR5, TH1, TH2"

You have a few ways to extract a given field:

1) Token-based

>>> tokens = line.split()
>>> tokens
['MECN3010', 'PREREQ', 'MECN2011', 'COREQ', 'TIMES1', 'TIMES2', 'MO3,', 'MO4,', 'FR5,', 'TH1,', 'TH2']
>>> tokens[2]
'MECN2011'
>>> tokens[5]
'TIMES2'

Basically, you first split the line into a list of tokens (here done with split()), then select the one you are interested in with basic list indexing.

If you're interested in multiple tokens, you can slice them out and re-join them:

>>> ' '.join(tokens[1:3])
'PREREQ MECN2011'
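
If you need several fields at once, the token list can also be folded into a dictionary. This is only a sketch: it assumes that PREREQ, COREQ, TIMES1 and TIMES2 act as labels for the tokens that follow them, which is a guess based on the two sample lines, and parse_course_line is a made-up helper name.

```python
# Assumed field labels, inferred from the sample lines in the question.
KEYWORDS = ("PREREQ", "COREQ", "TIMES1", "TIMES2")

def parse_course_line(line):
    """Group the tokens of a course line under the keyword that precedes them."""
    tokens = line.split()
    record = {"course": tokens[0]}
    current = None
    for token in tokens[1:]:
        if token in KEYWORDS:
            current = token
            record[current] = []          # start a new, possibly empty, field
        elif current is not None:
            record[current].append(token.rstrip(","))  # drop trailing commas
    return record

line = "MECN3010 PREREQ MECN2011 COREQ TIMES1 TIMES2 MO3, MO4, FR5, TH1, TH2"
record = parse_course_line(line)
print(record["PREREQ"])  # ['MECN2011']
print(record["TIMES2"])  # ['MO3', 'MO4', 'FR5', 'TH1', 'TH2']
```

Fields with nothing after their keyword (COREQ and TIMES1 in this line) come back as empty lists.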

2) Position-based

>>> line[16:24]
'MECN2011'
>>> line[38:44]
'TIMES2'

If the parts of the line you are looking for are at a known offset from the beginning of the line, you can use string slicing syntax. Note that this only works if every line has its fields at exactly the same positions.
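
If the offsets are not fixed, one option is to compute them with str.find instead of hard-coding them. A rough sketch, assuming the value you want is the single space-delimited word right after a keyword; value_after is a hypothetical helper name.

```python
def value_after(line, keyword):
    """Return the word that follows 'keyword ' in line, or None if absent."""
    marker = keyword + " "
    start = line.find(marker)
    if start == -1:
        return None
    start += len(marker)              # first character after the keyword
    end = line.find(" ", start)       # next space ends the word
    if end == -1:
        end = len(line)               # the word runs to the end of the line
    return line[start:end]

line = "MECN3010 PREREQ MECN2011 COREQ TIMES1 TIMES2 MO3, MO4, FR5, TH1, TH2"
print(value_after(line, "PREREQ"))  # MECN2011
print(value_after(line, "COREQ"))   # TIMES1
```

The second call shows the limit of this approach: COREQ has no value in this line, so the next keyword is returned instead. Check the result against your known keywords if fields can be empty.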

3) Regex

>>> import re
>>> re.search(r'(TIMES\d)', line).groups()
('TIMES1',)
>>> re.findall(r'TIMES\d', line)
['TIMES1', 'TIMES2']

This is a bit more advanced, and full coverage is outside the scope of this answer, but the documentation for Python's re module explains it in detail.
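
As a small taste of what regexes add over splitting, named groups let you pull out several pieces in one pass. This is a sketch under the assumption that course codes are always four capital letters followed by four digits, as in the sample data; the group names are arbitrary.

```python
import re

# Assumed code format: four capital letters plus four digits (e.g. MECN3010).
COURSE_RE = re.compile(r'(?P<course>[A-Z]{4}\d{4}) PREREQ (?P<prereq>[A-Z]{4}\d{4})')

line = "MECN3010 PREREQ MECN2011 COREQ TIMES1 TIMES2 MO3, MO4, FR5, TH1, TH2"
m = COURSE_RE.search(line)
if m:
    course = m.group("course")
    prereq = m.group("prereq")
    print(course, prereq)  # MECN3010 MECN2011
```

re.search returns None when nothing matches, so the if guard avoids an AttributeError on lines such as the '3rd Year:' heading.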

3 answers

solution1
0 2014-04-28 12:12:14

solution2
0 2014-04-28 12:30:11

solution3
0 ACCEPTED 2014-04-28 12:30:50
