Using Python to extract data from text table

Question

I have a programme which outputs a large data file. I want to extract a specific part of the data which looks like this:

BREAKDOWN BY GROUP

GROUP       DECAY CONST (S**-1)   FRACTION         REL FRACTION
       1    0.012467              61.02546E-06           0.01756
       2    0.028292             605.84209E-06           0.17430
       3    0.042524             194.54899E-06           0.05597
       4    0.133042             549.34727E-06           0.15805
       5    0.292467               1.03141E-03           0.29673
       6    0.666488             490.61031E-06           0.14115
       7    1.634781             353.35857E-06           0.10166
       8    3.554600             189.75023E-06           0.05459

RELATIVE FRACTION

I want to write a function which:

1. Searches through the output file for the keyword "BREAKDOWN".
2. Reads the data between "BREAKDOWN" and the next keyword "RELATIVE".
3. Extracts only the data, not the headings.

I only actually need the DECAY CONST column (the first column of data after the GROUP index), but I think it would be easier to read the whole table in and then pick out that column rather than try to ignore the other data at this stage.

I have defined these two keywords as start_identifier and end_identifier as I would like to be able to reuse this code for other similar purposes. So far this is what I have:

def read_data_from_file(file_name, start_identifier, end_identifier):
    list_of_results = []
    with open(file_name) as f:
        t = f.read()
        t = t[t.find(start_identifier):]
        t = t[t.find(start_identifier):t.find(end_identifier)]
        t = t.replace('\n', '').split()
        t = [float(i) for i in t if not i.isidentifier()]
        list_of_results.extend(t)
    return list_of_results

Any help most gratefully appreciated!

Answer 1

Just playing, but here's one way. It treats the table as fixed-width columns and cuts each data line at fixed character positions:

from io import StringIO

data = '''\
BREAKDOWN BY GROUP

GROUP       DECAY CONST (S**-1)   FRACTION         REL FRACTION
       1    0.012467              61.02546E-06           0.01756
       2    0.028292             605.84209E-06           0.17430
       3    0.042524             194.54899E-06           0.05597
       4    0.133042             549.34727E-06           0.15805
       5    0.292467               1.03141E-03           0.29673
       6    0.666488             490.61031E-06           0.14115
       7    1.634781             353.35857E-06           0.10166
       8    3.554600             189.75023E-06           0.05459

RELATIVE FRACTION 
'''
f = StringIO(data)

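START, STOP = 'BREAKDOWN', 'RELATIVE'
# Character ranges of the four fixed-width columns in each data line.
SLICES = [
    slice(0, 8),    # GROUP
    slice(8, 31),   # DECAY CONST (S**-1)
    slice(31, 46),  # FRACTION
    slice(46, 65),  # REL FRACTION
]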

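# Collect the non-blank lines that sit between the START and STOP markers.
begin = False
lines = []
for line in f:
    if line.startswith(START):
        begin = True
        continue

    if line.startswith(STOP):
        begin = False
        continue

    if begin:
        if not line.strip():
            continue  # skip blank lines inside the block

        lines.append(line)

# lines[0] is the heading row, so parse only the rows after it.
values = []
for line in lines[1:]:
    values.append([float(line[s].strip()) for s in SLICES])

# Transpose rows into columns; columns[0] is GROUP, columns[1] is DECAY CONST.
columns = list(zip(*values))

print(columns[1])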
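
If you want to reuse this with other keyword pairs, the same idea can be wrapped in a function with the signature from the question. This is only a sketch under a few assumptions: the columns are separated by whitespace (so it splits on whitespace instead of using the fixed-width slices above), each marker appears at the start of a line, and any line that does not parse entirely as numbers is treated as a heading. The helper names parse_block and column are made up for illustration.

```python
def parse_block(lines, start_identifier, end_identifier):
    """Collect numeric rows between a start marker line and an end marker line.

    `lines` is any iterable of strings (an open file, a list of lines, ...).
    """
    rows = []
    in_block = False
    for line in lines:
        stripped = line.strip()
        if not in_block:
            # Ignore everything until the start marker appears.
            in_block = stripped.startswith(start_identifier)
            continue
        if stripped.startswith(end_identifier):
            break  # end marker reached; stop reading
        if not stripped:
            continue  # blank line inside the block
        try:
            rows.append([float(field) for field in stripped.split()])
        except ValueError:
            # Heading rows (e.g. "GROUP  DECAY CONST ...") are not all numeric.
            continue
    return rows


def read_data_from_file(file_name, start_identifier, end_identifier):
    """Same signature as in the question; returns a list of rows of floats."""
    with open(file_name) as f:
        return parse_block(f, start_identifier, end_identifier)


def column(rows, index):
    """Pull a single column out of the parsed rows."""
    return [row[index] for row in rows]


# Small in-memory check using the first two rows of the table from the question.
sample = """\
BREAKDOWN BY GROUP

GROUP       DECAY CONST (S**-1)   FRACTION         REL FRACTION
       1    0.012467              61.02546E-06           0.01756
       2    0.028292             605.84209E-06           0.17430

RELATIVE FRACTION
"""
rows = parse_block(sample.splitlines(), 'BREAKDOWN', 'RELATIVE')
decay_constants = column(rows, 1)  # index 1 is the DECAY CONST column
print(decay_constants)
```

With a real file you would call read_data_from_file(your_file, 'BREAKDOWN', 'RELATIVE') and then take column(rows, 1). Splitting on whitespace is more forgiving than fixed slices if the column widths shift, but it will not work if a field can be empty.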

solution1
1 ACCEPTED 2021-01-13 14:04:06
