
Reading a data file with a varying number of columns in Python

I have a data file whose first 8 lines look like this (actual values replaced by letters for clarity):

    a,b,c
    d
    e,f
    g,h
    i,j,k
    l
    m,n
    o,p

These represent data about transformers in an electric network. The first 4 lines are information about transformer 1, the next four about transformer 2 and so on.

The variables a through p are either integers, floating-point numbers, or strings.

I need to write a script in Python so that, instead of the data for one transformer being spread over 4 lines, it is all on one line.

More precisely, I would like the above 8 lines to be converted into

    a,b,c,d,e,f,g,h
    i,j,k,l,m,n,o,p

and write this to another data-file.
How do I do this?

If the information about one item always spans exactly 4 lines (the number of fields on each line doesn't matter), you could do it like this:

with open('your_data_file.txt', 'r') as i, open('output_file.txt', 'w') as o:
    new_info = 4
    for line in i:
        o.write(line.strip())  # use .strip() to remove new line character
        new_info -= 1
        if new_info == 0:
            o.write('\n')  # begin info of new transformer in new line
            new_info = 4
        else:
            o.write(',')  # write a , to separate the data fields, but not at
                          # the end of a line

This code opens an input file and an output file, and writes every 4 lines of the input as a single line of the output.
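The same counting idea can be wrapped in a function that works on any iterable of lines, which makes it easy to check against the sample data from the question. This is only a minimal sketch: the name `merge_groups` and the `group_size` parameter are hypothetical, and it assumes an incomplete trailing group should still be written out rather than dropped.

```python
def merge_groups(lines, group_size=4):
    """Join every `group_size` consecutive lines into one comma-separated line."""
    merged = []
    current = []
    for line in lines:
        current.append(line.strip())  # strip the newline character
        if len(current) == group_size:
            merged.append(','.join(current))
            current = []
    if current:  # assumption: keep an incomplete final group as-is
        merged.append(','.join(current))
    return merged


# The 8 sample lines from the question
sample = ["a,b,c\n", "d\n", "e,f\n", "g,h\n", "i,j,k\n", "l\n", "m,n\n", "o,p\n"]
print(merge_groups(sample))  # ['a,b,c,d,e,f,g,h', 'i,j,k,l,m,n,o,p']
```

Because a file object is itself an iterable of lines, `merge_groups(open_file)` would work the same way on the real input.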

Use the grouper recipe from itertools:

from itertools import zip_longest  # named izip_longest in Python 2

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


with open('z.t') as f:
    # fillvalue='' pads a short final group with empty strings instead of None,
    # so .rstrip() still works on the padding
    for x in grouper(f, 4, fillvalue=''):
        print(','.join(y.rstrip() for y in x))

a,b,c,d,e,f,g,h
i,j,k,l,m,n,o,p
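When the number of lines is not a multiple of four, the grouper recipe pads the last group with `fillvalue`. With the default of `None`, calling `.rstrip()` on the padding raises an `AttributeError`. One possible way around this, sketched here for Python 3 (where the function is named `zip_longest`), is to drop the padding before joining. The six-line input is made up for illustration.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# Hypothetical input: 6 lines, so the second group is two lines short
lines = ["a,b,c\n", "d\n", "e,f\n", "g,h\n", "i,j,k\n", "l\n"]

# Skip the None padding so the short group is joined without errors
merged = [','.join(y.rstrip() for y in group if y is not None)
          for group in grouper(lines, 4)]
print(merged)  # ['a,b,c,d,e,f,g,h', 'i,j,k,l']
```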

Assuming this data pattern persists throughout the entire input file...

First, you'll need to read the file containing the data (filename is a string: the path to the file).

with open(filename, "r") as f:   # open in read mode; closed automatically
    content = f.read()           # read everything as one string

Once you've read the contents of the file into a string (content), it's just a matter of gathering all the data, dividing it and then re-forming it.

Assuming each transformer is associated with 8 values:

# strip() drops the trailing newline, which would otherwise become an empty last value
content = content.strip().replace('\n', ',')   # put everything on one line
values = content.split(',')                    # split it all up

lines = []
for i in range(0, len(values), 8):          # iterate by 8 elements
    lines.append(",".join(values[i:i+8]))   # merge these values and add to lines

output = "\n".join(lines) + "\n"            # merge these lines (via new lines)

You would then write output to a file:

with open(newfile, "w") as f:  # open the new file in write mode; it doesn't have to exist yet
    f.write(output)

How about this:

import itertools

# From itertools recipes
def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    # zip_longest is the Python 3 name; Python 2 calls it izip_longest
    return itertools.zip_longest(*args, fillvalue=fillvalue)

with open('output', 'w') as fout, open('filename') as fin:
    # Flatten every line into one stream of fields
    fields = itertools.chain.from_iterable(
        line.strip().split(',') for line in fin)
    # Write the fields back out eight per line, padding a short last row with '-'
    fout.writelines(','.join(tup) + '\n' for tup in grouper(fields, 8, '-'))

This chains together all the fields in all the lines as a single iterable, and then groups them into chunks of 8, and then writes them out to the new file.

This recipe doesn't care how many columns are on each line; the count could even change throughout the file. It simply takes the fields as consecutive 8-tuples, padding the last one with '-' if it comes up short.
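To see that the per-line column count really is irrelevant to this approach, here is a small sketch in Python 3 that runs the same chain-and-group steps on in-memory lines instead of files. The function name `regroup_fields` and the irregular sample input are made up for illustration.

```python
from itertools import chain, zip_longest


def regroup_fields(lines, n=8, fillvalue='-'):
    """Flatten all comma-separated fields, then re-join them n per row."""
    fields = chain.from_iterable(line.strip().split(',') for line in lines)
    args = [iter(fields)] * n  # n references to the same iterator
    return [','.join(tup) for tup in zip_longest(*args, fillvalue=fillvalue)]


# Hypothetical input: 10 fields spread unevenly over 4 lines
irregular = ["a,b\n", "c,d,e,f\n", "g,h,i\n", "j\n"]
print(regroup_fields(irregular))
# ['a,b,c,d,e,f,g,h', 'i,j,-,-,-,-,-,-']
```

The flip side of this design is that the grouping only lines up with transformers if every transformer contributes exactly 8 fields; if that count can vary, grouping by lines (4 per transformer) is the safer choice.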
