Output data to a text file as it streams in, in Python

The following code reads a file line by line. What is the most efficient way to write each line to a text file (output.txt) as the lines are read from the input file?

fileHandle = open('file', 'r')

for line in fileHandle:
    fields = line.split('|')

    print(fields[0]) # prints the first fields value
    print(fields[1]) # prints the second fields value

fileHandle.close()

The code above comes from "Parsing a pipe delimited file in python".

An efficient way is to use a generator together with context managers. The context managers take care of closing the files, and the generator yields one line at a time instead of building a temporary list first.

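with open('read_file', 'r') as reader:
    with open('output_file', 'w') as writer:
        # Generator expression: splits one line at a time, no temporary list
        gen = (line.rstrip('\n').split('|') for line in reader)
        for row in gen:
            # write() only accepts a string, so join the fields back together
            writer.write('|'.join(row) + '\n')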
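
As a minimal sketch of the same row-at-a-time idea, the standard csv module can do the splitting and joining. This assumes the goal is to keep only the first two pipe-separated fields, as in the question. The function name stream_fields and the in-memory io.StringIO objects are illustrative stand-ins for real files, not part of the original answer.

```python
import csv
import io


def stream_fields(reader, writer, count=2, sep='|'):
    """Copy the first `count` fields of each row from reader to writer."""
    rows = csv.reader(reader, delimiter=sep)  # lazy: yields one row per line
    out = csv.writer(writer, delimiter=sep, lineterminator='\n')
    for row in rows:
        # Each row is written as soon as it is read; nothing is accumulated
        out.writerow(row[:count])


# Hypothetical sample data standing in for the input file
source = io.StringIO("A|B|C\n1|2|3\n4|5|6\n")
target = io.StringIO()
stream_fields(source, target)
result = target.getvalue()
print(result, end='')
```

With real files, pass objects opened with open(..., newline='') instead of the StringIO buffers, as the csv documentation recommends.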

Here is the pandas version where we can do this in a very readable way:

import pandas as pd
df = pd.read_csv('infile.csv', sep="|")
df.iloc[:,:2].to_csv('outfile.csv', sep="|", index=False)

The key here is:

df.iloc[:,:2] # selects the first two columns

Example by creating a file-like object:

import io
s = u"""A|B|C
1|2|3
4|5|6"""

file = io.StringIO(s)

import pandas as pd
df = pd.read_csv(file, sep="|")
output = df.iloc[:,:2].to_csv(sep="|", index=False)
print(output)

Returns:

A|B
1|2
4|5

This is how I do it: open the input file in read mode and the output file in write mode. I got this approach from a Python textbook, which notes that Python can have multiple files open at once. Just make sure to pass file=outfile to each print call, and close both files when you are done.

infile = open(infileName,'r')
outfile = open(outfileName,'w')

for line in infile:
    fields = line.split('|')

    print(fields[0], file=outfile)
    print(fields[1], file=outfile)

infile.close()
outfile.close()
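
The same approach can also be sketched with a with statement, so both files are closed even if an error occurs partway through. The function name write_first_two_fields, the temporary directory, and the sample data are assumptions made for illustration. The sketch also assumes every line has at least two fields, otherwise fields[1] raises IndexError.

```python
import os
import tempfile


def write_first_two_fields(infile_name, outfile_name):
    """Print the first two pipe-separated fields of each line to the output file."""
    # One with statement manages both files and closes them on exit
    with open(infile_name, 'r') as infile, open(outfile_name, 'w') as outfile:
        for line in infile:
            fields = line.rstrip('\n').split('|')
            print(fields[0], file=outfile)
            print(fields[1], file=outfile)


# Demo with hypothetical sample data in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    in_path = os.path.join(tmp, 'input.txt')
    out_path = os.path.join(tmp, 'output.txt')
    with open(in_path, 'w') as f:
        f.write("a|b|c\nd|e|f\n")
    write_first_two_fields(in_path, out_path)
    with open(out_path) as f:
        written = f.read()

print(written, end='')
```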
