
Split a txt file into N lines each?

I would like to split a very large .txt file into equal parts, each containing N lines, and save the parts to a folder.

with open('eg.txt', 'r') as T:
    while True:
        next_n_lines = islice(T, 300)
        f = open("split" + str(x.pop()) + ".txt", "w")
        f.write(str(next_n_lines))
        f.close()

But this creates files containing the text

"<itertools.islice object at 0x7f8fa94a4940>"

instead of the lines from the original file.

I would like to preserve the same structure and style maintained in the original txt file.

This code also does not terminate when it reaches the end of the file. If possible, I would like the code to stop writing files and quit once there is no data left to write.

You can use iter with islice, taking n lines at a time, and enumerate to give your files unique names. f.writelines will write each list of lines to a new file:

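from itertools import islice

with open('eg.txt') as T:
    # iter(callable, sentinel) keeps calling the lambda until it returns []
    for i, sli in enumerate(iter(lambda: list(islice(T, 300)), []), 1):
        with open("split_{}.txt".format(i), "w") as f:
            f.writelines(sli)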

Your code loops forever because it has no break condition. Passing an empty list as the sentinel to iter makes the loop end once the file iterator has been exhausted.
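To make the stopping behaviour concrete, here is a minimal sketch of the same two-argument iter idiom wrapped in a function that also writes the parts into a folder, as the question asks. The names split_file and out_dir and the split_{}.txt pattern are illustrative assumptions, not part of the original answer.

```python
from itertools import islice
from pathlib import Path


def split_file(src, n, out_dir):
    """Write consecutive n-line parts of src into out_dir; return their paths.

    Hypothetical helper, for illustration only.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with open(src) as lines:
        # iter(callable, sentinel) calls the lambda repeatedly and stops
        # as soon as it returns the sentinel, here an empty list (end of file).
        chunks = iter(lambda: list(islice(lines, n)), [])
        for i, chunk in enumerate(chunks, 1):
            part = out / "split_{}.txt".format(i)
            with open(part, "w") as f:
                f.writelines(chunk)
            written.append(part)
    return written
```

Because the last chunk is simply shorter than n lines, no padding or special casing is needed, and an empty input file produces no output files at all.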

Also, if you wanted to write an islice object directly, you would call writelines on it, i.e. f.writelines(next_n_lines), rather than writing str(next_n_lines).

The problem is that itertools.islice returns an iterator, and you are writing its str to your file. For such an object, str gives only the default representation (the type name and the object's identity), not its contents:

<itertools.islice object at 0x7f8fa94a4940>
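The difference between the iterator's representation and its contents can be checked on an in-memory file. This is only a small sketch: io.StringIO stands in for the real file, and the variable names are illustrative.

```python
import io
from itertools import islice

source = io.StringIO("a\nb\nc\nd\n")

# str() of the islice object yields only its repr; no lines are read.
as_repr = str(islice(source, 2))

# writelines() consumes the iterator and writes the actual lines.
target = io.StringIO()
target.writelines(islice(source, 2))
written = target.getvalue()
```

Here as_repr begins with "<itertools.islice object at", while written holds the first two lines of the source unchanged.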

As a more Pythonic way of slicing an iterator into equal parts, you can use the following grouper function, which is given in the itertools recipes section of the Python documentation:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
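One detail worth checking before using grouper on a file: when the number of lines is not a multiple of n, the last chunk is padded with the fill value. The sketch below, with a made-up list of lines standing in for a file, shows the padding and one way to drop it.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# Illustrative data: five lines split into chunks of two.
lines = ["a\n", "b\n", "c\n", "d\n", "e\n"]
chunks = list(grouper(lines, 2))
# The last chunk is padded with the fill value: ("e\n", None)

# Drop the padding before writing, since writelines() rejects None.
cleaned = [[line for line in chunk if line is not None] for chunk in chunks]
```

Passing fillvalue="" instead of filtering would also work for text files, since writing an empty string adds nothing to the output.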

You can pass your file object as the iterable to this function, then loop over the result and write each partition to a file:

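with open('eg.txt', 'r') as T:
    for i, partition in enumerate(grouper(T, 300), 1):
        # The last partition is padded with None (the fillvalue), so drop it.
        # Any other modification of the lines can also be done here.
        lines = [line for line in partition if line is not None]
        with open("split_{}.txt".format(i), "w") as f:
            f.writelines(lines)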

