
Split a text file (line by line) into different files

I'm looking for a way to split data line by line using Python. Which approach is best?

  • RegEx?
  • Contain?

For example, the file "file" contains:

X
X
Y
Z
Z
Z

I need a clean way to split this file into 3 different files, based on the letter on each line.

As a sample (pseudocode):

def split_by_platform(FILE_NAME):

    with open(FILE_NAME, "r") as infile:
        for each line in infile:
            if the line contains "X"
                write it to x.txt
            if the line contains "Y"
                write it to y.txt
            if the line contains "Z"
                write it to z.txt

x.txt file will look like:

X
X

y.txt file will look like:

Y

z.txt file will look like:

Z
Z
Z

EDIT thanks to @bruno desthuilliers, who reminded me of the correct way to go here:

Iterate over the file object (not 'readlines'):

def split_by_platform(FILE_NAME, out1, out2, out3):
    # Append mode ('a') keeps whatever the output files already contain
    with open(FILE_NAME, "r") as infile, \
            open(out1, 'a') as of1, \
            open(out2, 'a') as of2, \
            open(out3, 'a') as of3:
        # Iterating over the file object reads one line at a time
        for line in infile:
            if "X" in line:
                of1.write(line)
            elif "Y" in line:
                of2.write(line)
            elif "Z" in line:
                of3.write(line)

EDIT, following a hint from @dim: here is a more general approach for an arbitrary-length list of flag chars:

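def loop(infilename, flag_chars):
    with open(infilename, 'r') as infile:
        for line in infile:
            # No break here: a line containing several flag chars
            # is written to each matching file
            for c in flag_chars:
                if c in line:
                    # The output file is named after the flag char, e.g. 'X.txt'
                    with open(c+'.txt', 'a') as outfile:
                        outfile.write(line)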
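
The loop above reopens an output file for every matching line. A minimal sketch of a variant that opens each output file only once, using contextlib.ExitStack from the standard library, might look like this. The function name split_by_flags is made up for illustration; the matching rule and the '<char>.txt' naming are taken from the code above. One difference to be aware of: this version creates an (empty) file for every flag char, even one that never matches.

```python
from contextlib import ExitStack


def split_by_flags(infilename, flag_chars):
    """Write each line containing a flag char to '<char>.txt', opening each file once."""
    with ExitStack() as stack:
        infile = stack.enter_context(open(infilename, "r"))
        # One output handle per flag char; ExitStack closes them all on exit
        outfiles = {
            c: stack.enter_context(open(c + ".txt", "a")) for c in flag_chars
        }
        for line in infile:
            for c, outfile in outfiles.items():
                if c in line:
                    outfile.write(line)
```

It would be called the same way as loop, for example split_by_flags("file", ["X", "Y", "Z"]).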

This should do it:

with open('my_text_file.txt') as infile, open('x.txt', 'w') as x, open('y.txt', 'w') as y, open('z.txt', 'w') as z:
    for line in infile:
        if line.startswith('X'):
            x.write(line)
        elif line.startswith('Y'):
            y.write(line)
        elif line.startswith('Z'):
            z.write(line)
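
The question also asks about RegEx. For single-letter prefixes startswith is enough, but if a pattern is preferred, a rough sketch with the standard re module could look like the following. The function name split_by_prefix and the pattern are assumptions for illustration, not part of the answer above.

```python
import re

# Assumed pattern: the first character of the line is X, Y or Z
PREFIX = re.compile(r"[XYZ]")


def split_by_prefix(infilename):
    """Send each line to x.txt, y.txt or z.txt based on its first letter."""
    with open(infilename) as infile, \
            open("x.txt", "w") as x, \
            open("y.txt", "w") as y, \
            open("z.txt", "w") as z:
        targets = {"X": x, "Y": y, "Z": z}
        for line in infile:
            # match() only matches at the start of the line
            match = PREFIX.match(line)
            if match:
                targets[match.group()].write(line)
```

Lines that do not start with one of the three letters are skipped, as in the startswith version.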

Here is a more generic way to do the same job:

from collections import Counter

with open("file.txt", "r") as file:
    data = file.read().splitlines()

# Count how many times each distinct line occurs
counter = Counter(data)
# Rebuild one list per distinct line, e.g. [['X', 'X'], ['Y'], ['Z', 'Z', 'Z']]
array2d = [[key] * value for key, value in counter.items()]
print(array2d)
for el in array2d:
    # Each output file is named after the line content it holds
    with open(str(el[0]) + ".txt", "w") as f:
        for e in el:
            f.write(e + "\n")

The above code will generate X.txt, Y.txt and Z.txt with the corresponding values. If the input also contains, for example, several C lines, the code will generate a C.txt file as well.
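
The Counter version reads the whole file into memory first. As a sketch of the same idea for larger inputs, the lines can be streamed and each output file opened the first time its value appears. The function name split_by_line_value is made up for illustration, and the sketch assumes every non-empty line is a valid file name, as in the code above.

```python
def split_by_line_value(infilename):
    """Stream the input and write each distinct line value to '<value>.txt'."""
    outfiles = {}
    try:
        with open(infilename) as infile:
            for line in infile:
                key = line.strip()
                if not key:
                    continue  # skip blank lines rather than creating '.txt'
                if key not in outfiles:
                    # Open the output file lazily, the first time the value is seen
                    outfiles[key] = open(key + ".txt", "w")
                outfiles[key].write(key + "\n")
    finally:
        # Close every output file, even if reading fails part-way through
        for f in outfiles.values():
            f.close()
    return sorted(outfiles)
```

The return value is the sorted list of distinct values found, which is also the list of file names (without the .txt suffix) that were written.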


 