
How do you read a text file, then split that text file into multiple text files with Python?

I'm given a text file daily. I need to do a couple of things to the text file.

  1. I need to insert a line break every 181 characters.

  2. I need to read the text file line by line and send individual lines to new text files. These files can contain different data types per line, which is unusable for my system. For example, I might get a file tomorrow that has 250 lines of data, containing 6 different data types. The data types are determined by the first four letters of the line. I need to read each line, and if the line starts with ABC1, send it to text file "ABC1.txt". On later iterations, every line that starts with ABC1 needs to be appended to the same "ABC1.txt" file. If the line starts with "ABC2", send it to text file "ABC2.txt".

In the end I need to take original_file.txt and split it up into ABC1.txt, ABC2.txt, ABC3.txt, ABC4.txt.

I'm new to programming and I'm fiddling around with it. Currently I can open the file and read it, and I can print it to a new file. I haven't figured out how to pick out the lines I need, send those to a new text file, then repeat that for the other data types. I've done a lot of googling and watched a lot of videos, but none seem to do what I'm trying to do; they're all pretty generic.

I would also like to figure out how to turn this fixed-length document into a csv, but that would just be icing on the cake.

The first thing you are going to need to learn is handling files in Python. This link is a good start for beginners:

http://www.pythonforbeginners.com/files/reading-and-writing-files-in-python

A csv file is just a file with Comma Separated Values and should be easy enough to figure out once you have some basic Python file handling under your belt. Open a csv file in a text editor to see what I mean.
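
For the fixed-length-to-csv part, here is a minimal sketch using the standard csv module. The column widths in FIELD_WIDTHS and the sample records are made up for illustration, since the question does not describe the actual layout of the 181-character record; the helper names are hypothetical too.

```python
import csv
import io

# Hypothetical column widths; the real layout of the 181-character
# record is not given in the question, so substitute your own.
FIELD_WIDTHS = [4, 10, 6]

def split_fixed_width(line, widths):
    """Slice one fixed-width record into a list of stripped fields."""
    fields = []
    start = 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width
    return fields

def fixed_width_to_csv(lines, out_file, widths=FIELD_WIDTHS):
    """Write each fixed-width line as one CSV row."""
    writer = csv.writer(out_file)
    for line in lines:
        writer.writerow(split_fixed_width(line.rstrip("\n"), widths))

# Small in-memory demonstration with made-up records.
sample = ["ABC1John      000123\n", "ABC2Mary      000456\n"]
buffer = io.StringIO()
fixed_width_to_csv(sample, buffer)
print(buffer.getvalue())
```

When writing to a real file instead of a StringIO, the csv documentation recommends opening it with newline="", for example open("ABC1.csv", "w", newline="").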

Open the output files in append mode ("a") so each run adds to them, and use if/elif checks to see what each line starts with.

# Open every output file in append mode ("a") so repeated runs add to it.
with open("original_file.txt", "r") as infile, \
        open("ABC1.txt", "a") as ab1, open("ABC2.txt", "a") as ab2, \
        open("ABC3.txt", "a") as ab3, open("ABC4.txt", "a") as ab4:
    for line in infile:
        # Route each line by its four-character prefix.
        if line.startswith("ABC1"):
            ab1.write(line)
        elif line.startswith("ABC2"):
            ab2.write(line)
        elif line.startswith("ABC3"):
            ab3.write(line)
        elif line.startswith("ABC4"):
            ab4.write(line)
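
If the set of prefixes changes from day to day, the if/elif chain can be replaced by deriving the file name from the first four characters. This is only a sketch of that variation, not part of the answer above: split_by_prefix, out_dir and prefix_len are hypothetical names, and it assumes the prefix is safe to use as a file name.

```python
import os
from contextlib import ExitStack

def split_by_prefix(in_path, out_dir=".", prefix_len=4):
    """Append each line of in_path to '<prefix>.txt' inside out_dir.

    Returns the sorted list of prefixes that were seen.
    """
    outputs = {}  # prefix -> open file handle
    # ExitStack closes every output file when the with block ends.
    with ExitStack() as stack, open(in_path, "r") as infile:
        for line in infile:
            if len(line.rstrip("\n")) < prefix_len:
                continue  # skip blank lines and lines shorter than the prefix
            prefix = line[:prefix_len]
            if prefix not in outputs:
                out_path = os.path.join(out_dir, prefix + ".txt")
                # Append mode, matching the "a" used in the answer.
                outputs[prefix] = stack.enter_context(open(out_path, "a"))
            outputs[prefix].write(line)
    return sorted(outputs)
```

Calling split_by_prefix("original_file.txt") would then create or extend ABC1.txt, ABC2.txt and so on in the current directory, one file per prefix found.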

If you want to insert something every 181 chars, this should be close to what you want:

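def insert_break(s, n, br):
    # Yield s in pieces of n characters, each followed by br.
    while s:
        yield "{}{}".format(s[:n], br)
        s = s[n:]

# "w" rather than "a", so rerunning does not duplicate the contents of updated.txt.
with open("output.txt", "r") as f, open("updated.txt", "w") as f1:
    inserted = "".join(insert_break(f.read(), 181, "\n"))
    f1.write(inserted)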

with open("updated.txt", "r") as infile, \
        open("ABC1.txt", "a") as ab1, open("ABC2.txt", "a") as ab2, \
        open("ABC3.txt", "a") as ab3, open("ABC4.txt", "a") as ab4:
    for line in infile:
        # Debug output: shows whether the line matched the ABC1 prefix.
        print(line.startswith("ABC1"), line)
        if line.startswith("ABC1"):
            ab1.write(line)
        elif line.startswith("ABC2"):
            ab2.write(line)
        elif line.startswith("ABC3"):
            ab3.write(line)
        elif line.startswith("ABC4"):
            ab4.write(line)
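
For large files, the line breaks can also be added without loading the whole file at once, by reading 181 characters at a time. The sketch below is one possible way to do that, assuming the input has no line breaks of its own; iter_records and add_line_breaks are hypothetical names, and the demonstration uses a record size of 4 so the result is easy to see.

```python
import io

def iter_records(stream, size=181):
    """Yield consecutive chunks of `size` characters from a text stream."""
    # iter() with a sentinel keeps calling read() until it returns "".
    for chunk in iter(lambda: stream.read(size), ""):
        yield chunk

def add_line_breaks(src, dst, size=181):
    """Copy src to dst, writing a newline after every `size` characters."""
    for record in iter_records(src, size):
        dst.write(record + "\n")

# In-memory demonstration with a record size of 4 instead of 181.
src = io.StringIO("ABC1ABC2ABC3")
dst = io.StringIO()
add_line_breaks(src, dst, size=4)
print(dst.getvalue())
```

With real files, src and dst would be the objects returned by open("output.txt", "r") and open("updated.txt", "w").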

If you want to check whether a substring appears anywhere in your string, use if "your_sub_s" in line:
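
A short illustration of the difference between the two checks, using a made-up line:

```python
line = "ABC1 payment for XYZ9 received"

# startswith only looks at the beginning of the string.
starts_with_abc1 = line.startswith("ABC1")
starts_with_xyz9 = line.startswith("XYZ9")

# `in` matches the substring anywhere in the string.
contains_xyz9 = "XYZ9" in line

print(starts_with_abc1, starts_with_xyz9, contains_xyz9)  # True False True
```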
