
Python: extract values between two strings in text file

I have a dialogue text file like this:

Mom:
Hi
Dad:
Hi
Mom:
Bye
Dad:
Bye
Dad:
:)

I have to copy each speaker's lines to their own text file (mom.txt and dad.txt). This works, but the problem appears when the same speaker has two or more entries in a row.

def sort(path):
    inFile= open(path, 'r')
    inFile1= open(path, 'r')
    copy = False
    outFile = open('mom.txt', 'w')
    outFile1 = open('dad.txt', 'w')
    keepCurrentSet = False
    for line in inFile:
        if line.startswith("Dad:"):
            keepCurrentSet = False

        if keepCurrentSet:
            outFile.write(line)

        if line.startswith("Mom:"):
            keepCurrentSet = True

    for line1 in inFile1:
        if line1.startswith("Mom:"):
            keepCurrentSet = False

        if keepCurrentSet:
            outFile1.write(line1)

        if line1.startswith("Dad:"):
            keepCurrentSet = True


    outFile.close()        
    outFile1.close()
    inFile1.close()

The outFile1 output looks like this:

Hi
Bye
Dad:
:)

And it should look like this:

Hi
Bye
:)

Any ideas or easier ways to do this? Thanks.

Here is one way you can write mom.txt and dad.txt in a single loop:

def sort(path):
    inFile = open(path, 'r')
    outFile = open('mom.txt', 'w')
    outFile1 = open('dad.txt', 'w')
    keepCurrentSetDad = False
    keepCurrentSetMom = False
    for line in inFile:
        # A speaker label only switches the active output file; it is never copied.
        if line.strip().startswith('Dad:'):
            keepCurrentSetDad = True
            keepCurrentSetMom = False
            continue
        elif line.strip().startswith('Mom:'):
            keepCurrentSetMom = True
            keepCurrentSetDad = False
            continue
        if keepCurrentSetDad:
            outFile1.write(line)
        elif keepCurrentSetMom:
            outFile.write(line)
    inFile.close()
    outFile.close()
    outFile1.close()

I have merely edited your code. Please check your txt file. In what you have shown here, the speaker label is on one line and the speaker's words are on the next line. I have stuck to that format.
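To see why the version in the question leaked the Dad: label, it may help to replay its second loop on an in-memory list. This is only a sketch: SAMPLE, dad_lines_original and dad_lines_fixed are names made up for illustration, and the list assumes the labels start at column 0 as the startswith checks in the question imply.

```python
# In-memory copy of the dialogue from the question (assumed: labels start at column 0).
SAMPLE = ["Mom:\n", "Hi\n", "Dad:\n", "Hi\n", "Mom:\n", "Bye\n",
          "Dad:\n", "Bye\n", "Dad:\n", ":)\n"]


def dad_lines_original(lines):
    """Mirror of the question's second loop: only 'Mom:' turns copying off."""
    keep = False
    out = []
    for line in lines:
        if line.startswith("Mom:"):
            keep = False
        if keep:
            # A second 'Dad:' label arrives while keep is still True, so it is copied.
            out.append(line)
        if line.startswith("Dad:"):
            keep = True
    return out


def dad_lines_fixed(lines):
    """Labels are handled first and skipped, so they never reach the output."""
    keep = False
    out = []
    for line in lines:
        if line.startswith("Dad:"):
            keep = True
            continue
        if line.startswith("Mom:"):
            keep = False
            continue
        if keep:
            out.append(line)
    return out


print(dad_lines_original(SAMPLE))  # ['Hi\n', 'Bye\n', 'Dad:\n', ':)\n']
print(dad_lines_fixed(SAMPLE))     # ['Hi\n', 'Bye\n', ':)\n']
```

The only difference is the order of the checks: once a label is recognised and skipped before the write, a repeated label can no longer slip through.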

You can use:

def sort(path):
    with open(path) as f,\
            open('mom.txt', 'w') as mom,\
            open('dad.txt', 'w') as dad:
        curr = None # keeps track of the current speaker
        for line in f:
            if 'Mom:' in line:
                curr = 'Mom' # set the current speaker to Mom
            elif 'Dad:' in line:
                curr = 'Dad' # set the current speaker to Dad
            else:
                if curr == 'Mom':
                    mom.write(line)
                elif curr == 'Dad':
                    dad.write(line)

The resulting mom.txt and dad.txt files should look like:

# mom.txt
Hi
Bye

# dad.txt
Hi
Bye
:)
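The same "current speaker" idea should extend to any number of speakers by keeping one list per name. The following is a rough sketch, not part of the answer above: split_by_speaker and write_speaker_files are hypothetical helpers, and it assumes a label line contains nothing but the name followed by a colon (surrounding whitespace is ignored).

```python
def split_by_speaker(lines, speakers=("Mom", "Dad")):
    """Group spoken lines under the most recently seen speaker label."""
    # Assumption: a label line is exactly "<name>:" once whitespace is stripped.
    labels = {f"{name}:": name for name in speakers}
    result = {name: [] for name in speakers}
    current = None  # no speaker seen yet
    for line in lines:
        name = labels.get(line.strip())
        if name is not None:
            current = name                # label line: switch speaker, copy nothing
        elif current is not None:
            result[current].append(line)  # spoken line: goes to the active speaker
    return result


def write_speaker_files(path, speakers=("Mom", "Dad")):
    """Write one lower-cased <name>.txt per speaker (file naming is an assumption)."""
    with open(path) as f:
        grouped = split_by_speaker(f, speakers)
    for name, spoken in grouped.items():
        with open(f"{name.lower()}.txt", "w") as out:
            out.writelines(spoken)
```

Calling write_speaker_files('dialogue.txt') (the file name is just an example) would then produce mom.txt and dad.txt, and adding a third speaker only means extending the speakers tuple.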

I have an even shorter answer, where only one condition has to be checked inside the loop. Depending on your language version, you can choose one of the two:

Python 3.7+

def sort(path):
    with open(path, 'r') as inFile, open('mom.txt', 'w+') as momFile, open('dad.txt', 'w+') as dadFile:
        line = inFile.readline()
        while line != '':
            if line.startswith('Mom:'):
                momFile.write(inFile.readline())
            elif line.startswith('Dad:'):
                dadFile.write(inFile.readline())
            line = inFile.readline()

Python 3.8+ (notice the walrus operator := )

def sort(path):
    with open(path, 'r') as inFile, open('mom.txt', 'w+') as momFile, open('dad.txt', 'w+') as dadFile:
        while (line := inFile.readline()) != '':
            if line.startswith('Mom:'):
                momFile.write(inFile.readline())
            elif line.startswith('Dad:'):
                dadFile.write(inFile.readline())

Output:

mom.txt:
Hi
Bye

dad.txt:
Hi
Bye
:)

Let me know if you spot a mistake or a possible improvement.
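One thing that may be worth checking with the readline-based versions: they copy exactly one line after each label, so a speaker who says several lines in a row would lose everything after the first. The sketch below replays the same loop on io.StringIO objects instead of real files; sort_streams is a hypothetical adaptation written for this demonstration, and the two-line Mom block is invented input.

```python
import io


def sort_streams(in_file, mom_file, dad_file):
    """Same loop as the readline version above, but on already-open file objects."""
    line = in_file.readline()
    while line != '':
        if line.startswith('Mom:'):
            mom_file.write(in_file.readline())  # copies only the next line
        elif line.startswith('Dad:'):
            dad_file.write(in_file.readline())
        line = in_file.readline()


# Invented input: Mom speaks two lines in a row.
source = io.StringIO("Mom:\nHi\nHow are you?\nDad:\nHi\n")
mom, dad = io.StringIO(), io.StringIO()
sort_streams(source, mom, dad)

print(repr(mom.getvalue()))  # 'Hi\n'  ("How are you?" is dropped)
print(repr(dad.getvalue()))  # 'Hi\n'
```

For the file in the question this makes no difference, since every label is followed by a single line; the state-tracking versions earlier in the thread are the safer choice if that ever changes.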

