Python: extract values between two strings in text file
I have a dialogue text file like this:
Mom:
Hi
Dad:
Hi
Mom:
Bye
Dad:
Bye
Dad:
:)
I have to copy each speaker's lines to their own text file (mom.txt and dad.txt). This works, but it breaks when the same speaker has two or more turns in a row.
def sort(path):
    inFile = open(path, 'r')
    inFile1 = open(path, 'r')
    copy = False
    outFile = open('mom.txt', 'w')
    outFile1 = open('dad.txt', 'w')
    keepCurrentSet = False
    for line in inFile:
        if line.startswith("Dad:"):
            keepCurrentSet = False
        if keepCurrentSet:
            outFile.write(line)
        if line.startswith("Mom:"):
            keepCurrentSet = True
    for line1 in inFile1:
        if line1.startswith("Mom:"):
            keepCurrentSet = False
        if keepCurrentSet:
            outFile1.write(line1)
        if line1.startswith("Dad:"):
            keepCurrentSet = True
    outFile.close()
    outFile1.close()
    inFile1.close()
The outFile1 outcome looks like this:
Hi
Bye
Dad:
:)
And it should look like:
Hi
Bye
:)
Any ideas or easier ways to do this? Thanks.
Here is one way you can write mom.txt and dad.txt in a single loop:
def sort(path):
    inFile = open(path, 'r')
    outFile = open('mom.txt', 'w')
    outFile1 = open('dad.txt', 'w')
    keepCurrentSetDad = False
    keepCurrentSetMom = False
    for line in inFile:
        # A speaker tag switches the active output file and is not copied itself.
        # startswith avoids treating a spoken line that mentions "Dad" as a tag.
        if line.startswith('Dad:'):
            keepCurrentSetDad = True
            keepCurrentSetMom = False
            continue
        elif line.startswith('Mom:'):
            keepCurrentSetMom = True
            keepCurrentSetDad = False
            continue
        # Any other line belongs to whoever spoke last.
        if keepCurrentSetDad:
            outFile1.write(line)
        elif keepCurrentSetMom:
            outFile.write(line)
    inFile.close()
    outFile.close()
    outFile1.close()
I have merely edited your code. Please check your txt file. In what you have shown here, the speaker is on one line and the speaker's words are on the next line. I have stuck to that format.
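To check the format assumption without touching any files, the same "remember the last speaker" idea can be tried on a list of lines in memory. This is only a sketch: the function name split_by_speaker and the speakers parameter are made up for illustration and are not part of the code above.

```python
def split_by_speaker(lines, speakers=("Mom", "Dad")):
    """Group dialogue lines under the most recent speaker tag.

    Assumes a tag is a line consisting only of a known name plus a colon,
    e.g. "Mom:", and that the spoken lines follow it.
    """
    result = {name: [] for name in speakers}
    current = None  # no speaker seen yet
    for line in lines:
        tag = line.strip()
        if tag.endswith(":") and tag[:-1] in result:
            current = tag[:-1]            # switch speaker, do not copy the tag
        elif current is not None:
            result[current].append(line)  # line belongs to the last speaker
    return result


sample = ["Mom:\n", "Hi\n", "Dad:\n", "Hi\n", "Mom:\n", "Bye\n",
          "Dad:\n", "Bye\n", "Dad:\n", ":)\n"]
grouped = split_by_speaker(sample)
```

With the sample from the question, grouped["Dad"] holds the three lines Hi, Bye and :) even though Dad speaks twice in a row.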
You can use:
def sort(path):
    with open(path) as f,\
         open('mom.txt', 'w') as mom,\
         open('dad.txt', 'w') as dad:
        curr = None  # keeps track of the current speaker
        for line in f:
            if 'Mom:' in line:
                curr = 'Mom'  # set the current speaker to Mom
            elif 'Dad:' in line:
                curr = 'Dad'  # set the current speaker to Dad
            else:
                if curr == 'Mom':
                    mom.write(line)
                elif curr == 'Dad':
                    dad.write(line)
The resulting mom.txt and dad.txt files should look like:
# mom.txt
Hi
Bye
# dad.txt
Hi
Bye
:)
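If more speakers turn up later, the if/elif chain can be swapped for a dictionary that maps each speaker to an open file. The following is a rough sketch rather than a drop-in replacement: the name sort_dialogue, the speakers and out_dir parameters, and the lowercase file naming are all assumptions chosen for illustration.

```python
from contextlib import ExitStack
from pathlib import Path


def sort_dialogue(path, speakers=("Mom", "Dad"), out_dir="."):
    """Copy each speaker's lines to <out_dir>/<speaker in lowercase>.txt."""
    out_dir = Path(out_dir)
    with ExitStack() as stack:
        # ExitStack closes every file registered here, even on errors.
        src = stack.enter_context(open(path))
        outputs = {
            name: stack.enter_context(open(out_dir / f"{name.lower()}.txt", "w"))
            for name in speakers
        }
        current = None  # output file of the last speaker seen
        for line in src:
            tag = line.strip()
            if tag.endswith(":") and tag[:-1] in outputs:
                current = outputs[tag[:-1]]   # speaker tag: switch target file
            elif current is not None:
                current.write(line)           # spoken line: copy it
```

Calling sort_dialogue('dialogue.txt') (file name assumed) should give the same mom.txt and dad.txt as above, and adding a third name to speakers needs no new branch.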
I've got an even shorter answer where only one condition has to be checked inside the loop. Depending on your language version, you can choose one of the two:
Python 3.7+
def sort(path):
    with open(path, 'r') as inFile, open('mom.txt', 'w+') as momFile, open('dad.txt', 'w+') as dadFile:
        line = inFile.readline()
        while line != '':
            if line.startswith('Mom:'):
                momFile.write(inFile.readline())
            elif line.startswith('Dad:'):
                dadFile.write(inFile.readline())
            line = inFile.readline()
Python 3.8+ (notice the walrus operator :=)
def sort(path):
    with open(path, 'r') as inFile, open('mom.txt', 'w+') as momFile, open('dad.txt', 'w+') as dadFile:
        while (line := inFile.readline()) != '':
            if line.startswith('Mom:'):
                momFile.write(inFile.readline())
            elif line.startswith('Dad:'):
                dadFile.write(inFile.readline())
Output:
mom.txt:
Hi
Bye
dad.txt:
Hi
Bye
:)
Let me know if you spot a mistake or a possible improvement.
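One possible improvement: both versions copy only the single line that directly follows a speaker tag, which matches the sample file but would drop any further lines of a longer turn. A minimal sketch that keeps the walrus style and copies every line up to the next tag might look like this; the name sort_multiline and the mom_path and dad_path parameters are hypothetical, and it needs Python 3.8+.

```python
def sort_multiline(path, mom_path="mom.txt", dad_path="dad.txt"):
    """Copy all lines of each turn, not just the first one after a tag."""
    with open(path) as in_file, \
         open(mom_path, "w") as mom_file, \
         open(dad_path, "w") as dad_file:
        targets = {"Mom:": mom_file, "Dad:": dad_file}
        current = None  # file of the speaker whose turn it is
        while (line := in_file.readline()) != "":
            tag = line.strip()
            if tag in targets:
                current = targets[tag]   # new turn: remember where to write
            elif current is not None:
                current.write(line)      # still the same speaker's turn
```

The cost is one extra variable for the current speaker, so it is no longer a single condition per iteration, but a turn such as "Hi" followed by "How are you?" ends up complete in mom.txt.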