
Extract lines from a txt file in Python

Imagine I have this text in a txt file:

bla bla bla
bla bla bla
Title Lorem ipsum dolor sit amet, consectetur adipiscing
elit, sed do eiusmod tempor incididunt ut labore et dolore
magna aliqua. Ut enim ad minim veniam,
condition
bla bla bla
bla bla bla
Title Sed ut perspiciatis unde omnis iste natus error sit voluptatem
accusantium doloremque laudantium, totam rem aperiam,
eaque ipsa quae ab illo inventore veritatis
condition
bla bla bla

From a text with the structure above (hundreds of lines), I want to extract each run of lines that begins with a line starting with 'Title' and ends just before the line starting with the word 'condition'. So the result would be something like this:

Title Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,

Title Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis

I can select the first line of each title with this code, but I don't know how to add the following lines until I find the word 'condition'. Could you help me, please?

import re

outF = open("myOutFile.txt", "w")
hand = open('doubt.txt', encoding="utf8")
for line in hand:
    line = line.rstrip()
    # Only the line that starts with "Title" is matched; continuation lines are skipped
    if re.search('^Title', line):
        outF.write(line + "\n")
        outF.write("\n")
hand.close()
outF.close()

If you want every title line up to the first condition line, break out of the loop when you reach it:

for line in hand:
    line = line.rstrip()
    if line.startswith("Title"):
        # rstrip() removed the newline, so add it back when writing
        outF.write(line + "\n")
    if line.startswith("condition"):
        break  # stop at the first condition line

outF.close()
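The same idea can be tried without any files. This is only a sketch: the function name `titles_before_first_condition` and the in-memory `sample` list are made up for illustration and are not part of the code above.

```python
def titles_before_first_condition(lines):
    """Collect lines starting with 'Title' that occur before the first 'condition' line."""
    titles = []
    for line in lines:
        line = line.rstrip()
        if line.startswith("condition"):
            break  # stop at the first condition line
        if line.startswith("Title"):
            titles.append(line)
    return titles


# Hypothetical sample data standing in for doubt.txt
sample = [
    "bla bla bla\n",
    "Title Lorem ipsum dolor sit amet, consectetur adipiscing\n",
    "elit, sed do eiusmod tempor incididunt ut labore et dolore\n",
    "condition\n",
    "Title Sed ut perspiciatis unde omnis iste natus error sit voluptatem\n",
    "condition\n",
]

result = titles_before_first_condition(sample)
print(result)
```

On this sample only the first line of the first title is returned, so this variant fits only when each title occupies a single line and nothing after the first 'condition' is needed.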

If you want to write every line from a title up to the next condition line:

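write = False    # True while we are between a Title line and the next condition line
writelines = []

for line in hand:
    line = line.rstrip()

    if line.startswith("condition"):
        if write:
            writelines.append("\n")  # end the current block
        write = False

    if line.startswith("Title"):
        write = True

    if write:
        # Join the lines of one block with spaces
        writelines.append(line + " ")

outF.writelines(writelines)
outF.close()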
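A self-contained sketch of this second variant, written as a generator so it can be run on a string as well as on an open file. The name `extract_title_blocks` is hypothetical, and two choices are assumptions rather than something the question specifies: a block that never reaches a 'condition' line is dropped, and a second 'Title' inside an open block starts a new block.

```python
def extract_title_blocks(lines):
    """Yield each run from a 'Title' line up to (not including) the next
    'condition' line, joined into a single string."""
    block = None  # None means we are currently outside a block
    for line in lines:
        line = line.strip()
        if line.startswith("Title"):
            block = [line]  # assumption: a new Title restarts the block
        elif line.startswith("condition"):
            if block is not None:
                yield " ".join(block)
            block = None
        elif block is not None:
            block.append(line)
    # assumption: a trailing block without a closing 'condition' is dropped


sample = """\
bla bla bla
Title Lorem ipsum dolor sit amet, consectetur adipiscing
elit, sed do eiusmod tempor incididunt ut labore et dolore
magna aliqua. Ut enim ad minim veniam,
condition
bla bla bla
Title Sed ut perspiciatis unde omnis iste natus error sit voluptatem
accusantium doloremque laudantium, totam rem aperiam,
eaque ipsa quae ab illo inventore veritatis
condition
bla bla bla
""".splitlines()

blocks = list(extract_title_blocks(sample))
print("\n\n".join(blocks))
```

Because a file object is itself an iterable of lines, the same function should work on the real files by opening `doubt.txt` and `myOutFile.txt` in a `with` statement and writing `"\n\n".join(extract_title_blocks(hand))` to the output file.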
