Grep a range of words from a text file in Python
I have a text file, and my goal is to generate an output file containing all the words that lie between two specific words.
For example, if I have this text:
askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj.
And I want to obtain all the words from "my" through "Alex".
Output:
my name is Alex
I have a rough idea, but I don't know how to select the range:
if 'my' in open(out).read():
    with open('results.txt', 'w') as f:
        if 'Title' in open(out).read():
            f.write('*')
        break
I want an output file containing the sentence "my name is Alex".
You can use a regex here:
>>> import re
>>> s = "askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj."
>>> re.search(r'my.*Alex', s).group()
'my name is Alex'
If the string contains more than one Alex after my and you want only the shortest match, use the non-greedy .*? instead:
With ?:
>>> s = "my name is Alex and you're Alex too."
>>> re.search(r'my.*?Alex', s).group()
'my name is Alex'
Without ?:
>>> re.search(r'my.*Alex', s).group()
"my name is Alex and you're Alex"
Code:
import re

with open('infile') as f1, open('outfile', 'w') as f2:
    data = f1.read()
    # re.DOTALL lets . match newlines, so the range may span several lines
    match = re.search(r'my.*Alex', data, re.DOTALL)
    if match:
        f2.write(match.group())
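Two caveats with the snippet above: it writes only the first (greedy) match, and the pattern would also match my inside a longer word such as "Amy". The following is a minimal sketch, assuming you want whole-word matches and every matching range in the file. The function name write_ranges and its parameters are hypothetical, not part of the original answer.

```python
import re

def write_ranges(in_path, out_path, start="my", end="Alex"):
    """Write every shortest start...end range found in in_path to out_path.

    Returns the list of ranges that were written.
    """
    # re.escape guards against regex metacharacters in the two words.
    # \b limits matches to whole words (assumes both words begin and end
    # with a letter, digit, or underscore).
    # .*? is non-greedy, so each range stops at the nearest `end`.
    pattern = re.compile(
        r'\b' + re.escape(start) + r'\b.*?\b' + re.escape(end) + r'\b',
        re.DOTALL,
    )
    with open(in_path) as f1, open(out_path, 'w') as f2:
        ranges = [m.group() for m in pattern.finditer(f1.read())]
        # One range per line; a range that itself spans a newline
        # (possible because of re.DOTALL) keeps that newline.
        f2.write('\n'.join(ranges))
    return ranges
```

Calling write_ranges('infile', 'outfile') would then write each "my ... Alex" range it finds, and the returned list is empty when nothing matched.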
You can use the regular expression my.*Alex:
import re

data = "askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj"
print(re.search("my.*Alex", data).group())
Output:
my name is Alex
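Since both words are fixed strings, the same result can also be had without a regular expression. This is a sketch using str.find, shown as an alternative rather than as what either answer proposed. The helper name between_words is hypothetical. Like the plain my.*Alex pattern, it does not check word boundaries.

```python
def between_words(text, start, end):
    """Return the text from the first `start` through the next `end`, or None."""
    i = text.find(start)
    if i == -1:
        return None  # start word not present
    # Search for the end word only after the start word
    j = text.find(end, i + len(start))
    if j == -1:
        return None  # no end word after the start word
    return text[i:j + len(end)]

data = "askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj"
print(between_words(data, "my", "Alex"))  # my name is Alex
```

Because the search for the end word stops at its first occurrence, this behaves like the non-greedy .*? form of the pattern.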