I am trying to create what must be a simple filter function which runs a regex against a text file and returns all words containing that particular regex.
So, for example, if I wanted to find all words that contained "abc", and I had the list abcde, bce, xyz and zyxabc, the script would return abcde and zyxabc.
I have a script below; however, I am not sure whether it is just the regex I am getting wrong. It returns abc twice rather than the full words. Thanks.
import re
text = open("test.txt", "r")
regex = re.compile(r'(abc)')
for line in text:
    target = regex.findall(line)
    for word in target:
        print(word)
I think you don't need a regex for such a task. You can simply split your lines to create a list of words, then loop over the words and use the in operator:
with open("test.txt") as f:
    for line in f:
        for w in line.split():
            if 'abc' in w:
                print(w)
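As a minimal sketch, the same split-and-test idea can be wrapped in a reusable function. The name words_containing and the sample list are hypothetical, chosen only for illustration; any iterable of lines, including an open file, should work in place of the list.

```python
def words_containing(lines, needle):
    """Return every whitespace-separated word that contains `needle`."""
    matches = []
    for line in lines:
        # split() with no arguments also discards the trailing newline
        for word in line.split():
            if needle in word:
                matches.append(word)
    return matches


# Hypothetical sample data standing in for the lines of test.txt
sample = ["abcde\n", "bce\n", "xyz\n", "zyxabc\n"]
print(words_containing(sample, "abc"))  # ['abcde', 'zyxabc']
```

Returning a list instead of printing makes the filter easier to reuse and to test.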
Your methodology is correct; however, you can change your regex to r'.*abc.*', as in
regex = re.compile(r'.*abc.*')
This will match every line with abc in it. The wildcards .* match all the other characters on the line, so findall returns the whole line rather than just abc, which is the full word when the file has one word per line.
A small demo with that particular line changed would print:
abcde
zyxabc
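If a line can hold several words, .*abc.* would return the entire line. A hedged variation, assuming that "word" means a run of \w characters, is to surround the fragment with \w* instead. The function name find_words and the sample string are hypothetical.

```python
import re


def find_words(text, fragment):
    """Return every run of word characters that contains `fragment`."""
    # re.escape keeps any special characters in `fragment` literal
    pattern = re.compile(r'\w*' + re.escape(fragment) + r'\w*')
    return pattern.findall(text)


# Hypothetical sample with more than one word per line
print(find_words("abcde bce\nxyz zyxabc", "abc"))  # ['abcde', 'zyxabc']
```

Because the pattern has no capture group, findall returns the full matched text for each hit.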
Note: as Kasra mentions, it is better to use the in operator in such cases.