
Simple Filter Python script for Text

I am trying to create what must be a simple filter function which runs a regex against a text file and returns all words containing that particular regex.

So, for example, if I wanted to find all words that contain "abc" and I had the list abcde, bce, xyz and zyxabc, the script would return abcde and zyxabc.

I have a script below, but I am not sure whether it is just the regex I am getting wrong. It returns abc twice rather than the full words. Thanks.

import re

text = open("test.txt", "r")
regex = re.compile(r'(abc)')

for line in text:
    target = regex.findall(line)
    for word in target:
        print(word)

I don't think you need a regex for this task. You can simply split each line into a list of words, loop over the words, and use the in operator:

with open("test.txt") as f:
    for line in f:
        # split() breaks the line into whitespace-separated words
        for w in line.split():
            if 'abc' in w:
                print(w)
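The same idea can be wrapped in a small reusable function. This is only a sketch: the name words_containing and the in-memory sample list are made up for illustration, and any iterable of lines (such as an open file) should work in place of the list.

```python
def words_containing(lines, needle):
    """Return every whitespace-separated word in `lines` that contains `needle`."""
    return [w for line in lines for w in line.split() if needle in w]


# Hypothetical sample data standing in for the contents of test.txt
sample = ["abcde bce", "xyz zyxabc"]
print(words_containing(sample, "abc"))  # ['abcde', 'zyxabc']
```

To run it against the file, pass the file object itself, for example words_containing(open("test.txt"), "abc").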

Your approach is correct; however, you need to change your regex to r'.*abc.*', that is:

 regex = re.compile(r'.*abc.*')

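This matches every line that contains abc. The .* wildcards on either side consume the rest of the line, so findall returns the whole line rather than just abc; that is the full word only when test.txt holds one word per line.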

A small demo with that one line changed, and one word per line in test.txt, would print:

abcde
zyxabc
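If a line can hold several words, one possible variation (not part of the original answer) is to replace .* with \S*, which matches only non-whitespace characters, so each match stops at the edges of a word. The function name find_words and the sample string below are assumptions made for the sketch.

```python
import re

# \S* matches a run of non-whitespace characters, so a match cannot
# extend past the word that contains "abc". There is no capturing
# group, so findall returns the full match.
word_re = re.compile(r'\S*abc\S*')


def find_words(text):
    """Return all whitespace-delimited words in `text` that contain 'abc'."""
    return word_re.findall(text)


print(find_words("abcde bce xyz zyxabc"))  # ['abcde', 'zyxabc']
```

The original pattern r'(abc)' returned only abc because it matches nothing beyond those three letters; widening the pattern to cover the surrounding non-whitespace characters is what makes findall return the whole word.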

Note: as Kasra mentions, it is better to use the in operator in such cases.
