Python regexes: return a list of words containing a given substring

Question

What would be a regex-based function f that, given an input text and a string, returns all the words in the text that contain this string? For example:

f("This is just a simple text to test some basic things", "si")

would return:

["simple", "basic"]

(because these two words contain the substring "si")

How can I do that?

Answer 1

I wouldn't use a regex for this. I would do something like:

def f(string, match):
    string_list = string.split()
    match_list = []
    for word in string_list:
        if match in word:
            match_list.append(word)
    return match_list

print(f("This is just a simple text to test some basic things", "si"))
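
The same loop can also be condensed into a list comprehension. This is only a minimal sketch of an equivalent form; the name words_containing is made up for illustration and does not come from the answer above.

```python
def words_containing(text, sub):
    # Split on any run of whitespace and keep the words that contain sub.
    return [word for word in text.split() if sub in word]

print(words_containing("This is just a simple text to test some basic things", "si"))
# prints ['simple', 'basic']
```

Because the text is split on whitespace only, punctuation stays attached to a word: "basic," would be returned with its comma.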

Answer 2

I'm not convinced there isn't a better way to do this than my approach, but something like:

import re

def f(s, pat):
    # Not thrilled about this line; re.escape keeps regex metacharacters in pat literal
    pat = r'(\w*%s\w*)' % re.escape(pat)
    return re.findall(pat, s)


print(f("This is just a simple text to test some basic things", "si"))

Works:

['simple', 'basic']
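
Since the substring is interpolated into the pattern, any regex metacharacter in it keeps its special meaning unless it is escaped. The sketch below contrasts the two cases; f_raw and f_escaped are hypothetical names used only for this comparison, and the sample text is made up.

```python
import re

def f_raw(s, sub):
    # No escaping: metacharacters in sub keep their regex meaning.
    pat = r'\w*%s\w*' % sub
    return re.findall(pat, s)

def f_escaped(s, sub):
    # re.escape makes every character of sub match literally.
    pat = r'\w*%s\w*' % re.escape(sub)
    return re.findall(pat, s)

text = "abc a.c"
print(f_raw(text, "a.c"))      # prints ['abc', 'a.c'] because '.' matches any character
print(f_escaped(text, "a.c"))  # prints ['a.c']
```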

Answer 3

Here is my attempt at a solution. I split the input string on " " and then search each individual word for the pattern. If a match is found, the word is added to the result list.

import re

def f(text, pat):
    matches = []
    str_list = text.split(' ')
    # Escape the pattern (not the word) so it is matched as a literal substring
    regex = re.escape(pat)

    for word in str_list:
        match = re.search(regex, word)
        if match:
            matches.append(word)
    return matches

print(f("This is just a simple text to test some basic things", "si"))

Answer 4

import re

def func(s, pat):
    pat = r'\b\S*%s\S*\b' % re.escape(pat)
    return re.findall(pat, s)


print(func("This is just a simple text to test some basic things", "si"))

This is what you need. \b anchors the match at word boundaries, and \S matches any non-whitespace character, so a match never spans a space.
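
The \S* pattern and the \w* pattern from the earlier answer behave differently once a word contains a hyphen or is followed by punctuation. The following is a small sketch of that difference; the helper names and the sample sentence are assumptions made for illustration.

```python
import re

def find_w(s, sub):
    # \w* stops at anything that is not a letter, digit, or underscore.
    return re.findall(r'\w*%s\w*' % re.escape(sub), s)

def find_S(s, sub):
    # \S* runs across the hyphen; the trailing \b drops the final comma.
    return re.findall(r'\b\S*%s\S*\b' % re.escape(sub), s)

text = "A well-designed, simple text"
print(find_w(text, "si"))  # prints ['designed', 'simple']
print(find_S(text, "si"))  # prints ['well-designed', 'simple']
```

Which result is preferable depends on whether a hyphenated compound should count as one word.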

4 answers

solution1
5 2015-03-17 02:13:29

solution2
4 ACCEPTED 2015-03-17 02:18:49

solution3
0 2015-03-17 02:37:10

solution4
0 2015-03-17 04:41:28
