
Matching String in a List of Strings

I basically want to create a new list 'T' containing each element of the list 'Z' in which every element of the list 'Word' appears as a separate word. That is, in the following case I want the output to be T = ['Hi x'].

Word = ['x']
Z = ['Hi xo xo','Hi x','yoyo','yox']

I tried the following code, but it gives me every sentence containing a word with 'x' in it, whereas I only want the sentences that have 'x' as a separate word.

for i in Z:
    for v in i:
        if v in Word:
            print (i)

Just another Pythonic way:

[phrase for phrase in Z for w in Word if w in phrase.split()]
['Hi x']

You can do it with a list comprehension.

>>> [i for i in Z if any(w.lower() == j.lower() for j in i.split() for w in Word)]
['Hi x']
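
The case-insensitive comparison above could also be written by normalizing the search words once up front. This is only a sketch: the function name match_ignoring_case is made up for illustration, and str.casefold() is swapped in for lower() as a slightly more aggressive normalization.

```python
def match_ignoring_case(phrases, words):
    """Return phrases that contain any of the words as a whole token, ignoring case."""
    # Normalize the search words once instead of on every comparison.
    wanted = {w.casefold() for w in words}
    return [p for p in phrases
            if any(tok.casefold() in wanted for tok in p.split())]

result = match_ignoring_case(['Hi xo xo', 'Hi X', 'yoyo', 'yox'], ['x'])
print(result)  # ['Hi X']
```

Because any() is evaluated once per phrase, each phrase appears at most once in the result, even if several of its words match.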

Edit:

Or you can do:

>>> [i for i in Z for w in Word if w.lower() in map(lambda x: x.lower(), i.split())]
['Hi x']

words = ['x']
phrases = ['Hi xo xo','Hi x','yoyo','yox']
for phrase in phrases:
    for word in words:
        if word in phrase.split():
            print(phrase)

If you want to print all strings from Z that contain a word from Word:

Word = ['xo']
Z = ['Hi xo xo','Hi x','yoyo','yox']

res = []
for i in Z:
    for v in i.split():
        if v in Word:
            res.append(i)
            break
print(res)

Notice the break. Without it, a string from Z could be added twice if two of its words match, like the xo in the example.
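
The same "at most once per string" behaviour can be sketched with any(), which stops at the first matching word just like the break does. The function name phrases_containing_any is hypothetical and used only for this illustration.

```python
def phrases_containing_any(phrases, words):
    """Return each phrase at most once if any of its whitespace-separated tokens is in words."""
    wanted = set(words)  # set membership is cheaper than scanning a list
    # any() short-circuits on the first match, so a phrase is never added twice.
    return [p for p in phrases if any(tok in wanted for tok in p.split())]

Z = ['Hi xo xo', 'Hi x', 'yoyo', 'yox']
print(phrases_containing_any(Z, ['xo']))  # ['Hi xo xo']
print(phrases_containing_any(Z, ['x']))   # ['Hi x']
```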

The i.split() expression splits i into words on whitespace.
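
A small illustration of why splitting matters here (the sample string is made up): with no argument, split() treats any run of whitespace as one separator, and membership in the resulting list is a whole-word test, whereas membership in the string itself is a substring test.

```python
text = "Hi   x\tyox\n"

# No argument: runs of spaces, tabs and newlines all count as one separator.
tokens = text.split()
print(tokens)            # ['Hi', 'x', 'yox']

# An explicit ' ' separator keeps empty strings and does not split on tabs.
print(text.split(' '))   # ['Hi', '', '', 'x\tyox\n']

# Substring test versus whole-word test.
print('x' in 'yox')          # True
print('x' in 'yox'.split())  # False
```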

If you store Word as a set instead of a list, you can use set operations for the check. The following splits every string on whitespace, builds a set from the words, and checks whether Word is a subset of it.

>>> Z = ['Hi xo xo','Hi x','yoyo','yox']
>>> Word = {'x'}
>>> [s for s in Z if Word <= set(s.split())]
['Hi x']
>>> Word = {'Hi', 'x'}
>>> [s for s in Z if Word <= set(s.split())]
['Hi x']

Above, <= is the same as set.issubset.
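
As a rough sketch of the difference between the two possible readings: issubset keeps a string only if all the words occur in it, while isdisjoint can be used to keep a string if at least one word occurs. Both methods accept any iterable, so the result of split() can be passed directly. The variable names all_words and any_word are chosen only for this example.

```python
Z = ['Hi xo xo', 'Hi x', 'yoyo', 'yox']
Word = {'Hi', 'x'}

# Every word in Word must appear as a separate word in the string.
all_words = [s for s in Z if Word.issubset(s.split())]

# At least one word in Word must appear as a separate word in the string.
any_word = [s for s in Z if not Word.isdisjoint(s.split())]

print(all_words)  # ['Hi x']
print(any_word)   # ['Hi xo xo', 'Hi x']
```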
