Find specific group of words in a string list

Let's say I have a list of strings, generated from a text:

string_lst = ['A', 'cat', 'is', 'a', 'cat', 'but', 'a', 'dog', 'is', 'not', 'a', 'cat']

(The condition is to keep the string list separated as is.)

Imagine that the strings have the following assignments:

A   | WORD
cat | ANIMAL
is  | WORD
a   | WORD
cat | ANIMAL
but | WORD
a   | WORD
dog | ANIMAL
is  | WORD
not | WORD
a   | WORD
cat | ANIMAL

I need to find only the strings that, in their natural sequence, follow the pattern ANIMAL + WORD + WORD + ANIMAL, that is, groups of four words in that exact order. Using the example above, the result would be: 'cat is a cat', 'cat but a dog'

Any idea?

You just need a list of all animals:

string_lst = ['A', 'cat', 'is', 'a', 'cat', 'but', 'a', 'dog', 'is', 'not', 'a', 'cat']

animals = ['dog', 'cat']
result = []
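# Slide a 4-item window over the list. The last valid start index is
# len(string_lst) - 4, so the range has to stop at len(string_lst) - 3.
for i in range(len(string_lst) - 3):
    # Pattern: ANIMAL + WORD + WORD + ANIMAL
    if (string_lst[i] in animals and string_lst[i + 1] not in animals
            and string_lst[i + 2] not in animals and string_lst[i + 3] in animals):
        result.append(string_lst[i:i + 4])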

print(result)
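
If the pattern might change later, the same sliding-window idea can be written more generally. This is only a sketch under some assumptions: the function name find_label_sequences, the labels dictionary and the pattern tuple are made up for illustration, and any string missing from labels is assumed to be a WORD. It joins each match into a single string to match the output format asked for in the question.

```python
def find_label_sequences(tokens, labels, pattern):
    """Return every run of tokens whose labels match `pattern` exactly.

    `labels` maps a token to its label; tokens missing from the mapping
    are assumed to be plain WORDs (an assumption of this sketch).
    """
    n = len(pattern)
    matches = []
    # The last window starts at len(tokens) - n, hence the "+ 1".
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        if [labels.get(tok, "WORD") for tok in window] == list(pattern):
            matches.append(" ".join(window))
    return matches


string_lst = ['A', 'cat', 'is', 'a', 'cat', 'but', 'a', 'dog', 'is', 'not', 'a', 'cat']
labels = {'cat': 'ANIMAL', 'dog': 'ANIMAL'}  # hypothetical label table
pattern = ('ANIMAL', 'WORD', 'WORD', 'ANIMAL')

found = find_label_sequences(string_lst, labels, pattern)
print(found)  # ['cat is a cat', 'cat but a dog']
```

Overlapping matches are kept on purpose: the second 'cat' closes the first group and opens the second one, which is what the expected result in the question requires.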
