Let's say I have a list like this:
[['she', 'is', 'a', 'student'],
['she', 'is', 'a', 'lawer'],
['she', 'is', 'a', 'great', 'student'],
['i', 'am', 'a', 'teacher'],
['she', 'is', 'a', 'very', 'very', 'exceptionally', 'good', 'student']]
Now I have a list like this:
['she', 'is', 'student']
I want to query the larger list with this one, and return all the lists that contain the words within the query list in the same order. There might be gaps, but the order should be the same. How can I do that? I tried using the in
operator but I don't get the desired output.
If all that you care about is that the words appear in order somehwere in the array, you can use a collections.deque
and popleft
to iterate through the list, and if the deque
is emptied, you have found a valid match:
from collections import deque
def find_gappy(arr, m):
dq = deque(m)
for word in arr:
if word == dq[0]:
dq.popleft()
if not dq:
return True
return False
By comparing each word
in arr
with the first element of dq
, we know that when we find a match, it has been found in the correct order, and then we popleft
, so we now are comparing with the next element in the deque
.
To filter your initial list, you can use a simple list comprehension that filters based on the result of find_gappy
:
matches = ['she', 'is', 'student']
x = [i for i in x if find_gappy(i, matches)]
# [['she', 'is', 'a', 'student'], ['she', 'is', 'a', 'great', 'student'], ['she', 'is', 'a', 'very', 'very', 'exceptionally', 'good', 'student']]
You can compare two lists, with a function like this one. The way it works is it loops through your shorter list, and every time it finds the next word in the long list, cuts off the first part of the longer list at that point. If it can't find the word it returns false.
def is_sub_sequence(long_list, short_list):
for word in short_list:
if word in long_list:
i = long_list.index(word)
long_list = long_list[i+1:]
else:
return False
return True
Now you have a function to tell you if the list is the desired type, you can filter out all the lists you need from the 'list of lists' using a list comprehension like the following:
a = [['she', 'is', 'a', 'student'],
['she', 'is', 'a', 'lawer'],
['she', 'is', 'a', 'great', 'student'],
['i', 'am', 'a', 'teacher'],
['she', 'is', 'a', 'very', 'very', 'exceptionally', 'good', 'student']]
b = ['she', 'is', 'student']
filtered = [x for x in a if is_sub_sequence(x,b)]
The list filtered
will include only the lists of the desired type.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.