简体   繁体   中英

Specific sequence stored in a list

I am looking for a way to efficiently search for a lists that have particular sequence of values. The sequence is important! For example:

[x,y,z] and [x,z,y] contain the same values but their sequence is different

However:

  • [x,y,z], [y,z,x] and [z,x,y] are all the same to me.
  • [x,z,y], [z,y,x] and [x,z,y] are all the same too.

I think bout running a script that would look for parts of connections. For example, if I am looking for [x,y,z] I would look

mylist1 = ['a','b','c']
mylist2 = ['b','a','c']
def is_sequence_same(thelist,somelist):
    if (thelist[0] == somelist[0] and thelist[1] == somelist[1]):
       return True
    if (thelist[1] == somelist[1] and thelist[2] == somelist[2]):
        return True
    if (thelist[0] == somelist[1] and thelist[1] == somelist[0]):
        return False
    if  (thelist[0] == somelist[2] and thelist[1] == somelist[2]):
        return False
    else:
        return None
is_sequence_same(mylist1,mylist2)

Function returns: True - if the sequence is the same as I have asked, False - if the sequence is opposite

My current function is incomplete. However, I think that there should be more elegant ways of solving the problem

Use a deque:

from collections import deque

def is_sequence_same(l1, l2):
    if l1 == l2:
        return True
    if set(l1) != set(l2) or len(l1) != len(l2):
        return False
    d2 = deque(l2)
    for i in range(len(l2)):
        if l1 == list(d2):
            return True
        d2.rotate()
    return False

This may be slow with very long lists, but it essentially does a list comparison while rotating through the various possible 'start points' of the sequence the list represents. I'm assuming there may be more than one of each character, so you can't go straight to the first match of mylist[0]

mylist = ['a','b','c']
wontmatch = ['b','a','c']
willmatch = ['c','a','b']

def sequence_equal(list1,list2):
    for r in range(0,len(list1)):
        if list1 == list2:
            return True
        # Take the entry from the last index, and put it at the front, 
        # 'rotating' the list by 1
        list1.insert(0,list1.pop())
    return False

print sequence_equal(mylist,willmatch)
print sequence_equal(mylist,wontmatch)

(Edit: This manually recreates the deque from Magnus' answer.)

Since it's specific cycle you're looking for you can modify both lists to start with the same element and then compare them. Works with any list size. The assumption is that elements of the list are unique.

def is_sequence_same(list_a, list_b):
    if list_a and list_a[0] in list_b:                 # List_a not empty and first element exists in list_b
        first = list_b.index(list_a[0])                # Locate first element of list_a in list_b
    else:
        return False
    return list_a == (list_b[first:] + list_b[:first]) # Slice and compare

For example:

a = [1, 2, 3]
b = [3, 1, 2]
c = [2, 1, 3]

> is_sequence_same(a, b)
> True

> is_sequence_same(b, c)
> False
> 
> is_sequence_same(a, c)
> False

If by efficiently you mean sub-linearly (ie: you don't want to search through each element one-by-one), a good technique is to perform data normalization .

If your elements have a order, as in your example, this is particularly easy:

def normalize_sequence( seq ):
    return tuple(sorted( seq )) #tuple is needed because lists are unhashable

Using this technique, you can easily use a dictionary or a set to perform fast lookups:

existing_elements= set( map( normalize_sequence, ([1,4,2],[4,5,7]) ) )
print normalize_sequence( [1,2,4] ) in existing_elements

This is much faster than iterating and comparing over each element, particularly for larger lists.

This assumes that lists are guaranteed to be non empty and of the same length:

def is_sequence_same(first, second):
    try:
        i = second.index(first[0])
        if i == -1:
            return False
        for e in first[1:]:
            i += 1
            if i == len(second):
                i = 0
            if e != second[i]:
                return False
        return True
    except ValueError:
        return False

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM