How to extract an equal sequence of elements in a list in Python?

I have a more specific question, but I haven't found the answer yet. I'm really desperate and would be really happy if someone knew the answer. Thank you in advance for reading ...

I have a list in Python that looks something like this:

["h", "e", "l", "l", "o", "h", "e", "l", "l", "o", "h", "e", "l", "l", "o"]

Now I want to shorten the list so that a block of elements that repeats many times is reduced to a single occurrence. The list above should become:

["h", "e", "l", "l", "o"]

Does anyone know how to do this? The catch is that the contents of the list vary; it might also look like this:

["b", "y", "e", "b", "y", "e", "b", "y", "e"]

Thank you very much and I would really appreciate your answer!

This can be handled quite neatly with a one-line function (see below).

import re

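def shorten(l):
    # Lazy +? captures the shortest repeating block; a greedy + would
    # return "hellohello" for four repetitions of "hello".
    return list(re.sub(r'^([a-z]+?)\1+$', r'\1', ''.join(l)))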


l1 = ["h", "e", "l", "l", "o", "h", "e", "l", "l", "o", "h", "e", "l", "l", "o"]
l2 = ["b", "y", "e", "b", "y", "e", "b", "y", "e"]

print(shorten(l1))
print(shorten(l2))

Output

['h', 'e', 'l', 'l', 'o']
['b', 'y', 'e']

Explanation

The above solution joins the list ( l ) passed at runtime into a single str , with the characters in index order.

It then applies an anchored regex made of a capture group of lowercase letters ( [a-z] ) followed by one or more backreferences to that group ( \1+ ). This tests whether the whole str consists of one substring repeated from start to finish.

If the pattern matches, re.sub replaces the whole str with match group 1 ( \1 ), and a list of that repeating block's characters is returned.

If there is no match, i.e. l is not made up entirely of a single repeating pattern, a list identical to l is returned. Because the group only accepts [a-z] , lists containing uppercase letters, digits, or other characters are also returned unchanged.
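To see the backreference at work in isolation, here is a minimal sketch. The name repeating_unit is a hypothetical helper, not part of the answer above. It assumes every element is a single-character string, and it uses a lazy group ( .+? ) with re.fullmatch so that the shortest repeating block is captured and any character is accepted.

```python
import re

def repeating_unit(chars):
    """Return the shortest repeating block as a list, or None if there is none."""
    s = "".join(chars)
    # (.+?) lazily captures a candidate block; \1+ demands one or more
    # further copies of it; fullmatch anchors the pattern at both ends.
    m = re.fullmatch(r"(.+?)\1+", s, flags=re.DOTALL)
    return list(m.group(1)) if m else None

print(repeating_unit(list("hellohello")))
print(repeating_unit(list("abc")))
```

Returning None for the no-match case, rather than the original list, makes it easy to tell "no repetition found" apart from "the block is the whole list".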

This is a possible solution:

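def shorten(lst):
    s = ''.join(lst)
    # Try every candidate block length, shortest first.
    for i in range(1, len(s) // 2 + 1):
        # The block length must divide the total length evenly.
        if len(s) % i == 0 and s[:i] * (len(s) // i) == s:
            return list(s[:i])
    # No repeating block found: return the characters unchanged.
    return list(s)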

Here are some examples:

>>> shorten(['h','e','l','l','o','h','e','l','l','o','h','e','l','l','o'])
['h', 'e', 'l', 'l', 'o']
>>> shorten(['b','y','e','b','y','e'])
['b', 'y', 'e']
>>> shorten(['a','b','c'])
['a', 'b', 'c']
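Both answers join the list into a string first, so they assume single-character elements: a list such as ['ab', 'c', 'ab', 'c'] would come back as ['a', 'b', 'c']. The following sketch applies the same divisor check to the list itself, so elements can be any values that support == . The name shorten_any is hypothetical and not part of either answer.

```python
def shorten_any(items):
    """Return the shortest block whose repetition rebuilds the whole list."""
    items = list(items)
    n = len(items)
    # Try every candidate block length, shortest first.
    for size in range(1, n // 2 + 1):
        # A block can only tile the list if its length divides n.
        if n % size == 0 and items[:size] * (n // size) == items:
            return items[:size]
    # No repeating block found: return a copy of the input.
    return items

print(shorten_any(["ab", "c", "ab", "c"]))
print(shorten_any([1, 2, 1, 2, 1, 2]))
```

Comparing list slices keeps multi-character strings and non-string elements intact, at the cost of the brevity of the string-based versions.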
