Recursively reduce list of tuples

So I have a list of tuples like this:

[
    ('Worksheet',),
    ('1a', 'Calculated'),
    ('None', 'None', 'None', 'None', 'None'),
    ('1b', 'General'),
    ('1b', 'General', 'Basic'),
    ('1b', 'General', 'Basic', 'Data'),
    ('1b', 'General', 'Basic', 'Data', 'Line 1'),
    ('1b', 'General', 'Basic', 'Data', 'Line 2'),
    ('None', 'None', 'None', 'None', 'None'),
    ('1c', 'General'),
    ('1c', 'General', 'Basic'),
    ('1c', 'General', 'Basic', 'Data'),
    ('None', 'None', 'None', 'None', 'None'),
    ('2', 'Active'),
    ('2', 'Active', 'Passive'),
    ('None', 'None', 'None', 'None', 'None'),
    ...
]

Each tuple has a length of 1 to 5. I need to recursively reduce the list so that the final result is:

[
    ('Worksheet',),
    ('1a', 'Calculated'),
    ('None', 'None', 'None', 'None', 'None'),
    ('1b', 'General', 'Basic', 'Data', 'Line 1'),
    ('1b', 'General', 'Basic', 'Data', 'Line 2'),
    ('None', 'None', 'None', 'None', 'None'),
    ('1c', 'General', 'Basic', 'Data'),
    ('None', 'None', 'None', 'None', 'None'),
    ('2', 'Active', 'Passive'),
    ('None', 'None', 'None', 'None', 'None'),
    ...
]

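Basically, if the next row matches everything in the previous row plus one more item, the previous row is removed, reducing each run down to the longest tuples with the same hierarchy.

So, as you can see in my example, there were 3 rows where 1c was the first item in the tuple, and they were reduced to the longest one.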

Group the tuples on the first element using itertools.groupby() (with operator.itemgetter() to make creating the key easy).

Then filter each group separately:

from itertools import groupby, chain
from operator import itemgetter

def filtered_group(group):
    group = list(group)
    maxlen = max(len(l) for l in group)
    return [l for l in group if len(l) == maxlen]

filtered = [filtered_group(g) for k, g in groupby(inputlist, key=itemgetter(0))]
output = list(chain.from_iterable(filtered))

Demo:

>>> from itertools import groupby, chain
>>> from operator import itemgetter
>>> from pprint import pprint
>>> def filtered_group(group):
...     group = list(group)
...     maxlen = max(len(l) for l in group)
...     return [l for l in group if len(l) == maxlen]
... 
>>> filtered = [filtered_group(g) for k, g in groupby(inputlist, key=itemgetter(0))]
>>> pprint(list(chain.from_iterable(filtered)))
[('Worksheet',),
 ('1a', 'Calculated'),
 ('None', 'None', 'None', 'None', 'None'),
 ('1b', 'General', 'Basic', 'Data', 'Line 1'),
 ('1b', 'General', 'Basic', 'Data', 'Line 2'),
 ('None', 'None', 'None', 'None', 'None'),
 ('1c', 'General', 'Basic', 'Data'),
 ('None', 'None', 'None', 'None', 'None'),
 ('2', 'Active', 'Passive'),
 ('None', 'None', 'None', 'None', 'None')]
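
The grouping step above relies on groupby() merging only consecutive items with equal keys, which is why each row of 'None' separators ends up in its own group. A minimal sketch of that behaviour, using a shortened, hypothetical list named rows (not the original input):

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical, shortened stand-in for the input list.
rows = [
    ('1b', 'General'),
    ('1b', 'General', 'Basic'),
    ('None', 'None'),
    ('1c', 'General'),
    ('None', 'None'),
]

# groupby() starts a new group whenever the key changes, so the two
# 'None' rows are NOT merged: they are separated by the '1c' row.
group_sizes = [
    (key, len(list(group)))
    for key, group in groupby(rows, key=itemgetter(0))
]
print(group_sizes)
# [('1b', 2), ('None', 1), ('1c', 1), ('None', 1)]
```

If the input were not already ordered so that rows sharing a first element are adjacent, this approach would need a different grouping strategy.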

from pprint import pprint

l=[
    ('Worksheet',),
    ('1a', 'Calculated'),
    ('None', 'None', 'None', 'None', 'None'),
    ('1b', 'General'),
    ('1b', 'General', 'Basic'),
    ('1b', 'General', 'Basic', 'Data'),
    ('1b', 'General', 'Basic', 'Data', 'Line 1'),
    ('1b', 'General', 'Basic', 'Data', 'Line 2'),
    ('None', 'None', 'None', 'None', 'None'),
    ('1c', 'General'),
    ('1c', 'General', 'Basic'),
    ('1c', 'General', 'Basic', 'Data'),
    ('None', 'None', 'None', 'None', 'None'),
    ('2', 'Active'),
    ('2', 'Active', 'Passive'),
    ('None', 'None', 'None', 'None', 'None')
    #...
]

i = 0
while i < len(l) - 1:
    l0 = l[i]
    l1 = l[i + 1]
    # Drop l0 when the next row is l0 plus exactly one extra trailing item.
    if len(l1) == len(l0) + 1 and l1[:-1] == l0:
        del l[i]
    else:
        i += 1

pprint(l)

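Logic: compare each row (except the last) with the next one. If the next row is the same with one additional item, delete the first. Otherwise, advance to the next pair of rows.

This is not a recursive solution, but it could be reworked into one. It is a filter operation in which the condition needs the next item.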
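
The "filter that looks at the next item" idea can also be sketched without mutating the list, by pairing each row with its successor. The function name drop_prefix_rows and the sample list are hypothetical, introduced only for illustration:

```python
def drop_prefix_rows(rows):
    """Return rows, omitting each row that the next row extends by one item."""
    kept = [
        current
        for current, following in zip(rows, rows[1:])
        # Keep `current` unless `following` is `current` plus one trailing item.
        if not (len(following) == len(current) + 1
                and following[:-1] == current)
    ]
    # The last row has no successor, so it is always kept
    # (rows[-1:] is empty when rows is empty).
    return kept + rows[-1:]


# Hypothetical sample taken from the shape of the question's data.
sample = [
    ('1c', 'General'),
    ('1c', 'General', 'Basic'),
    ('1c', 'General', 'Basic', 'Data'),
    ('None', 'None', 'None', 'None', 'None'),
]
print(drop_prefix_rows(sample))
# [('1c', 'General', 'Basic', 'Data'), ('None', 'None', 'None', 'None', 'None')]
```

Because every adjacent pair is examined exactly once, this should give the same rows as the loop above while leaving the input list unchanged.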

Just for fun, here is a recursive Haskell version (this kind of recursion is efficient in Haskell and Scheme, but not in Python):

prefixfilt :: Eq a => [[a]] -> [[a]]
prefixfilt [] = []
prefixfilt [x] = [x]
prefixfilt (x0:x1:xs) =
    if x0 == init x1 then rest else (x0:rest)
    where rest = prefixfilt (x1:xs)
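
For comparison, a rough Python transliteration of the same recursion might look like the sketch below. The name prefix_filter and the sample list are hypothetical. Since CPython limits recursion depth and each call slices the list, this is meant as an illustration for short inputs and not as a practical replacement for the loop:

```python
def prefix_filter(rows):
    """Recursively drop each row that equals the next row minus its last item."""
    # Base cases: an empty list or a single row is returned as-is.
    if len(rows) < 2:
        return list(rows)
    first, second = rows[0], rows[1]
    rest = prefix_filter(rows[1:])
    # second[:-1] plays the role of Haskell's `init`.
    return rest if first == second[:-1] else [first] + rest


# Hypothetical sample taken from the shape of the question's data.
sample = [
    ('2', 'Active'),
    ('2', 'Active', 'Passive'),
    ('None', 'None', 'None', 'None', 'None'),
]
print(prefix_filter(sample))
# [('2', 'Active', 'Passive'), ('None', 'None', 'None', 'None', 'None')]
```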

def is_subtuple(tup1, tup2):
    '''Return True if all the elements of tup1 are consecutively in tup2.'''
    if len(tup2) < len(tup1): return False
    try:
        offset = tup2.index(tup1[0])
    except ValueError:
        return False
    # This could be wrong if tup1[0] is in tup2, but doesn't start the subtuple.
    # You could solve this by recurring on the rest of tup2 if this is false, but
    # it doesn't apply to your input data.
    return tup1 == tup2[offset:offset+len(tup1)] 

Then, simply filter the input list (named l here):

[t for i, t in enumerate(l) if not any(is_subtuple(t, t2) for t2 in l[i+1:])]

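Now, this list comprehension assumes that the input list is consistently ordered the way you showed, with sub-tuples appearing earlier than the tuples that contain them. It is also somewhat expensive (O(n**2), I think), but it will get the job done.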
