python - Recursive list comprehension to apply lstrip to a list

Question

I have two lists as follows:

f = ['sum_','count_','per_']
d = ['fav_genre','sum_fav_event','count_fav_type','per_fav_movie']

I want to apply lstrip with each string in f to every item in list d, so that I end up with

d = ['fav_genre','fav_event','fav_type','fav_movie']

I want to use a list comprehension. I know I could also do this another way, for example with re.sub, applying the substitution to each item of d:

 # example
 d = [re.sub(r'.*fav', 'fav', x) for x in d]  # gives what I want
 # but if 'fav' (the matching pattern in this case) is not present in d, this solution won't work, e.g.
 # d = ['fav_genre','sum_any_event','count_some_type','per_all_movie']
 # re.sub can't be applied to this d as before, because there is no common pattern like 'fav' to match

So a list comprehension is what I chose to use.

So far I have tried:

d_one = [x.lstrip('count_') for x in d]   # only count_ is stripped
# output: d_one = ['fav_genre', 'sum_fav_event', 'fav_type', 'per_fav_movie']
# so I can apply lstrip with each string from f to the items of d
# why not apply lstrip for all items of f in one go? So I tried:
d_new = [x.lstrip(y) for y in f for x in d]
# ['fav_genre', 'fav_event', 'count_fav_type', 'per_fav_movie', 'fav_genre', 'sum_fav_event', 'fav_type', 'per_fav_movie', 'fav_genre', 'sum_fav_event', 'count_fav_type', 'fav_movie']

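So it gives me the result of every individual lstrip iteration.

Please suggest how to apply all the lstrip calls in one go (recursively) inside a list comprehension. Thanks in advance.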
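A note on the attempts above: str.lstrip treats its argument as a set of characters to remove, not as a prefix, so it can strip more than intended. A small sketch of the difference, using a made-up sample string that is not from the question:

```python
# str.lstrip removes any leading characters found in its argument,
# treating the argument as a set of characters rather than a prefix.
chars_stripped = 'sum_summary'.lstrip('sum_')

# Slicing after a startswith check removes the exact prefix only.
text = 'sum_summary'
prefix = 'sum_'
prefix_stripped = text[len(prefix):] if text.startswith(prefix) else text

print(chars_stripped)   # ary
print(prefix_stripped)  # summary
```

The question's data happens to work with lstrip only because 'f' (the first letter of 'fav') does not appear in any of the prefixes.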

Answer 1

Try this:

>>> f = ['sum_','count_','per_']
>>> d = ['fav_genre','sum_fav_event','count_fav_type','per_fav_movie']
>>> [s[len(([p for p in f if s.startswith(p)]+[""])[0]):] for s in d]
['fav_genre', 'fav_event', 'fav_type', 'fav_movie']

I believe this handles all cases as expected.
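The one-liner above can be hard to read, so here is a minimal sketch of the same idea split into a helper. The name strip_first_prefix is hypothetical, and next() with a default of "" stands in for the `+[""]` trick:

```python
def strip_first_prefix(text, prefixes):
    """Remove the first prefix in `prefixes` that `text` starts with, if any."""
    # next() returns the first matching prefix, or "" when nothing matches.
    match = next((p for p in prefixes if text.startswith(p)), "")
    return text[len(match):]

f = ['sum_', 'count_', 'per_']
d = ['fav_genre', 'sum_fav_event', 'count_fav_type', 'per_fav_movie']

result = [strip_first_prefix(s, f) for s in d]
print(result)  # ['fav_genre', 'fav_event', 'fav_type', 'fav_movie']
```

Like the original expression, this strips at most one prefix per item.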

Answer 2

You can use the following approach, which builds a suitable regular expression from f:

import re

f = ['sum_','count_','per_']
d = ['fav_genre','sum_fav_event','count_fav_type','per_fav_movie']

re_prefix = re.compile(r'^({})'.format('|'.join(f)))
print([re_prefix.sub('', entry) for entry in d])

Or as a one-liner (less efficient):

print([re.sub(r'^({})'.format('|'.join(f)), '', entry) for entry in d])

This gives the following output:

['fav_genre', 'fav_event', 'fav_type', 'fav_movie']
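If the prefixes might contain regex metacharacters such as '.' or '+', joining them directly would change what the pattern matches. A sketch of a safer variant, assuming a hypothetical helper name make_prefix_regex and using re.escape on each prefix:

```python
import re

def make_prefix_regex(prefixes):
    # Hypothetical helper: escape each prefix so that characters such as
    # '.' or '+' are matched literally instead of as regex syntax.
    alternatives = '|'.join(re.escape(p) for p in prefixes)
    return re.compile('^(?:{})'.format(alternatives))

f = ['sum_', 'count_', 'per_']
d = ['fav_genre', 'sum_fav_event', 'count_fav_type', 'per_fav_movie']

re_prefix = make_prefix_regex(f)
result = [re_prefix.sub('', entry) for entry in d]
print(result)  # ['fav_genre', 'fav_event', 'fav_type', 'fav_movie']
```

The non-capturing group `(?:...)` is used only because the captured text is never needed.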

Answer 3

Is this what you are looking for?

>>> f = ['sum_','count_','per_']
>>> d = ['fav_genre','sum_fav_event','count_fav_type','per_fav_movie']
>>> [x[len(y):] for x in d for y in f if x.startswith(y)]
['fav_event', 'fav_type', 'fav_movie']

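Edit: The more I poke at this, the less feasible a list comprehension looks. The problem seems to be including the non-matching case: a simple "else" causes every item in d to be emitted once for each item of f that is iterated.

For example:

>>> [x[len(y):] if x.startswith(y) else x for x in d for y in f]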
['fav_genre', 'fav_genre', 'fav_genre', 'fav_event', 'sum_fav_event', 'sum_fav_event', 'count_fav_type', 'fav_type', 'count_fav_type', 'per_fav_movie', 'per_fav_movie', 'fav_movie']

This creates a new list with too many items.

Adding another condition to the list comp produces a syntax error:

[x[len(y):] if x.startswith(y) else x if x[len(y):] not in f for x in d for y in f]
File "<stdin>", line 1
  [x[len(y):] if x.startswith(y) else x if x[len(y):] not in f for x in d for y in f]
                                                               ^
SyntaxError: invalid syntax

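Even if we could do this with a list comprehension, a function would be more readable:

def strip_prefixes(prefixes, mylist):
    result = []
    for element in mylist:
        for x in prefixes:
            if element.startswith(x):
                # drop the matching prefix from this element
                element = element[len(x):]
        # collect the stripped value; rebinding element does not change mylist
        result.append(element)
    return result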

Answer 4

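Don't bother with a list comprehension. List comprehensions are much like syntactic sugar for map/reduce. With a simple function, your solution will be much easier to read.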

import re

f = ['sum_','count_','per_']
d = ['fav_genre','sum_fav_event','count_fav_type','per_fav_movie']
def makeTrimmer(patterns):
    regex = re.compile("^(%s)" % "|".join(patterns))

    def trimmer(string):

        old_string = string             
        new_string = re.sub(regex, "", old_string)

        while len(old_string) != len(new_string):
            old_string = new_string
            new_string = re.sub(regex, "", old_string)

        return new_string

    return trimmer

trimmer = makeTrimmer(f)
vals = [trimmer(x) for x in d]
print(vals)

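As you can see, the trimmer function is quite readable. You could probably do this in a list comprehension, but there is no simple way to do it, because the if part of a list comprehension works much like a filter on the items to be output. The for part combines entries, and the first part builds each output entry. In your case, you only need to build the right output based on multiple prefixes. In other words, you are not trying to combine all prefixes and all values into multiple outputs, nor are you filtering any results.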
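To make the three parts of a comprehension concrete, here is a small sketch using the lists from the question. The helper strip_prefix is hypothetical and only there to show the logic living in the leading expression:

```python
f = ['sum_', 'count_', 'per_']
d = ['fav_genre', 'sum_fav_event', 'count_fav_type', 'per_fav_movie']

# Two `for` clauses combine every item of d with every prefix in f (4 * 3 pairs).
pairs = [(x, y) for x in d for y in f]

# An `if` clause filters combinations out, so 'fav_genre', which matches
# no prefix, disappears from the result entirely.
filtered = [x[len(y):] for x in d for y in f if x.startswith(y)]

# A single `for` with all the logic in the leading expression keeps
# exactly one output per input item.
def strip_prefix(x):  # hypothetical helper, not from the answer
    for y in f:
        if x.startswith(y):
            return x[len(y):]
    return x

built = [strip_prefix(x) for x in d]

print(len(pairs))  # 12
print(filtered)    # ['fav_event', 'fav_type', 'fav_movie']
print(built)       # ['fav_genre', 'fav_event', 'fav_type', 'fav_movie']
```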

My approach could probably be done with a lambda, but that would most likely be ugly.

A non-recursive approach without a lambda:

vals = [
    re.sub(re.compile("^(%s)" % "|".join(f)), "", x)
    for x in d
]
print(vals)

Here is the full recursive code using an anonymous lambda:

# -*- coding: utf-8 -*-
import re

f = ['sum_','count_','per_']
d = ['fav_genre','sum_fav_event','count_fav_type','per_fav_movie']

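vals = [
    # the outer lambda passes the loop function to itself so it can recurse
    (lambda a, *b: a(a, *b))(
        (lambda loop, newstring, oldstring:
            newstring
            if len(newstring) == len(oldstring) else
                loop(
                    loop,
                    # strip again from the latest result, not from the original x
                    re.sub(re.compile("^(%s)" % "|".join(f)), "", newstring),
                    newstring
                )
        ),
        re.sub(re.compile("^(%s)" % "|".join(f)), "", x),
        x
    )
    for x in d
]

print(vals)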

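This is almost the same as the approach above, except that we use recursion to keep stripping, so it reduces something like sum_count_per_fun_avg to fun_avg.

Also, please don't use the lambda approach, as it is inefficient.

However, here is a more efficient lambda version: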

vals = [
    (lambda regex:
        (lambda a, *b: a(a, *b))(
            (lambda loop, newstring, oldstring:
                newstring
                if len(newstring) == len(oldstring) else
                    loop(
                        loop,
                        # strip again from the latest result, not from the original x
                        re.sub(regex, "", newstring),
                        newstring
                    )
            ),
            re.sub(regex, "", x),
            x
        )
    )(re.compile("^(%s)" % "|".join(f)))
    for x in d
]

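Here the regular expression is compiled only once per item instead of at every recursive step. Recursion in Python is still a concern, though, so you should not rely on it too heavily.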
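For comparison, repeated stripping can also be written as a plain loop, which avoids recursion altogether. This is a sketch rather than part of the original answer, and the name strip_all_prefixes is hypothetical:

```python
def strip_all_prefixes(text, prefixes):
    """Keep removing leading prefixes until none of them matches."""
    stripped = True
    while stripped:
        stripped = False
        for p in prefixes:
            # Skip empty prefixes, which would otherwise match forever.
            if p and text.startswith(p):
                text = text[len(p):]
                stripped = True
                break
    return text

f = ['sum_', 'count_', 'per_']
d = ['fav_genre', 'sum_fav_event', 'count_fav_type', 'per_fav_movie']

vals = [strip_all_prefixes(x, f) for x in d]
print(vals)                                         # ['fav_genre', 'fav_event', 'fav_type', 'fav_movie']
print(strip_all_prefixes('sum_count_per_fun_avg', f))  # fun_avg
```

The loop does the same job as the `while` loop in trimmer, using startswith instead of a regular expression.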

Answer 5

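I'm heading to bed, but I have been working on this. I don't think doing it this way is the best idea, since it has a lot of loops and isn't very readable. It's also not quite correct.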

d_new = set([(x,y) for x in [x.split(y)[1] for y in f for x in d if x.startswith(y)] for y in [x for x in d if x.startswith('fav')]])

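It currently puts them into tuples; you could add another for x in ... inside the set to pull out the distinct tuple pairs. At this point I don't even think a list comprehension is useful or worthwhile, but if you really want to use one, this might give you a start.

Edit:

The code outputs the following:

[('fav_movie', 'fav_genre'), ('fav_event', 'fav_genre'), ('fav_type', 'fav_genre')]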

5 answers

Answer 1
3 Accepted 2016-02-08 06:41:54

Answer 2
3 2016-02-08 07:10:01

Answer 3
1 2016-02-08 06:07:20

Answer 4
1 2016-02-08 06:54:24

Answer 5
1 2016-02-08 07:06:24
