Check for Partial Match in 1 List With Partial Match Another List - Possible With List Comprehension?
I'm fairly new to Python and to programming.
I've written code that does what I need:
import re

syns = ['professionals|experts|specialists|pros', 'repayed|payed back', 'ridiculous|absurd|preposterous', 'salient|prominent|significant' ]
new_syns = ['repayed|payed back', 'ridiculous|crazy|stupid', 'salient|prominent|significant', 'winter-time|winter|winter season', 'professionals|pros']

def pipe1(syn):
    # Find first word/phrase in list element up to and including the 1st pipe
    r = r'.*?\|'
    m = re.match(r, syn)
    m = m.group()
    return m

def find_non_match():
    # Compare 'new_syns' with 'syns' and create new list from non-matches in 'new_syns'
    p = '@#&'  # Place holder created
    joined = p.join(syns)
    joined = p + joined  # Adds place holder to beginning of string too
    non_match = []
    for syn in new_syns:
        m = pipe1(syn)
        m = p + m
        if m not in joined:
            non_match.append(syn)
    return non_match

print(find_non_match())
Output:
['winter-time|winter|winter season']
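The code takes the first word/phrase of each element of new_syns (up to and including the first pipe) and checks whether it matches the same leading part of any element in the syns list. The actual purpose is to find the non-matches and append them to a new list called non_match.
However, I'd like to know whether the same thing can be done in far fewer lines with a list comprehension. I've tried, but I'm not getting the result I want. This is what I've come up with so far: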
import re

syns = ['professionals|experts|specialists|pros', 'repayed|payed back', 'ridiculous|absurd|preposterous', 'salient|prominent|significant' ]
new_syns = ['repayed|payed back', 'ridiculous|crazy|stupid', 'salient|prominent|significant', 'winter-time|winter|winter season', 'professionals|pros']

def pipe1(syn):
    # Find first word/phrase in list element up to and including the 1st pipe
    r = r'.*?\|'
    m = re.match(r, syn)
    m = '@#&' + m.group()  # Add unusual symbol combo to create match for beginning of element
    return m

non_match = [i for i in new_syns if pipe1(i) not in '@#&'.join(syns)]
print(non_match)
Output:
['winter-time|winter|winter season', 'professionals|pros'] # I don't want 'professionals|pros' in the list
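The catch in the list comprehension is that when syns is joined with @#&, the resulting string has no @#& at its very beginning, whereas in the original code above, which doesn't use a list comprehension, I add @#& to the start of the joined string. As a result, 'professionals|pros' slips through. But I don't know how to do that inside the list comprehension.
So my question is: "Is this possible with a list comprehension?"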
I think you want something like this:
non_match = [i for i in new_syns if not any(any(w == s.split("|")[0]
                                                for w in i.split("|"))
                                            for s in syns)]
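This doesn't use a regular expression, but it gives the result
non_match == ['winter-time|winter|winter season']
The list includes every item of new_syns for which none (not any) of its '|'-separated words w equals the first word (split("|")[0]) of any synonym group s from syns.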
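Two other ways to write this, as a rough sketch rather than a definitive answer. The first assumes that only the first word/phrase of each group matters (which is what the original loop compares) and collects those into a set; the name first_words is made up for illustration. The second keeps the question's placeholder idea and simply prepends '@#&' to the joined string before the comprehension; the name joined is taken from the original loop.

```python
import re

syns = ['professionals|experts|specialists|pros', 'repayed|payed back',
        'ridiculous|absurd|preposterous', 'salient|prominent|significant']
new_syns = ['repayed|payed back', 'ridiculous|crazy|stupid',
            'salient|prominent|significant',
            'winter-time|winter|winter season', 'professionals|pros']

# Variant 1 (assumption: only the first word/phrase of each group is compared).
# first_words is a hypothetical helper name, not part of the original code.
first_words = {s.split("|")[0] for s in syns}
non_match = [s for s in new_syns if s.split("|")[0] not in first_words]

# Variant 2: keep the placeholder approach from the question, but build the
# joined string once, with the placeholder added to the front as well.
def pipe1(syn):
    # First word/phrase up to and including the first pipe, prefixed with '@#&'
    return '@#&' + re.match(r'.*?\|', syn).group()

joined = '@#&' + '@#&'.join(syns)
non_match_placeholder = [s for s in new_syns if pipe1(s) not in joined]

print(non_match)
print(non_match_placeholder)
```

Note that variant 1 differs slightly from the comprehension above: it looks only at the first word of each new group, not at every '|'-separated word. Variant 2, like the original pipe1, assumes every element contains at least one pipe.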