How to compare 2 lists where a string matches an element in the other list

Question

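Hi, I'm in the process of learning, so you may have to bear with me. I have 2 lists I'd like to compare: any matches should be kept and appended to one output list, and any non-matches should be appended to another output list. Here's my code: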

def EntryToFieldMatch(Entry, Fields):
    valid = []
    invalid = []
    for c in Entry:
        count = 0
        for s in Fields:
            count +=1
            if s in c:
                valid.append(c)
            elif count == len(Entry):
                invalid.append(s)
                Fields.remove(s)



    print valid
    print "-"*50
    print invalid


def main():
    vEntry = ['27/04/2014', 'Hours = 28', 'Site = Abroad', '03/05/2015', 'Date = 28-04-2015', 'Travel = 2']
    Fields = ['Week_Stop', 'Date', 'Site', 'Hours', 'Travel', 'Week_Start', 'Letters']
    EntryToFieldMatch(vEntry, Fields)

if __name__ == "__main__":
    main()

The output looks OK, except that it doesn't return all the fields across the 2 output lists. This is the output I receive:

['Hours = 28', 'Site = Abroad', 'Date = 28-04-2015', 'Travel = 2']
--------------------------------------------------
['Week_Start', 'Letters']

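I just don't know why the second list doesn't include 'Week_Stop'. I've run the debugger and stepped through the code a few times to no avail. I've read about sets, but I don't see any way to return the fields that match and discard the fields that don't. I'm also open to suggestions if anybody knows a way to simplify this whole process. I'm not asking for free code, just a nod in the right direction. Python 2.7, thanks.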

Answer 1

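You only have two conditions: either the field is in the string, or count equals the length of Entry (6). Neither catches the first element, 'Week_Stop'. Because you remove items from Fields while iterating over it, its length goes 7, 6, 5: the sixth position catches 'Week_Start' on the first pass and 'Letters' on the second, but once Fields is down to five elements count can never reach 6, so you never get to 'Week_Stop'.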
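To see this for yourself, here is a rough sketch that replays the same loop on a copy of Fields and records how long the list is at the start of each pass. The function name trace_original_loop and the lengths list are made up for illustration; the loop logic is meant to mirror the code in the question.

```python
def trace_original_loop(entry, fields):
    fields = list(fields)   # work on a copy so the caller's list is untouched
    invalid = []
    lengths = []            # length of fields at the start of each outer pass
    for c in entry:
        lengths.append(len(fields))
        count = 0
        for s in fields:
            count += 1
            # same trigger as the question's elif: no match at position len(entry)
            if s not in c and count == len(entry):
                invalid.append(s)
                fields.remove(s)   # shrinks the list that is being iterated
    return invalid, fields, lengths


vEntry = ['27/04/2014', 'Hours = 28', 'Site = Abroad', '03/05/2015',
          'Date = 28-04-2015', 'Travel = 2']
Fields = ['Week_Stop', 'Date', 'Site', 'Hours', 'Travel', 'Week_Start', 'Letters']

invalid, remaining, lengths = trace_original_loop(vEntry, Fields)
print(invalid)    # ['Week_Start', 'Letters']
print(remaining)  # 'Week_Stop' is still sitting in here
print(lengths)    # [7, 6, 5, 5, 5, 5]
```

After the second pass the list has five items, so the sixth position that triggers the append no longer exists.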

A more efficient way would be to use a set, or a collections.OrderedDict if you want to keep the order:

from collections import OrderedDict
def EntryToFieldMatch(Entry, Fields):
    valid = []
    # create orderedDict from the words in Fields
    # dict lookups are O(1)
    st = OrderedDict.fromkeys(Fields)
    # iterate over Entry
    for word in Entry:
        # split the words once on whitespace
        spl = word.split(None, 1)
        # if the first word appears in our dict keys
        if spl[0] in st:
            # add to valid list
            valid.append(word)
            # remove the key
            del st[spl[0]]
    print valid
    print "-"*50
    # only invalid words will be left
    print st.keys()

Output:

['Hours = 28', 'Site = Abroad', 'Date = 28-04-2015', 'Travel = 2']
--------------------------------------------------
['Week_Stop', 'Week_Start', 'Letters']

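For large lists this will be significantly faster than your quadratic approach. Every pass over the Fields list for an entry is an O(n) operation, whereas a dict lookup is O(1), so the code goes from quadratic to linear.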
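If you want to check the difference on your own machine, something along these lines should do. It is only a sketch: count_matches and the generated field names are invented for the test, and the actual timings will vary, so none are quoted here.

```python
import timeit


def count_matches(entries, container):
    # one membership test per entry; its cost depends on the container type
    return sum(1 for e in entries if e in container)


# made-up data: 1000 field names, 1000 entries of which half are present
fields_list = ["field_%d" % i for i in range(1000)]
fields_set = set(fields_list)
entries = ["field_%d" % i for i in range(0, 2000, 2)]

list_time = timeit.timeit(lambda: count_matches(entries, fields_list), number=3)
set_time = timeit.timeit(lambda: count_matches(entries, fields_set), number=3)

print("list: %.4fs  set: %.4fs" % (list_time, set_time))
```

Both calls find the same matches; only the time spent per lookup differs, and the gap should widen as the lists grow.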

Using a set, the approach is similar:

def EntryToFieldMatch(Entry, Fields):
    valid = []
    st = set(Fields)
    for word in Entry:
        spl = word.split(None,1)
        if spl[0] in st:
            valid.append(word)
            st.remove(spl[0])
    print valid
    print "-"*50
    print st

The difference when using a set is that order is not maintained.
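If you like the set version but still want the leftover fields in their original order, one option is to filter the original list against the set at the end. This is a sketch, not the only way to do it; the name entry_to_field_match is made up, and it returns the two lists instead of printing them so it is easier to reuse.

```python
def entry_to_field_match(entries, fields):
    remaining = set(fields)          # O(1) membership tests
    valid = []
    for entry in entries:
        parts = entry.split(None, 1)  # first word, then the rest
        if parts and parts[0] in remaining:
            valid.append(entry)
            remaining.remove(parts[0])
    # walk the original list so the unmatched fields keep their order
    unmatched = [f for f in fields if f in remaining]
    return valid, unmatched


vEntry = ['27/04/2014', 'Hours = 28', 'Site = Abroad', '03/05/2015',
          'Date = 28-04-2015', 'Travel = 2']
Fields = ['Week_Stop', 'Date', 'Site', 'Hours', 'Travel', 'Week_Start', 'Letters']

valid, unmatched = entry_to_field_match(vEntry, Fields)
print(valid)      # ['Hours = 28', 'Site = Abroad', 'Date = 28-04-2015', 'Travel = 2']
print(unmatched)  # ['Week_Stop', 'Week_Start', 'Letters']
```

The extra pass over fields is linear, so the overall cost stays linear. The `parts and` check is there so an empty or whitespace-only entry does not raise an IndexError.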

Answer 2

Using list comprehensions:

def EntryToFieldMatch(Entries, Fields):

    # using list comprehension 
    # (typically they go on one line, but they can be multiline 
    #  so they look more like their for loop equivalents)
    valid = [entry for entry in Entries
                 if any([field in entry 
                         for field in Fields])]

    invalidEntries = [entry for entry in Entries 
                          if not any([field in entry 
                                      for field in Fields])]

    missedFields = [field for field in Fields
                          if not any([field in entry 
                                      for entry in Entries])]

    print 'valid entries:', valid
    print '-' * 80
    print 'invalid entries:', invalidEntries
    print '-' * 80
    print 'missed fields:', missedFields

vEntry = ['27/04/2014', 'Hours = 28', 'Site = Abroad', '03/05/2015', 'Date = 28-04-2015', 'Travel = 2']
Fields = ['Week_Stop', 'Date', 'Site', 'Hours', 'Travel', 'Week_Start', 'Letters']
EntryToFieldMatch(vEntry, Fields)

valid entries: ['Hours = 28', 'Site = Abroad', 'Date = 28-04-2015', 'Travel = 2']
--------------------------------------------------------------------------------
invalid entries: ['27/04/2014', '03/05/2015']
--------------------------------------------------------------------------------
missed fields: ['Week_Stop', 'Week_Start', 'Letters']
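One thing to be aware of: `field in entry` is a substring test, so it can also match a field name that appears inside a longer key or inside the value. Whether that matters depends on your data. The sketch below compares it with matching on the first word only; the helper names and the sample entries 'WebSite = example.org' and 'Note = Site visit' are invented for illustration.

```python
def substring_matches(entries, fields):
    # same test as the comprehension above: field anywhere in the entry
    return [e for e in entries if any(f in e for f in fields)]


def first_token(entry):
    parts = entry.split(None, 1)
    return parts[0] if parts else None


def first_token_matches(entries, fields):
    # stricter test: the first whitespace-separated word must be a field
    field_set = set(fields)
    return [e for e in entries if first_token(e) in field_set]


entries = ['Site = Abroad', 'WebSite = example.org', 'Note = Site visit']
fields = ['Site', 'Hours']

print(substring_matches(entries, fields))    # all three entries
print(first_token_matches(entries, fields))  # ['Site = Abroad']
```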

Answer 1
score 0, accepted, 2015-05-04 00:14:09

Answer 2
score -1, 2015-05-04 00:45:38