
Search a list of lists of strings inside a list of strings in Python

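I want to search for a list of lists of strings inside another list of strings in Python. If a match is found, I want to retrieve the matching strings from both lists. I also want to get partial matches. List 1 and list 2 are both large, so I am only providing a sample.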

Example:

list1 = [ 'The tablets are filled into cylindrically shaped bottles made of white coloured\npolyethylene. The volumes of the bottles depend on the tablet strength and amount of\ntablets, ranging from 20 to 175 ml. The screw type cap is made of white coloured\npolypropylene and is equipped with a tamper proof ring.', 'PVC/PVDC blister pack', 'Blisters are made in a cold-forming process from an aluminium base web. Each tablet is\nfilled into a separate blister and a lidding foil of aluminium is welded on. The blisters\nare opened by pressing the tablets through the lidding foil.', '\n']



list2 = [['Blister', 'Foil', 'Aluminium'], ['Blister', 'Base Web', 'PVC/PVDC'], ['Bottle', 'Cylindrically shaped Bottles', 'Polyethylene'], ['Bottle', 'Screw Type Cap', 'Polypropylene'], ['Bottle', 'Safety Ring', ''], ['Blister', 'Base Web', 'PVC'], ['Blister', 'Base Web', 'PVD/PVDC'], ['Bottle', 'Square Shaped Bottle', 'Polyethylene']]

If the matches do not occur within the same string of list 1, each match of list 2 in list 1 should be output as a separate stage.

Expected sample output:

Stage 1: 'The tablets are filled into cylindrically shaped bottles made of white coloured\npolyethylene. The volumes of the bottles depend on the tablet strength and amount of\ntablets, ranging from 20 to 175 ml. The screw type cap is made of white coloured\npolypropylene and is equipped with a tamper proof ring.', Values: ['Bottle', 'Cylindrically shaped Bottles', 'Polyethylene']

Stage 2: 'Blisters are made in a cold-forming process from an aluminium base web. Each tablet is\nfilled into a separate blister and a lidding foil of aluminium is welded on. The blisters\nare opened by pressing the tablets through the lidding foil.', Values: ['Blister', 'Foil', 'Aluminium']

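Match conditions:

1.) I want to match while ignoring the \n characters in list 1.
2.) I want to match list 2 against list 1 regardless of plural/singular form, meaning that 'Bottle' should match 'bottles' in list 1.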
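The two conditions above could be handled by normalising both sides before comparing. The following is only a minimal sketch: the helper names `normalize` and `contains_term` are hypothetical, and the plural handling is a deliberately naive assumption (it drops a single trailing "s" and nothing more).

```python
import re

def normalize(text):
    """Lower-case `text`, split it into words, and drop a plural 's' from each word."""
    # Splitting on anything that is not a letter, digit or '/' also discards newlines.
    words = re.findall(r"[a-z0-9/]+", text.lower())
    # Naive assumption: a word longer than 3 letters ending in a single 's' is a plural.
    singular = [
        w[:-1] if len(w) > 3 and w.endswith("s") and not w.endswith("ss") else w
        for w in words
    ]
    return " ".join(singular)

def contains_term(term, text):
    """True if the normalised `term` is a substring of the normalised `text`."""
    return normalize(term) in normalize(text)

sentence = "cylindrically shaped bottles made of white coloured\npolyethylene"
print(contains_term("Bottle", sentence))                 # True
print(contains_term("coloured polyethylene", sentence))  # True: the newline is ignored
print(contains_term("Square Shaped Bottle", sentence))   # False
```

Because the comparison is a substring test, partial matches such as 'PVC' inside 'PVC/PVDC' are accepted as well. An empty term always counts as contained.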

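I have tried this code, which I found on Stack Overflow, but it did not really work. With it I cannot get multiple matches, nor can I retrieve the whole string from list 1 that contains the list 2 values. It only lists some of the values from list 2: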

from itertools import product

def generate_edges(iterable, control):
    edges = []
    control_set = set(control)
    for e in iterable:
        e_set = set(e)
        common = e_set & control_set
        to_pair = e_set - common
        edges.extend(product(to_pair, common))
    return edges

generate_edges(list2, list1)
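One likely reason this approach falls short is that set intersection compares whole elements, so a term only counts as common when it equals an entire string of list 1. The small illustration below uses made-up variable names (`terms`, `paragraphs`) and shortened strings taken from the sample data to contrast that with a case-insensitive substring test.

```python
terms = {"Blister", "Foil", "Aluminium"}
paragraphs = {"a lidding foil of aluminium is welded on", "PVC/PVDC blister pack"}

# Set intersection compares whole elements, so a term only "matches"
# when it is exactly equal to an entire paragraph.
exact_matches = terms & paragraphs

# The `in` operator on strings is a substring test; lower-casing both
# sides makes it case-insensitive.
substring_matches = sorted(
    t for t in terms if any(t.lower() in p.lower() for p in paragraphs)
)

print(exact_matches)      # set()
print(substring_matches)  # ['Aluminium', 'Blister', 'Foil']
```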

Latest changes:

# final_ref and paragraphs presumably play the roles of list 2 and list 1 above.
counter = 1
result = []  # must exist before result.append() is called
colours = ["White","Yellow","Blue","Red","Green","Black","Brown","Silver","Purple","Navy blue","Gray","Orange","Maroon","pink","colourless","blue"]

for words in final_ref:
    for sen in paragraphs:
        # Every term of the group must occur in the sentence (case-insensitive).
        all_exist = True
        for w in words:
            if w.lower() not in sen.lower():
                all_exist = False
                break
        if all_exist:
            # Look for a colour only for bottles; stays None when no colour word is found.
            colour = None
            if words[0] == 'Bottle':
                for wd in colours:
                    if wd.lower() in sen.lower().split():
                        colour = wd
                        break
            fr = "Stage " + str(counter) + ": " + "Package Description" + ": " + sen + " Values" + ": " + str(words) + " Colour" + ": " + str(colour)
            result.append(fr)
            counter += 1

# Strip newlines and tabs once, after all stages have been collected.
result = [i.replace('\n', '').replace('\t', '') for i in result]
print(result)

Normally you need to show some effort to get an answer, but this should help you:

counter = 1
for words in list2:
    for sen in list1:
        all_exist = True
        for w in words:
            if w.lower() not in sen.lower():
                all_exist = False
                break
        if all_exist:
            print("Stage " + str(counter) + ": " + sen + " Values" + str(words) + "\n")
            counter += 1

Output:

Stage 1: Blisters are made in a thermo-forming process from a PVC/PVDC base web. Each tablet
is filled into a separate blister and a lidding foil of aluminium is welded on. The blisters
are opened by pressing the tablets through the lidding foil. PVDC foil is in contact with
the tablets. Values['Blister', 'Foil', 'Aluminium']

Stage 2: Blisters are made in a cold-forming process from an aluminium base web. Each tablet is
filled into a separate blister and a lidding foil of aluminium is welded on. The blisters
are opened by pressing the tablets through the lidding foil. Values['Blister', 'Foil', 'Aluminium']

Stage 3: Blisters are made in a thermo-forming process from a PVC/PVDC base web. Each tablet
is filled into a separate blister and a lidding foil of aluminium is welded on. The blisters
are opened by pressing the tablets through the lidding foil. PVDC foil is in contact with
the tablets. Values['Blister', 'Base Web', 'PVC/PVDC']

Stage 4: The tablets are filled into cylindrically shaped bottles made of white coloured
polyethylene. The volumes of the bottles depend on the tablet strength and amount of
tablets, ranging from 20 to 175 ml. The screw type cap is made of white coloured
polypropylene and is equipped with a tamper proof ring. Values['Bottle', 'Cylindrically shaped Bottles', 'Polyethylene']

Stage 5: The tablets are filled into cylindrically shaped bottles made of white coloured
polyethylene. The volumes of the bottles depend on the tablet strength and amount of
tablets, ranging from 20 to 175 ml. The screw type cap is made of white coloured
polypropylene and is equipped with a tamper proof ring. Values['Bottle', 'Screw Type Cap', 'Polypropylene']

Stage 6: Blisters are made in a thermo-forming process from a PVC/PVDC base web. Each tablet
is filled into a separate blister and a lidding foil of aluminium is welded on. The blisters
are opened by pressing the tablets through the lidding foil. PVDC foil is in contact with
the tablets. Values['Blister', 'Base Web', 'PVC']
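The loop above prints one stage per (group, string) pair and does not apply the plural rule from the question. As a hedged variation, the sketch below groups all matching value lists under the string they were found in, which is one possible reading of "a separate stage per string". The names `normalize` and `find_stages` are hypothetical, the plural handling is naive, and the sample strings are shortened excerpts of the question's data.

```python
import re

def normalize(text):
    """Lower-case, split into words (newlines vanish), drop a naive plural 's'."""
    words = re.findall(r"[a-z0-9/]+", text.lower())
    return " ".join(
        w[:-1] if len(w) > 3 and w.endswith("s") and not w.endswith("ss") else w
        for w in words
    )

def find_stages(paragraphs, term_groups):
    """Return (paragraph, matching groups) pairs for paragraphs with at least one match."""
    stages = []
    for paragraph in paragraphs:
        norm = normalize(paragraph)
        hits = []
        for group in term_groups:
            terms = [t for t in group if t]  # skip empty strings such as ''
            if terms and all(normalize(t) in norm for t in terms):
                hits.append(group)
        if hits:
            stages.append((paragraph, hits))
    return stages

list1 = [
    "The tablets are filled into cylindrically shaped bottles made of white coloured\n"
    "polyethylene. The screw type cap is made of white coloured\npolypropylene.",
    "PVC/PVDC blister pack",
    "Each tablet is\nfilled into a separate blister and a lidding foil of aluminium is welded on.",
]
list2 = [
    ["Blister", "Foil", "Aluminium"],
    ["Blister", "Base Web", "PVC/PVDC"],
    ["Bottle", "Cylindrically shaped Bottles", "Polyethylene"],
    ["Bottle", "Screw Type Cap", "Polypropylene"],
    ["Bottle", "Safety Ring", ""],
    ["Bottle", "Square Shaped Bottle", "Polyethylene"],
]

stages = find_stages(list1, list2)
for number, (paragraph, hits) in enumerate(stages, start=1):
    print(f"Stage {number}: {paragraph!r}, Values: {hits}")
```

With these shortened strings, the bottle paragraph collects both bottle groups in one stage and the blister paragraph forms a second stage; 'PVC/PVDC blister pack' matches no complete group and is skipped.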
