
Searching exact match of a list of strings inside a list of lists in Python

I have a list of lists:

result = [['GELATIN', '76.0 mg', '40 %', 'Gelatin to 100.000 g Table 7 Capsule Quantity per unit flavouring dose Quantity per unit dose Components Nominal mass of capsule 76.0 mg In the cap (40 %) 30.4 mg flavouring agent corresponds to 1 '], 
          ['GELATIN', '45.6 mg', '14.5 %', 'Gelatin including water of a certain percentage'], 
          ['INK', '76.0 mg', '40 %', 'ink is used as diluent far as this is necessary for the markets. Table 4 Atenolol granules Components mg/capsule Granules Active ingredients Atenolol 50.00']]

and a list of strings:

agent = ['Flavouring Agent', 'Anti-Tacking Agent', 'Preservative', 'Colouring Agent', 'Ph Adjusting Agent', 'Plasticizer', 'Diluent']

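For each sublist of result, I want to search for an element of the agent list that occurs anywhere in the sublist. If such an element exists, it should be added as a new element at the beginning of the sublist.

Expected output: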

new_result = [['Flavouring Agent', 'GELATIN', '76.0 mg', '40 %', 'Gelatin to 100.000 g Table 7 Capsule Quantity per unit flavouring dose Quantity per unit dose Components Nominal mass of capsule 76.0 mg In the cap (40 %) 30.4 mg flavouring agent corresponds to 1 '], 
              ['GELATIN', '45.6 mg', '14.5 %', 'Gelatin including water of a certain percentage'], 
              ['Diluent', 'INK', '76.0 mg', '40 %', 'ink is used as diluent far as this is necessary for the markets. Table 4 Atenolol granules Components mg/capsule Granules Active ingredients Atenolol 50.00']]

This is because 'Flavouring Agent' is present in the last element of the first sublist, and 'Diluent' is present in the last element of the last sublist.

My effort so far:

newl=[]                
for jj in agent:        
    for e in result:
        for ll in e:

            if jj in ll:
                #print(jj,ll)
                newl.append([jj,ll])
                break

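I think your problem is some confusion about the nesting levels as well as the order of the loops. Assuming you want to keep the order of the original list (and not omit elements), your outer loop should be over the list of lists. Then, for each sublist, you want to check whether any word from agent is present in it. The comparison also has to ignore case, because the text contains 'flavouring agent' and 'diluent' in lower case while the agent list is capitalised. We can use a "flag" variable to add only one agent per sublist: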

res = []
for sub in result:
    new_sub = sub
    agent_found = False
    for ag in agent:
        if agent_found:
            break  # stop after the first agent that matched
        for item in sub:
            # case-insensitive substring test
            if ag.lower() in item.lower():
                new_sub = [ag] + new_sub
                agent_found = True
                break
    # sublists without any agent are kept unchanged
    res.append(new_sub)

This gives:

[['Flavouring Agent', 'GELATIN', '76.0 mg', '40 %', 'Gelatin to 100.000 g Table 7 Capsule Quantity per unit flavouring dose Quantity per unit dose Components Nominal mass of capsule 76.0 mg In the cap (40 %) 30.4 mg flavouring agent corresponds to 1 '], 
 ['GELATIN', '45.6 mg', '14.5 %', 'Gelatin including water of a certain percentage'], 
 ['Diluent', 'INK', '76.0 mg', '40 %', 'ink is used as diluent far as this is necessary for the markets. Table 4 Atenolol granules Components mg/capsule Granules Active ingredients Atenolol 50.00']]
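
The same idea can be sketched without the flag variable by letting next() pick the first matching agent. This is only an illustration: the function name prepend_agent and the shortened sample rows are assumptions made for the example, not part of the original question or answer.

```python
def prepend_agent(rows, agents):
    """Return new rows, each prefixed with the first agent found in it (if any)."""
    out = []
    for row in rows:
        # Lower-case the row once; the match is a case-insensitive substring test.
        lowered = [item.lower() for item in row]
        found = next(
            (ag for ag in agents if any(ag.lower() in item for item in lowered)),
            None,
        )
        # Rows without a match are copied unchanged; the input is never mutated.
        out.append([found] + row if found is not None else list(row))
    return out


# Shortened, hypothetical sample data in the shape used in the question.
rows = [
    ["GELATIN", "30.4 mg flavouring agent corresponds to 1"],
    ["GELATIN", "Gelatin including water"],
    ["INK", "ink is used as diluent"],
]
agents = ["Flavouring Agent", "Preservative", "Diluent"]
print(prepend_agent(rows, agents))
```

Because a plain `in` test on strings is case-sensitive, both sides are lower-cased before comparing; without that step none of the capitalised agents would be found in the sample text.
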
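
# Alternative: prepend every agent that occurs in the sublist
new_result = []

for l in result:
    temp_results = []
    for ag in agent:
        # case-insensitive substring test against each item of the sublist
        if any(ag.lower() in item.lower() for item in l):
            temp_results.append(ag)
    new_result.append(temp_results + l)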
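
If "exact match" in the title is meant as a whole-word match, a plain substring test may be too permissive, since 'Diluent' would also be found inside 'diluents'. A possible sketch using word boundaries from the standard re module follows; the helper name find_whole_word_agent and the sample strings are hypothetical and only serve the illustration.

```python
import re


def find_whole_word_agent(row, agents):
    """Return the first agent that occurs in row as a whole phrase, else None."""
    for ag in agents:
        # \b anchors the phrase at word boundaries; re.escape guards special characters.
        pattern = re.compile(r"\b" + re.escape(ag) + r"\b", re.IGNORECASE)
        if any(pattern.search(item) for item in row):
            return ag
    return None


print(find_whole_word_agent(["INK", "ink is used as diluent"], ["Diluent"]))
print(find_whole_word_agent(["INK", "two diluents are used"], ["Diluent"]))
```

The first call finds 'Diluent', while the second returns None because 'diluents' does not end at a word boundary after 'diluent'. Whether that stricter behaviour is wanted depends on the data.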

