How to loop through two list dictionaries to append unique key values to one list dict and continue through other with shared repeating value entries?

I'm having trouble figuring out how to merge unique key values from one list dict to another list dict in which the dictionaries have repeating values.

The two list dicts are generated from different sources that have some repeating "name" values and don't always follow the same index. So there are times when a correlating set of key values appears in one list dict further in its index (or possibly earlier). Also, the two list dicts may not be the same length with one containing more entries than the other.

I've tried a few nested loops, but I can't figure out how to handle simultaneous indexing and restart the loop for entries that have not yet matched. I also get an error at the end because the two list dicts have different lengths.

Here's what I tried:

listOne = [{'name': 'Article 1 series', 'description': 'aaa'},
           {'name': 'Article 2', 'description': 'bbb'},
           {'name': 'Article 1 series', 'description': 'abb'},
           {'name': 'Article 3 series', 'description': 'cccc'}]

listTwo = [{'name': 'Article 1 series', 'link': 'www.google.com'},
           {'name': 'Article 2', 'link': 'www.yahoo.com'},
           {'name': 'Article 3 series', 'link': 'www.bing.com'},
           {'name': 'Article 1 series', 'link': 'www.google.com/test'},
           {'name': 'Article 4', 'link': 'www.duckduckgo.com'}]

firstList = len(listOne)
secondList = len(listTwo)

while listTwo:    
     for i in range(firstList):
         if i <= secondList:
             if listOne[i]["name"] == listTwo[0]["name"]:
                 print("found")
                 listOne.append(listTwo[0]["link"])
             else:
                 continue
         else:
             break
     listTwo.pop(0)

Ultimately, as "name" key values are matched, I'd like to merge the "link" key value into the corresponding listOne entry. Otherwise, look for the previous unmatched or next matching "name" key value. If no matching "name" key value exists in listOne, or all of the entries in listOne have been matched, then stop the loop and drop any remaining entries from listTwo.

So listOne should look like this:

listOne = [{'name': 'Article 1 series', 'description': 'aaa','link': 'www.google.com'},
           {'name': 'Article 2', 'description': 'bbb', 'link': 'www.yahoo.com'},
           {'name': 'Article 1 series', 'description': 'abb', 'link': 'www.google.com/test'},
           {'name': 'Article 3 series', 'description': 'cccc', 'link': 'www.bing.com'}]

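From what I understand, if both lists are in the same order you can simply merge them pairwise. Note that zip pairs entries by position and stops at the end of the shorter list, so entries that are out of order would be merged with the wrong partner:

merged = [{**one, **two} for one, two in zip(listOne, listTwo)]
print(merged)

Which outputs: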

[
   {'name': 'Article 1 series', 'description': 'aaa', 'link': 'www.google.com'},
   {'name': 'Article 2', 'description': 'bbb', 'link': 'www.yahoo.com'}, 
   ...
]
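Since the two lists in the question are not in the same order, matching by "name" rather than by position may be closer to what is wanted. The following is only a sketch of one way to do that, not part of the answer above: it queues the links for each name in the order they appear and hands them out one at a time. The snake_case variable names and the helper dictionary links_by_name are assumptions made for this example.

```python
from collections import defaultdict, deque

list_one = [{'name': 'Article 1 series', 'description': 'aaa'},
            {'name': 'Article 2', 'description': 'bbb'},
            {'name': 'Article 1 series', 'description': 'abb'},
            {'name': 'Article 3 series', 'description': 'cccc'}]

list_two = [{'name': 'Article 1 series', 'link': 'www.google.com'},
            {'name': 'Article 2', 'link': 'www.yahoo.com'},
            {'name': 'Article 3 series', 'link': 'www.bing.com'},
            {'name': 'Article 1 series', 'link': 'www.google.com/test'},
            {'name': 'Article 4', 'link': 'www.duckduckgo.com'}]

# Queue up the links for each name, in the order they appear in list_two.
links_by_name = defaultdict(deque)
for entry in list_two:
    links_by_name[entry['name']].append(entry['link'])

merged = []
for entry in list_one:
    row = dict(entry)  # copy, so list_one itself is not modified
    queue = links_by_name.get(entry['name'])
    if queue:  # skip names with no (remaining) link
        row['link'] = queue.popleft()
    merged.append(row)

print(merged)
```

Each repeated name receives the next unused link for that name, and links whose name never appears in list_one (here 'Article 4') are simply left over.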

If you have huge lists, consider using pandas:

import pandas as pd
one_df = pd.DataFrame(listOne)

OUTPUT:
  description              name
0         aaa  Article 1 series
1         bbb         Article 2
2         abb  Article 1 series
3        cccc  Article 3 series


second_df = pd.DataFrame(listTwo)

OUTPUT:
                  link              name
0       www.google.com  Article 1 series
1        www.yahoo.com         Article 2
2         www.bing.com  Article 3 series
3  www.google.com/test  Article 1 series
4   www.duckduckgo.com         Article 4

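# Assumed missing step: df is the two frames joined on "name"
df = one_df.merge(second_df, on="name")
merged_df = df.groupby("link").first()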

OUTPUT:
                    description              name
link                                             
www.bing.com               cccc  Article 3 series
www.google.com              aaa  Article 1 series
www.google.com/test         aaa  Article 1 series
www.yahoo.com               bbb         Article 2

data = []
for index, row in merged_df.iterrows():
    data_dict = {}
    data_dict["link"] = index  # the index of merged_df holds the link
    data_dict["description"] = row["description"]
    # row.name would return the index label (the link), not the "name" column
    data_dict["name"] = row["name"]
    data.append(data_dict)

OUTPUT of data:
[{'link': 'www.bing.com', 'description': 'cccc', 'name': 'Article 3 series'},
 {'link': 'www.google.com', 'description': 'aaa', 'name': 'Article 1 series'},
 {'link': 'www.google.com/test', 'description': 'aaa', 'name': 'Article 1 series'},
 {'link': 'www.yahoo.com', 'description': 'bbb', 'name': 'Article 2'}]

The main flaw in the code you tried is this line:

listOne.append(listTwo[0]["link"])

You are appending the value of the link to listOne itself.

In other words, you are appending the string www.google.com to the list, whereas you want it stored as a key-value pair inside listOne[i].

Hence, your result looks like this:

[{'name': 'Article 1 series', 'description': 'aaa'}, {'name': 'Article 2', 'description': 'bbb'}, {'name': 'Article 1 series', 'description': 'abb'}, {'name': 'Article 3 series', 'description': 'cccc'}, 'www.google.com', 'www.google.com', 'www.yahoo.com', 'www.bing.com', 'www.google.com/test', 'www.google.com/test']

As seen above, the links are appended to the end of listOne instead of being merged into its dictionaries.

However, if you add the link as a key-value pair to the matching dictionary (listOne[i], with 'link' as the key and the matched listTwo entry's link as the value), and skip entries that already have a link, each entry in listOne gets its own link.
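To make the difference concrete, here is a minimal illustration on a one-entry list. The variable name entries is made up for this example and is not from the question.

```python
entries = [{'name': 'Article 2', 'description': 'bbb'}]

# list.append adds a new element to the end of the list
entries.append('www.yahoo.com')
print(entries)
# [{'name': 'Article 2', 'description': 'bbb'}, 'www.yahoo.com']

entries.pop()  # undo the append

# assigning to a key updates the dictionary stored at that index
entries[0]['link'] = 'www.yahoo.com'
print(entries)
# [{'name': 'Article 2', 'description': 'bbb', 'link': 'www.yahoo.com'}]
```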

Here is the corrected code:

listOne = [{'name': 'Article 1 series', 'description': 'aaa'},
           {'name': 'Article 2', 'description': 'bbb'},
           {'name': 'Article 1 series', 'description': 'abb'},
           {'name': 'Article 3 series', 'description': 'cccc'}]

listTwo = [{'name': 'Article 1 series', 'link': 'www.google.com'},
           {'name': 'Article 2', 'link': 'www.yahoo.com'},
           {'name': 'Article 3 series', 'link': 'www.bing.com'},
           {'name': 'Article 1 series', 'link': 'www.google.com/test'},
           {'name': 'Article 4', 'link': 'www.duckduckgo.com'}]


while listTwo:
    # Look for the first entry in listOne with the same name and no link yet
    for i in range(len(listOne)):
        if 'link' not in listOne[i] and listOne[i]['name'] == listTwo[0]['name']:
            print('found')
            listOne[i]['link'] = listTwo[0]['link']
            break  # each listTwo entry fills at most one listOne entry

    # Drop the entry that was just processed, matched or not
    listTwo.pop(0)
