Merging nested dictionaries in Python

Background:
I have a list of dictionaries named result. Each dictionary has a single string key (e.g. 'diet') and its value is a unique URL.

Some examples are shown below:

 [{'diet': 'https://www.simplyrecipes.com/recipes/diet/dairy-free/'},
 {'diet': 'https://www.simplyrecipes.com/recipes/diet/gluten-free/'},
 {'diet': 'https://www.simplyrecipes.com/recipes/diet/healthy/'},
 {'diet': 'https://www.simplyrecipes.com/recipes/diet/low_carb/'},
 {'diet': 'https://www.simplyrecipes.com/recipes/diet/paleo/'},
 {'diet': 'https://www.simplyrecipes.com/recipes/diet/vegan/'},
 {'diet': 'https://www.simplyrecipes.com/recipes/diet/vegetarian/'},
 {'main-ingredient': 'https://www.simplyrecipes.com/recipes/main-ingredient/beef/'},
 {'main-ingredient': 'https://www.simplyrecipes.com/recipes/main-ingredient/cheese/'},
 {'main-ingredient': 'https://www.simplyrecipes.com/recipes/main-ingredient/chicken/'},
 {'main-ingredient': 'https://www.simplyrecipes.com/recipes/main-ingredient/egg/'},
 {'main-ingredient': 'https://www.simplyrecipes.com/recipes/main-ingredient/fish/'},
 {'main-ingredient': 'https://www.simplyrecipes.com/recipes/main-ingredient/fish_and_seafood/'}]

I mention this only to give some context on how I'll be using the keys and values: I'm going to write a for loop that iterates through all the values (URLs) and runs the code below, which extracts the ingredients from each recipe's web page:

import requests
from bs4 import BeautifulSoup
from splinter import Browser
from webdriver_manager.chrome import ChromeDriverManager

# Fetch one recipe page and parse its HTML
resp = requests.get("https://www.simplyrecipes.com/recipes/egg_salad_sandwich/")
soup = BeautifulSoup(resp.text, "html.parser")
div_ = soup.find("div", attrs={"class": "recipe-callout"})
# Map the recipe title (words joined by "_") to its list of ingredient strings
recipes = {"_".join(div_.find("h2").text.split()):
               [x.text for x in div_.findAll("li", attrs={"class": "ingredient"})]}

# Start a Chrome browser through splinter
executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path)
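The for loop described above might look roughly like the sketch below. The helper names iter_urls and scrape_all are hypothetical, and scrape_one stands in for whatever function wraps the per-page extraction code; it is passed in as a parameter so the loop itself makes no assumptions about how a page is fetched.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def iter_urls(records: Iterable[Dict[str, str]]) -> Iterator[Tuple[str, str]]:
    """Yield (key, url) for every entry in a list of dicts like `result`."""
    for record in records:
        for key, url in record.items():
            yield key, url


def scrape_all(
    records: Iterable[Dict[str, str]],
    scrape_one: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Call scrape_one(url) once per URL and collect the results keyed by URL.

    scrape_one is a hypothetical callable supplied by the caller, for example
    a function wrapping the requests/BeautifulSoup code shown above.
    """
    return {url: scrape_one(url) for _key, url in iter_urls(records)}
```

Calling scrape_all(result, my_scraper) would then visit every URL in the list in its original order, where my_scraper is whatever extraction function you write.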

Objective:
I would like to merge the dictionaries by key. The example below shows the structure I'm hoping for:

[{'diet': ['https://www.simplyrecipes.com/recipes/diet/allergy-friendly/',
           'https://www.simplyrecipes.com/recipes/diet/dairy-free/',
           'https://www.simplyrecipes.com/recipes/diet/vegetarian/']},
 {'main-ingredient': ['https://www.simplyrecipes.com/recipes/main-ingredient/beef/',
                      'https://www.simplyrecipes.com/recipes/main-ingredient/lamb/',
                      'https://www.simplyrecipes.com/recipes/main-ingredient/chicken/']}]

My code:
So far I have the code below, but I've completely messed it up; it does nothing useful and I don't know where it came from!

master_dict = NestedDict(result)
for i in d:
    path = [i['diet'], i['ingredient']]
    master_dict[path] = i['https:']

If you analyse the site, the page https://www.simplyrecipes.com/recipes links to all the different types of recipes. So scraping that page and formatting the result correctly will give you the desired structure.

import requests
from bs4 import BeautifulSoup
import pprint

res = requests.get("https://www.simplyrecipes.com/recipes")
soup = BeautifulSoup(res.text,"html.parser")

links = {}

for div in soup.find("div", class_="rnav-menus").find_all("div", class_="rnav-menu"):
    recipe_type = div.find("span").get_text(strip=True)
    links[recipe_type] = [i.find("a")["href"] for i in div.find_all("li")]

pprint.pprint(links)

Output:

{'Course': ['https://www.simplyrecipes.com/recipes/course/appetizer/',
            'https://www.simplyrecipes.com/recipes/course/breakfast/',
            'https://www.simplyrecipes.com/recipes/course/brunch/',
            'https://www.simplyrecipes.com/recipes/course/dessert/',
            'https://www.simplyrecipes.com/recipes/course/dinner/',
            'https://www.simplyrecipes.com/recipes/course/drink/',
            'https://www.simplyrecipes.com/recipes/course/lunch/',
            'https://www.simplyrecipes.com/recipes/course/salad/',
            'https://www.simplyrecipes.com/recipes/course/sandwich/',
            'https://www.simplyrecipes.com/recipes/course/side_dish/',
            'https://www.simplyrecipes.com/recipes/course/snack/',
            'https://www.simplyrecipes.com/recipes/course/soup/',
            'https://www.simplyrecipes.com/recipes/course/soup_and_stew/',
            'https://www.simplyrecipes.com/recipes/course/stew/'],
 'Cuisine': ['https://www.simplyrecipes.com/recipes/cuisine/african/',
             'https://www.simplyrecipes.com/recipes/cuisine/basque/',
             'https://www.simplyrecipes.com/recipes/cuisine/belgian/',
             'https://www.simplyrecipes.com/recipes/cuisine/brazilian/',
             'https://www.simplyrecipes.com/recipes/cuisine/british/',
             'https://www.simplyrecipes.com/recipes/cuisine/cajun/',
             'https://www.simplyrecipes.com/recipes/cuisine/cambodian/',
             'https://www.simplyrecipes.com/recipes/cuisine/chinese/',
             'https://www.simplyrecipes.com/recipes/cuisine/cowboy/',
             'https://www.simplyrecipes.com/recipes/cuisine/creole/',
             'https://www.simplyrecipes.com/recipes/cuisine/danish/',
             'https://www.simplyrecipes.com/recipes/cuisine/ethiopian/',
             'https://www.simplyrecipes.com/recipes/cuisine/french/',
             'https://www.simplyrecipes.com/recipes/cuisine/german/',
             'https://www.simplyrecipes.com/recipes/cuisine/greek/',
             'https://www.simplyrecipes.com/recipes/cuisine/hawaiian/',
             'https://www.simplyrecipes.com/recipes/cuisine/hungarian/',
             'https://www.simplyrecipes.com/recipes/cuisine/indian/',
             'https://www.simplyrecipes.com/recipes/cuisine/irish/',
             'https://www.simplyrecipes.com/recipes/cuisine/italian/',
             'https://www.simplyrecipes.com/recipes/cuisine/jamaican/',
             'https://www.simplyrecipes.com/recipes/cuisine/japanese/',
             'https://www.simplyrecipes.com/recipes/cuisine/jewish/',
             'https://www.simplyrecipes.com/recipes/cuisine/korean/',
             'https://www.simplyrecipes.com/recipes/cuisine/latin-american/',
             'https://www.simplyrecipes.com/recipes/cuisine/mediterranean/',
             'https://www.simplyrecipes.com/recipes/cuisine/mexican/',
             'https://www.simplyrecipes.com/recipes/cuisine/mexican_and_tex_mex/',
             'https://www.simplyrecipes.com/recipes/cuisine/middle-eastern/',
             'https://www.simplyrecipes.com/recipes/cuisine/moroccan/',
             'https://www.simplyrecipes.com/recipes/cuisine/new_england/',
             'https://www.simplyrecipes.com/recipes/cuisine/new_orleans/',
             'https://www.simplyrecipes.com/recipes/cuisine/persian/',
             'https://www.simplyrecipes.com/recipes/cuisine/polish/',
             'https://www.simplyrecipes.com/recipes/cuisine/portuguese/',
             'https://www.simplyrecipes.com/recipes/cuisine/provencal/',
             'https://www.simplyrecipes.com/recipes/cuisine/puerto-rican/',
             'https://www.simplyrecipes.com/recipes/cuisine/southern/',
             'https://www.simplyrecipes.com/recipes/cuisine/southwestern/',
             'https://www.simplyrecipes.com/recipes/cuisine/spanish/',
             'https://www.simplyrecipes.com/recipes/cuisine/swedish/',
             'https://www.simplyrecipes.com/recipes/cuisine/texmex/',
             'https://www.simplyrecipes.com/recipes/cuisine/thai/',
             'https://www.simplyrecipes.com/recipes/cuisine/vietnamese/'],
 'Featured': ['https://www.simplyrecipes.com/hub/grill_recipes/',
              'https://www.simplyrecipes.com/hub/best_copycat_recipes_restaurant_favorites/',
              'https://www.simplyrecipes.com/hub/cookbook_club/',
              'https://www.simplyrecipes.com/category/meal-plans/',
              'https://www.simplyrecipes.com/category/eat-your-food/',
              'https://www.simplyrecipes.com/category/cooking-for-two/',
              'https://www.simplyrecipes.com/category/use-it-up/',
              'https://www.simplyrecipes.com/category/editors-picks/',
              'https://www.simplyrecipes.com/category/pantry-power/',
              'https://www.simplyrecipes.com/category/produce-guides/',
              'https://www.simplyrecipes.com/category/equipment-guides/'],
 'Ingredient': ['https://www.simplyrecipes.com/recipes/main-ingredient/beef/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/cheese/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/chicken/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/egg/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/fish/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/fish_and_seafood/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/fruit/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/lamb/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/pasta/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/pork/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/rice/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/seafood/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/turkey/',
                'https://www.simplyrecipes.com/recipes/main-ingredient/vegetables/'],
 'Season': ['https://www.simplyrecipes.com/recipes/season/birthday/',
            'https://www.simplyrecipes.com/recipes/season/christmas/',
            'https://www.simplyrecipes.com/recipes/season/easter/',
            'https://www.simplyrecipes.com/recipes/season/fathers-day/',
            'https://www.simplyrecipes.com/recipes/season/seasonal_favorites_fall/',
            'https://www.simplyrecipes.com/recipes/season/seasonal_favorites_spring/',
            'https://www.simplyrecipes.com/recipes/season/seasonal_favorites_summer/',
            'https://www.simplyrecipes.com/recipes/season/seasonal_favorites_winter/',
            'https://www.simplyrecipes.com/recipes/season/fourth-of-july/',
            'https://www.simplyrecipes.com/recipes/season/game-day/',
            'https://www.simplyrecipes.com/recipes/season/halloween/',
            'https://www.simplyrecipes.com/recipes/season/hanukkah/',
            'https://www.simplyrecipes.com/recipes/season/holiday/',
            'https://www.simplyrecipes.com/recipes/season/lent/',
            'https://www.simplyrecipes.com/recipes/season/mardi-gras/',
            'https://www.simplyrecipes.com/recipes/season/mothers_day/',
            'https://www.simplyrecipes.com/recipes/season/new-years-day/',
            'https://www.simplyrecipes.com/recipes/season/passover/',
            'https://www.simplyrecipes.com/recipes/season/st_patricks_day/',
            'https://www.simplyrecipes.com/recipes/season/super_bowl/',
            'https://www.simplyrecipes.com/recipes/season/thanksgiving/',
            'https://www.simplyrecipes.com/recipes/season/valentines_day/'],
 'Special Diets': ['https://www.simplyrecipes.com/recipes/diet/allergy-friendly/',
                   'https://www.simplyrecipes.com/recipes/diet/dairy-free/',
                   'https://www.simplyrecipes.com/recipes/diet/gluten-free/',
                   'https://www.simplyrecipes.com/recipes/diet/healthy/',
                   'https://www.simplyrecipes.com/recipes/diet/low_carb/',
                   'https://www.simplyrecipes.com/recipes/diet/paleo/',
                   'https://www.simplyrecipes.com/recipes/diet/vegan/',
                   'https://www.simplyrecipes.com/recipes/diet/vegetarian/'],
 'Type': ['https://www.simplyrecipes.com/recipes/type/1-pot/',
          'https://www.simplyrecipes.com/recipes/type/air-fryer/',
          'https://www.simplyrecipes.com/recipes/type/bbq/',
          'https://www.simplyrecipes.com/recipes/type/baking/',
          'https://www.simplyrecipes.com/recipes/type/budget/',
          'https://www.simplyrecipes.com/recipes/type/candy/',
          'https://www.simplyrecipes.com/recipes/type/canning/',
          'https://www.simplyrecipes.com/recipes/type/casserole/',
          'https://www.simplyrecipes.com/recipes/type/comfort_food/',
          'https://www.simplyrecipes.com/recipes/type/condiment/',
          'https://www.simplyrecipes.com/recipes/type/cookie/',
          'https://www.simplyrecipes.com/recipes/type/deep_fried/',
          'https://www.simplyrecipes.com/recipes/type/dip/',
          'https://www.simplyrecipes.com/recipes/type/freezer-friendly/',
          'https://www.simplyrecipes.com/recipes/type/grill/',
          'https://www.simplyrecipes.com/recipes/type/how_to/',
          'https://www.simplyrecipes.com/recipes/type/instant-pot/',
          'https://www.simplyrecipes.com/recipes/type/jams_and_jellies/',
          'https://www.simplyrecipes.com/recipes/type/kidfriendly/',
          'https://www.simplyrecipes.com/recipes/type/make-ahead/',
          'https://www.simplyrecipes.com/recipes/type/microwave/',
          'https://www.simplyrecipes.com/recipes/type/pantry-meal/',
          'https://www.simplyrecipes.com/recipes/type/pressure-cooker/',
          'https://www.simplyrecipes.com/recipes/type/quick/',
          'https://www.simplyrecipes.com/recipes/type/restaurant_favorite/',
          'https://www.simplyrecipes.com/recipes/type/salsa/',
          'https://www.simplyrecipes.com/recipes/type/sauce/',
          'https://www.simplyrecipes.com/recipes/type/sheet-pan-dinner/',
          'https://www.simplyrecipes.com/recipes/type/skillet-recipe/',
          'https://www.simplyrecipes.com/recipes/type/slow_cooker/',
          'https://www.simplyrecipes.com/recipes/type/sous-vide/',
          'https://www.simplyrecipes.com/recipes/type/stirfry/']}
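The output above is keyed by the menu headings ('Special Diets', 'Ingredient', and so on), whereas the question uses keys such as 'diet' and 'main-ingredient'. Those keys appear as the second path segment of each URL, so one possible way to recover them is to group the URLs by that segment. The sketch below assumes URLs of the form /recipes/<category>/<name>/ and skips anything else (for example the 'Featured' links); group_by_category is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import urlparse


def group_by_category(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Group URLs shaped like /recipes/<category>/<name>/ by <category>."""
    grouped = defaultdict(list)
    for url in urls:
        # "/recipes/diet/vegan/" -> ["recipes", "diet", "vegan"]
        segments = [s for s in urlparse(url).path.split("/") if s]
        if len(segments) >= 3 and segments[0] == "recipes":
            grouped[segments[1]].append(url)
    return dict(grouped)
```

With the links dictionary from the answer, you could flatten all the lists first, for example group_by_category(u for group in links.values() for u in group).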

I'm sure there's a concise FP approach, but this does the trick:

import collections

data = …
dd = collections.defaultdict(list)

for record in data:
    for key, value in record.items():
        dd[key].append(value)

When you run this on your data, you get:

>>> from pprint import pprint
>>> pprint(dict(dd))
{'diet': ['https://www.simplyrecipes.com/recipes/diet/dairy-free/',
          'https://www.simplyrecipes.com/recipes/diet/gluten-free/',
          'https://www.simplyrecipes.com/recipes/diet/healthy/',
          'https://www.simplyrecipes.com/recipes/diet/low_carb/',
          'https://www.simplyrecipes.com/recipes/diet/paleo/',
          'https://www.simplyrecipes.com/recipes/diet/vegan/',
          'https://www.simplyrecipes.com/recipes/diet/vegetarian/'],
 'main-ingredient': ['https://www.simplyrecipes.com/recipes/main-ingredient/beef/',
                     'https://www.simplyrecipes.com/recipes/main-ingredient/cheese/',
                     'https://www.simplyrecipes.com/recipes/main-ingredient/chicken/',
                     'https://www.simplyrecipes.com/recipes/main-ingredient/egg/',
                     'https://www.simplyrecipes.com/recipes/main-ingredient/fish/',
                     'https://www.simplyrecipes.com/recipes/main-ingredient/fish_and_seafood/']}

(The dict(…) part isn't strictly necessary, because a defaultdict is a dict, but it looks cleaner when we print it.)
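As for the more functional approach mentioned above, one possible version (a sketch, not necessarily what the answerer had in mind) flattens the records into (key, value) pairs, sorts them by key, and groups them with itertools.groupby. The function name merge_by_key is hypothetical. Because sorted is stable, the URLs keep their original order within each key, although the keys themselves come out in alphabetical order.

```python
from itertools import chain, groupby
from operator import itemgetter
from typing import Dict, Iterable, List


def merge_by_key(records: Iterable[Dict[str, str]]) -> Dict[str, List[str]]:
    """Merge a list of dicts into one dict mapping each key to a list of values."""
    # Flatten every record into (key, value) pairs, then sort by key only,
    # because groupby only groups adjacent items with equal keys.
    pairs = sorted(
        chain.from_iterable(record.items() for record in records),
        key=itemgetter(0),
    )
    return {
        key: [value for _key, value in group]
        for key, group in groupby(pairs, key=itemgetter(0))
    }
```

The defaultdict loop is likely easier to read and avoids the sort, so this variant is mostly a matter of taste.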

This is another option, assuming your list of dicts is called result:

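# Start with an empty list for every key that appears in any record
res = {k: [] for k in set(key for x in result for key in x)}

for obj in result:
    for key in res:
        # Test membership rather than truthiness so falsy values are kept too
        if key in obj:
            res[key].append(obj[key])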

print(res)
