
Merging different dictionaries in Python

This is a long question, so please bear with me. I start out with 3 dicts obtained from 3 APIs. The dicts are structured like so:

API1 = {'results':[{'url':'www.site.com','title':'A great site','snippet':'This is a great site'},
{'url':'www.othersite.com','title':'Another site','snippet':'This is another site'},
{'url':'www.wiki.com','title':'A wiki site','snippet':'This is a wiki site'}]}

API2 = {'hits':[{'url':'www.dol.com','title':'The DOL site','snippet':'This is the dol site'},
{'url':'www.othersite.com','title':'Another site','snippet':'This is another site'},
{'url':'www.whatever.com','title':'Whatever site','snippet':'This is a site about whatever'}]}

API3 = {'output':[{'url':'www.dol.com','title':'The DOL site','snippet':'This is the dol site'},
{'url':'www.whatever.com','title':'Whatever site','snippet':'This is a site about whatever'},
{'url':'www.wiki.com','title':'A wiki site','snippet':'This is a wiki site'}]}

I extract the URLs from API1, API2 and API3 to do some processing. I do this because there is quite a bit of processing to be done and only the URLs are needed. When finished, I have a list of the URLs with the duplicates removed, and another list of scores in which each score corresponds to the URL at the same position:
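
The extraction and de-duplication step described above might look roughly like the sketch below. It assumes each response keeps its list under a single top-level key, as in the sample data. The helper name `unique_urls` and the shortened sample data are illustrative only, and the scoring step is left out because the question does not describe it.

```python
# Shortened sample data in the same shape as the question's responses.
API1 = {'results': [{'url': 'www.site.com', 'title': 'A great site'},
                    {'url': 'www.othersite.com', 'title': 'Another site'}]}
API2 = {'hits': [{'url': 'www.dol.com', 'title': 'The DOL site'},
                 {'url': 'www.othersite.com', 'title': 'Another site'}]}


def unique_urls(*apis):
    """Collect URLs from every response, keeping the first occurrence of each."""
    seen = {}
    for api in apis:
        for entries in api.values():   # the list under 'results', 'hits', ...
            for entry in entries:
                seen.setdefault(entry['url'], None)
    return list(seen)                  # dicts keep insertion order (Python 3.7+)


url_list = unique_urls(API1, API2)
print(url_list)
```

A plain `set` would also remove duplicates, but it would not keep the order in which the URLs were first seen.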

URLlist = ['www.site.com','www.wiki.com','www.othersite.com','www.dol.com','www.whatever.com']

Results = [1.2, 6.5, 3.5, 2.1, 4.0]

What I have done is create a new dictionary from these 2 lists using the zip() function:

ScoredResults = dict(zip(URLlist,Results))

{'www.site.com':1.2,'www.wiki.com':6.5, 'www.othersite.com':3.5, 'www.dol.com':2.1, 'www.whatever.com':4.0}
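
Note that zip() pairs items purely by position and stops at the shorter input, so a missing score would go unnoticed. A minimal sketch of a guarded version follows; the helper name `pair_scores` is made up for illustration and the data is a shortened copy of the lists above.

```python
url_list = ['www.site.com', 'www.wiki.com', 'www.othersite.com']
scores = [1.2, 6.5, 3.5]


def pair_scores(urls, values):
    """Build {url: score}, refusing lists of different lengths."""
    if len(urls) != len(values):
        raise ValueError("expected one score per URL, got %d URLs and %d scores"
                         % (len(urls), len(values)))
    return dict(zip(urls, values))


scored = pair_scores(url_list, scores)
print(scored)
```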

Now what I need to do is link the URLs from ScoredResults with API1, API2 or API3 so that I have a new dictionary like so:

FullResults =
{'www.site.com':{'title':'A great site','snippet':'This is a great site','score':1.2},
 'www.othersite.com':{'title':'Another site','snippet':'This is another site','score':3.5},
...}

This is too difficult for me to do. If you look back on my question history, I have been asking numerous dictionary questions, but no implementation has worked so far. If anyone could please point me in the right direction, I would very much appreciate it.

I would transform the API responses into something that is more meaningful for you. A dict keyed by URL is probably more appropriate:

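def transform_API(API):
    # Each response keeps its list under a different key: 'results', 'hits' or 'output'.
    list_of_dict = API.get('results', API.get('hits', API.get('output')))
    if list_of_dict is None:
        raise KeyError("results, hits or output not in API")
    d = {}
    for dct in list_of_dict:
        # Copy every field except 'url' so the original response is not modified.
        d[dct['url']] = {k: v for k, v in dct.items() if k != 'url'}
    return d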

API1=transform_API(API1)
API2=transform_API(API2)
API3=transform_API(API3)

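master = {}
for d in (API1, API2, API3):
    master.update(d)  # for a URL present in several APIs, the last one wins

urls = list(master.keys())
# get_scores_from_urls is not defined here; it stands in for the question's scoring step.
scores = get_scores_from_urls(urls)

for k, score in zip(urls, scores):
    master[k]['score'] = score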
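
The loop above pairs URLs with scores by position. If the scores are already available as the ScoredResults dictionary from the question, looking them up by URL avoids relying on list order. The following is only a sketch under that assumption: `index_by_url` is a hypothetical helper, the data is shortened, and each response is assumed to have exactly one top-level key.

```python
API1 = {'results': [{'url': 'www.site.com', 'title': 'A great site',
                     'snippet': 'This is a great site'}]}
API2 = {'hits': [{'url': 'www.dol.com', 'title': 'The DOL site',
                  'snippet': 'This is the dol site'}]}
ScoredResults = {'www.site.com': 1.2, 'www.dol.com': 2.1}


def index_by_url(api):
    """Return {url: fields-without-url} for the single list inside a response."""
    (entries,) = api.values()   # assumes exactly one top-level key
    return {e['url']: {k: v for k, v in e.items() if k != 'url'}
            for e in entries}


master = {}
for api in (API1, API2):
    master.update(index_by_url(api))

for url, info in master.items():
    info['score'] = ScoredResults[url]   # lookup by key, not by position

print(master)
```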

With the given data:

Full_Results = {
    d['url']: {'title': d['title'],
               'snippet': d['snippet'],
               'score': ScoredResults[d['url']]}
    for d in API1['results'] + API2['hits'] + API3['output']
}

which results in:

{'www.dol.com': {'score': 2.1,
  'snippet': 'This is the dol site',
  'title': 'The DOL site'},
 'www.othersite.com': {'score': 3.5,
  'snippet': 'This is another site',
  'title': 'Another site'},
 'www.site.com': {'score': 1.2,
  'snippet': 'This is a great site',
  'title': 'A great site'},
 'www.whatever.com': {'score': 4.0,
  'snippet': 'This is a site about whatever',
  'title': 'Whatever site'},
 'www.wiki.com': {'score': 6.5,
  'snippet': 'This is a wiki site',
  'title': 'A wiki site'}}
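
The comprehension above looks up every URL in ScoredResults, so a URL that was never scored would raise a KeyError. With the question's data that cannot happen, but if it could, one option is to filter such entries out. The sketch below uses shortened data, and the entry for `www.unscored.com` is invented purely to show the effect.

```python
API1 = {'results': [{'url': 'www.site.com', 'title': 'A great site',
                     'snippet': 'This is a great site'}]}
# Hypothetical entry with no score, added only for illustration.
API3 = {'output': [{'url': 'www.unscored.com', 'title': 'No score',
                    'snippet': 'This URL was never scored'}]}
ScoredResults = {'www.site.com': 1.2}

entries = API1['results'] + API3['output']
full_results = {
    d['url']: {'title': d['title'],
               'snippet': d['snippet'],
               'score': ScoredResults[d['url']]}
    for d in entries
    if d['url'] in ScoredResults   # skip URLs that have no score
}

print(full_results)
```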

A quick attempt:

from itertools import chain

full_result = {}

# Each API dict has a single value: its list of result dicts.
for results in chain.from_iterable(d.values() for d in (API1, API2, API3)):
    for d in results:
        full_result[d['url']] = {
            'title': d['title'],
            'snippet': d['snippet'],
            'score': ScoredResults[d['url']]
        }

print(full_result)

Would something like this work for you? It's rather basic: it constructs your final dictionary by looping over URLlist.

API1r = API1['results']
API2r = API2['hits']
API3r = API3['output']

FullResults = {}
for U, R in zip(URLlist, Results):
    FullResults[U] = {}
    for api in (API1r, API2r, API3r):
        for v in api:
            k = dict(v)  # copy, so pop() does not alter the API data
            if k.pop('url') == U:
                FullResults[U].update(k, score=R)

Note that the same URL may be present in your different APIs but with different information, so we need to create the corresponding entry in FullResults beforehand, which makes it a bit tricky to simplify the loops. Let me know how it works.
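
If scanning every API once per URL becomes slow, a single pass that merges fields as it goes is one possible alternative. This is a sketch, not a drop-in replacement: it reads scores from the question's ScoredResults dictionary rather than from the two lists, and the sample entries deliberately carry different fields for the same URL, which is invented data to illustrate the merge.

```python
# Invented data: the same URL appears twice with different fields.
API1 = {'results': [{'url': 'www.wiki.com', 'title': 'A wiki site'}]}
API3 = {'output': [{'url': 'www.wiki.com', 'snippet': 'This is a wiki site'}]}
ScoredResults = {'www.wiki.com': 6.5}

full_results = {}
for api in (API1, API3):
    for entries in api.values():           # the list under 'results', 'output', ...
        for entry in entries:
            url = entry['url']
            if url not in ScoredResults:   # ignore URLs without a score
                continue
            # Create the entry on first sight, then merge in this API's fields.
            merged = full_results.setdefault(url, {'score': ScoredResults[url]})
            merged.update((k, v) for k, v in entry.items() if k != 'url')

print(full_results)
```

When two APIs disagree on the same field, the value from the API processed last is kept, which matches the behaviour of dict.update().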
