
Merge dictionaries if one of the object key values is the same in Python

I need help merging some dictionaries based on an object inside the list of dictionaries. Is this possible?

My data:

mongo_data = [{
 'url': 'https://goodreads.com/',
 'variables': [{'key': 'Harry Potter', 'value': '10.0'},
               {'key': 'Discovery of Witches', 'value': '8.5'},],
 'vendor': 'Fantasy' 
 },{
 'url': 'https://goodreads.com/',
 'variables': [{'key': 'Hunger Games', 'value': '10.0'},
               {'key': 'Maze Runner', 'value': '5.5'},],
 'vendor': 'Dystopia' 
 },{
 'url': 'https://kindle.com/',
 'variables': [{'key': 'Twilight', 'value': '5.9'},
               {'key': 'Lord of the Rings', 'value': '9.0'},],
 'vendor': 'Fantasy' 
 },{
 'url': 'https://kindle.com/',
 'variables': [{'key': 'The Handmaids Tale', 'value': '10.0'},
               {'key': 'Divergent', 'value': '9.0'},],
 'vendor': 'Fantasy' 
 }]

My code:

I used itertools.groupby to group items with the same URL together.

from itertools import groupby, chain
import json

searches = []
for key, group in groupby(mongo_data, key=lambda chunk: chunk['url']):
    search = {}
    search["url"] = key
    search["results"] = [{"genre": result["vendor"], "data": result["variables"]} for result in group]
    searches.append(search)

print(json.dumps(searches))

My output:

[
  {
    "url": "https://goodreads.com/",
    "results": [
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "Harry Potter",
            "value": "10.0"
          },
          {
            "key": "Discovery of Witches",
            "value": "8.5"
          }
        ]
      },
      {
        "genre": "Dystopia",
        "data": [
          {
            "key": "Hunger Games",
            "value": "10.0"
          },
          {
            "key": "Maze Runner",
            "value": "5.5"
          }
        ]
      }
    ]
  },
  {
    "url": "https://kindle.com/",
    "results": [
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "Twilight",
            "value": "5.9"
          },
          {
            "key": "Lord of the Rings",
            "value": "9.0"
          }
        ]
      },
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "The Handmaids Tale",
            "value": "10.0"
          },
          {
            "key": "Divergent",
            "value": "9.0"
          }
        ]
      }
    ]
  }
]

As you can see, under https://kindle.com/ the "genre": "Fantasy" entry appears twice. Instead of printing it twice, can I merge the two entries so that the genre appears only once?

So I expect the https://kindle.com/ part of the result to look like this:

[
  {
    "url": "https://kindle.com/",
    "results": [
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "Twilight",
            "value": "5.9"
          },
          {
            "key": "Lord of the Rings",
            "value": "9.0"
          },
          {
            "key": "The Handmaids Tale",
            "value": "10.0"
          },
          {
            "key": "Divergent",
            "value": "9.0"
          }
        ]
      }
    ]
  }
]

Is this possible?

You need a second groupby to group the result by vendor.

For instance:

from itertools import groupby

searches = []
for url, group in groupby(mongo_data, key=lambda chunk: chunk['url']):
    search = {"url": url, "results": []}
    for vendor, group2 in groupby(group, key=lambda chunk2: chunk2['vendor']):
        result = {
            "genre": vendor,
            # Flatten the "variables" lists of every item in this vendor group.
            "data": [variable
                     for result2 in group2
                     for variable in result2["variables"]],
        }
        search["results"].append(result)
    searches.append(search)

The list comprehension flattens the result2["variables"] lists, so "data" is a single list of dictionaries rather than a list of lists.

The result is:

[
 {
  "url": "https://goodreads.com/",
  "results": [
   {
    "genre": "Fantasy",
    "data": [
     {
      "key": "Harry Potter",
      "value": "10.0"
     },
     {
      "key": "Discovery of Witches",
      "value": "8.5"
     }
    ]
   },
   {
    "genre": "Dystopia",
    "data": [
     {
      "key": "Hunger Games",
      "value": "10.0"
     },
     {
      "key": "Maze Runner",
      "value": "5.5"
     }
    ]
   }
  ]
 },
 {
  "url": "https://kindle.com/",
  "results": [
   {
    "genre": "Fantasy",
    "data": [
     {
      "key": "Twilight",
      "value": "5.9"
     },
     {
      "key": "Lord of the Rings",
      "value": "9.0"
     },
     {
      "key": "The Handmaids Tale",
      "value": "10.0"
     },
     {
      "key": "Divergent",
      "value": "9.0"
     }
    ]
   }
  ]
 }
]
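
One caveat with the groupby approach: itertools.groupby only merges consecutive items that share a key, so it relies on mongo_data already being ordered by url and then by vendor, as it happens to be in the question. If that ordering is not guaranteed, a sketch along these lines sorts first. The function name merge_by_url_and_vendor and the small sample list are only for illustration, and note that sorting means the output comes back in alphabetical order of url and vendor rather than in the original order.

```python
from itertools import groupby
from operator import itemgetter


def merge_by_url_and_vendor(documents):
    """Group documents by url, then by vendor, flattening 'variables'."""
    # groupby only merges neighbours, so sort on the same keys first.
    ordered = sorted(documents, key=itemgetter("url", "vendor"))
    searches = []
    for url, by_url in groupby(ordered, key=itemgetter("url")):
        results = []
        for vendor, by_vendor in groupby(by_url, key=itemgetter("vendor")):
            data = [var for doc in by_vendor for var in doc["variables"]]
            results.append({"genre": vendor, "data": data})
        searches.append({"url": url, "results": results})
    return searches


# Illustrative input: the two kindle/Fantasy items are NOT adjacent.
sample = [
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Twilight", "value": "5.9"}]},
    {"url": "https://goodreads.com/", "vendor": "Dystopia",
     "variables": [{"key": "Hunger Games", "value": "10.0"}]},
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Divergent", "value": "9.0"}]},
]
merged = merge_by_url_and_vendor(sample)
```

Because sorted is stable, items within the same url and vendor keep their original relative order.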

You can use this code after your for loop to accomplish what you mentioned:

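from collections import defaultdict

for item in searches:
    # Collect the data of all results that share a genre into one flat list.
    _res = defaultdict(list)
    for r in item['results']:
        # extend (not append) so the lists are merged instead of nested
        _res[r['genre']].extend(r['data'])

    # Replace the original results with the merged ones.
    item['results'] = [{'genre': k, 'data': v} for k, v in _res.items()]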
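
The same defaultdict idea can also be applied directly to the raw list, skipping groupby altogether. The following is only a sketch: the function name merge_with_defaultdict and the sample data are made up for illustration, and it assumes Python 3.7+ so that dictionaries keep insertion order.

```python
from collections import defaultdict


def merge_with_defaultdict(documents):
    """Merge documents by url and vendor without requiring sorted input."""
    # url -> vendor -> flat list of variables, in first-seen order.
    grouped = defaultdict(lambda: defaultdict(list))
    for doc in documents:
        grouped[doc["url"]][doc["vendor"]].extend(doc["variables"])
    return [
        {"url": url,
         "results": [{"genre": vendor, "data": data}
                     for vendor, data in vendors.items()]}
        for url, vendors in grouped.items()
    ]


# Illustrative input: the two kindle/Fantasy items are NOT adjacent.
sample = [
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Twilight", "value": "5.9"}]},
    {"url": "https://goodreads.com/", "vendor": "Dystopia",
     "variables": [{"key": "Hunger Games", "value": "10.0"}]},
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Divergent", "value": "9.0"}]},
]
merged_unsorted = merge_with_defaultdict(sample)
```

Since nothing is sorted here, each url and genre appears in the order it was first seen in the input.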

If you want a "one-liner" for a single URL, try this (it filters on the URL so that items from other sites are not mixed in, and it keeps "data" as a flat list):

url = "https://kindle.com/"
{"url": url, "results": [{"genre": g, "data": [v for x in mongo_data if x['url'] == url and x['vendor'] == g for v in x['variables']]} for g in dict.fromkeys(x['vendor'] for x in mongo_data if x['url'] == url)]}

It yields

{
    'url': 'https://kindle.com/',
    'results': [
        {
            'genre': 'Fantasy',
            'data': [
                {'key': 'Twilight', 'value': '5.9'},
                {'key': 'Lord of the Rings', 'value': '9.0'},
                {'key': 'The Handmaids Tale', 'value': '10.0'},
                {'key': 'Divergent', 'value': '9.0'}
            ]
        }
    ]
}
