
Python dictionary list unique items

I have the following input:

 [{"ip": "1.2.3.4", "bytes": 10}, 
  {"ip": "2.3.4.10", "bytes": 10}, 
  {"ip": "5.6.2.3", "bytes": 10},
  {"ip": "1.2.3.4", "bytes": 20}, 
  {"ip": "1.2.3.4", "bytes": 5}, 
  {"ip": "2.3.4.10", "bytes": 1},
  {"ip": "10.20.30.40", "bytes": 0}, 
  {"ip": "0.0.0.0", "bytes": 10}, 
  {"ip": "2.3.4.10", "bytes": 6}]

The output should contain each IP address once, with the bytes summed across its duplicate entries:

[{'ip': '0.0.0.0', 'bytes': 10}, 
 {'ip': '10.20.30.40', 'bytes': 0}, 
 {'ip': '2.3.4.10', 'bytes': 17}, 
 {'ip': '5.6.2.3', 'bytes': 10}, 
 {'ip': '1.2.3.4', 'bytes': 35}]

I wrote this code in Python:

import json
logs = """[{"ip": "1.2.3.4", "bytes": 10}, {"ip": "2.3.4.10", "bytes": 10}, {"ip": "5.6.2.3", "bytes": 10},
           {"ip": "1.2.3.4", "bytes": 20}, {"ip": "1.2.3.4", "bytes": 5}, {"ip": "2.3.4.10", "bytes": 1},
           {"ip": "10.20.30.40", "bytes": 0}, {"ip": "0.0.0.0", "bytes": 10}, {"ip": "2.3.4.10", "bytes": 6}]"""

logs_json = json.loads(logs)

# Collect the distinct IP addresses.
ips_unique = set(log.get("ip") for log in logs_json)

# Start every IP with a byte count of zero.
ip_unique_list = []
for ip in ips_unique:
    ip_dict = {"ip": ip, "bytes": 0}
    ip_unique_list.append(ip_dict)

# For each unique IP, scan the whole log and add up its bytes.
for ip_unique_sep in ip_unique_list:
    for log in logs_json:
        if log["ip"] == ip_unique_sep["ip"]:
            ip_unique_sep["bytes"] += log["bytes"]

print(ip_unique_list)

Is there a better, more efficient way to achieve the same result?

You can use a defaultdict initialized with int (so missing keys start at zero), then loop over the parsed input once and sum the bytes for each IP. Something like:

from collections import defaultdict

result = defaultdict(int)  # missing IPs default to 0
for item in logs_json:
    result[item['ip']] += item['bytes']
print(dict(result))  # maps each IP to its total bytes
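The defaultdict approach yields a mapping from IP to total bytes rather than the list of dictionaries shown in the question. A minimal sketch of the full round trip, assuming the sample input from the question and the illustrative names `totals` and `result` (not part of the original answer), could look like this:

```python
import json
from collections import defaultdict

# Sample input copied from the question.
logs = """[{"ip": "1.2.3.4", "bytes": 10}, {"ip": "2.3.4.10", "bytes": 10}, {"ip": "5.6.2.3", "bytes": 10},
           {"ip": "1.2.3.4", "bytes": 20}, {"ip": "1.2.3.4", "bytes": 5}, {"ip": "2.3.4.10", "bytes": 1},
           {"ip": "10.20.30.40", "bytes": 0}, {"ip": "0.0.0.0", "bytes": 10}, {"ip": "2.3.4.10", "bytes": 6}]"""
logs_json = json.loads(logs)

# Single pass over the records: missing IPs start at 0.
totals = defaultdict(int)
for item in logs_json:
    totals[item["ip"]] += item["bytes"]

# Convert the mapping back into the list-of-dicts shape asked for.
result = [{"ip": ip, "bytes": total} for ip, total in totals.items()]
print(result)
```

Because dictionaries preserve insertion order in Python 3.7+, the IPs appear in the order they were first seen in the input. This visits each record once, whereas the nested loops in the question scan the whole log once per unique IP.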

Using a simple loop:

out = {}
for d in logs_json:
    if d['ip'] in out:
        out[d['ip']]['bytes'] += d['bytes']
    else:
        out[d['ip']] = d.copy()

result = list(out.values())

Output:

[{'ip': '1.2.3.4', 'bytes': 35},
 {'ip': '2.3.4.10', 'bytes': 17},
 {'ip': '5.6.2.3', 'bytes': 10},
 {'ip': '10.20.30.40', 'bytes': 0},
 {'ip': '0.0.0.0', 'bytes': 10}]
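
One detail of the simple loop worth spelling out is the call to d.copy(): it stores a copy of the first record seen for each IP, so the later += does not alter the caller's input list. A small sketch illustrating this, where the function name `merge_by_ip` and the two-record sample are hypothetical and chosen only for illustration:

```python
def merge_by_ip(records):
    """Sum 'bytes' per 'ip' without modifying the input records."""
    out = {}
    for d in records:
        if d["ip"] in out:
            out[d["ip"]]["bytes"] += d["bytes"]
        else:
            # Copy the record so the caller's dict is left untouched.
            out[d["ip"]] = d.copy()
    return list(out.values())


# Hypothetical two-record sample, not the full input from the question.
records = [{"ip": "1.2.3.4", "bytes": 10}, {"ip": "1.2.3.4", "bytes": 20}]
merged = merge_by_ip(records)
print(merged)      # [{'ip': '1.2.3.4', 'bytes': 30}]
print(records[0])  # {'ip': '1.2.3.4', 'bytes': 10}
```

Storing d itself instead of d.copy() would make the first input record accumulate the total, which is probably not what you want if the parsed logs are reused later.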
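
Another option is to initialize every IP's total to 0 with dict.fromkeys, then accumulate:

out = dict.fromkeys((x['ip'] for x in logs_json), 0)
for x in logs_json:
    out[x['ip']] += x['bytes']

# 'total' avoids shadowing the built-in name 'bytes'
out = [{'ip': ip, 'bytes': total} for ip, total in out.items()]
print(out)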

Output:

[{'ip': '1.2.3.4', 'bytes': 35}, 
{'ip': '2.3.4.10', 'bytes': 17}, 
{'ip': '5.6.2.3', 'bytes': 10}, 
{'ip': '10.20.30.40', 'bytes': 0}, 
{'ip': '0.0.0.0', 'bytes': 10}]
