
Nested Counter for JSON data

I have JSON data like this:

{
    "persons": [
        {
            "city": "Seattle",
            "name": "Brian",
            "dob": "19-03-1980"
        },
        {
            "city": "Amsterdam",
            "name": "David",
            "dob": "19-09-1979"
        },
        {
            "city": "London",
            "name": "Joe",
            "dob": "19-01-1980"
        },
        {
            "city": "Kathmandu",
            "name": "Brian",
            "dob": "19-03-1980"
        }
    ]
}

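How can I count individual elements, such as the number of people born in each month from January to December (0 if nobody was born in that month) and the number of people born in a given year, using Python in a single iteration? I also need the number of unique names registered in each month. For example: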

1980 :3
--Jan:1
--Mar:2
1979 :1
--Sep:1

Names:

Mar 1980: 1 #Brian is same for both cities 
Jan 1980: 1
Sep 1979: 1

counters_mon is a counter holding the values for the specific months of a year:

for k_mon,v_mon in counters_mon.items():
    print('{}={}'.format(k_mon,v_mon))  

But I also want to print the details. How can I achieve this?

import json

# Load the JSON document; the path is a placeholder
with open('/path/to/your/json', 'r') as f:
    persons = json.load(f)

years_months = {}        # year -> {'count': total, 'months': {month: count}}
years_months_names = {}  # 'MM YYYY' -> set of unique names

for person in persons['persons']:
    # dob is formatted as DD-MM-YYYY
    year = person['dob'][-4:]
    month = person['dob'][3:5]
    month_year = month + ' ' + year
    name = person['name']

    # Create the year entry the first time it is seen, then bump both counters
    if year not in years_months:
        years_months[year] = {'count': 0, 'months': {}}
    years_months[year]['count'] += 1
    months = years_months[year]['months']
    months[month] = months.get(month, 0) + 1

    # Track the unique names per month and year
    if month_year not in years_months_names:
        years_months_names[month_year] = set()
    years_months_names[month_year].add(name)

for k, v in years_months.items():
    print(k + ': ' + str(v['count']))
    for month, count in v['months'].items():
        print("-- " + str(month) + ": " + str(count))
for k, v in years_months_names.items():
    print(k + ": " + str(len(v)))

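I assume you have the path to your JSON file. I also tested my answer against the JSON you posted; note that the JSON must be well-formed for this to work.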
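Since the JSON in the question was posted with missing commas, a quick well-formedness check before counting can save some confusion. The following is only a minimal sketch: the helper name `load_persons` is hypothetical and not part of the answer above, and it relies on nothing beyond `json.loads` and `json.JSONDecodeError` from the standard library.

```python
import json

def load_persons(text):
    """Parse JSON text and return its 'persons' list (hypothetical helper)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        # err.lineno and err.colno point at where the parser gave up
        raise ValueError('malformed JSON at line {}, column {}: {}'.format(
            err.lineno, err.colno, err.msg)) from err
    return data['persons']

good = '{"persons": [{"city": "Seattle", "name": "Brian", "dob": "19-03-1980"}]}'
# Same record with the comma after "Brian" missing, as in the posted data
bad = '{"persons": [{"city": "Seattle", "name": "Brian" "dob": "19-03-1980"}]}'

print(len(load_persons(good)))
try:
    load_persons(bad)
except ValueError as err:
    print(err)
```

The error message reports the position where parsing failed, which is usually close to the missing comma or bracket.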

This is a good use case for defaultdicts ( https://docs.python.org/3/library/collections.html#collections.defaultdict ).

# assume the parsed JSON is already in a variable called data

from collections import defaultdict
from calendar import month_abbr

# slightly strange construction here, but we want two levels of defaultdict followed by lists
aggregate = defaultdict(lambda: defaultdict(list))

# then the population is super simple - you'll end up with something like
# aggregate[year][month] = [name1, name2]
for person in data['persons']:
    day, month, year = map(int, person['dob'].split('-'))
    aggregate[year][month].append(person['name'])


# I'm sorting in chronological order for printing
for year, months in sorted(aggregate.items()):
    print('{}: {}'.format(year, sum(len(names) for names in months.values())))
    for month, names in sorted(months.items()):
        print('--{}: {}'.format(month_abbr[month], len(names)))

for year, months in sorted(aggregate.items()):
    for month, names in sorted(months.items()):
        print('{} {}: {}'.format(month_abbr[month], year, len(set(names))))
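The `split('-')` parsing above trusts that every `dob` really is a `DD-MM-YYYY` string. If that is not guaranteed, one option is to parse with `datetime.strptime`, which raises `ValueError` on malformed dates. This is only a sketch of that alternative, not part of the original answer; the sample `data` is copied from the question so the snippet runs on its own.

```python
from collections import defaultdict
from datetime import datetime

# Sample data from the question, inlined so the sketch is self-contained
data = {"persons": [
    {"city": "Seattle", "name": "Brian", "dob": "19-03-1980"},
    {"city": "Amsterdam", "name": "David", "dob": "19-09-1979"},
    {"city": "London", "name": "Joe", "dob": "19-01-1980"},
    {"city": "Kathmandu", "name": "Brian", "dob": "19-03-1980"},
]}

aggregate = defaultdict(lambda: defaultdict(list))
for person in data['persons']:
    # strptime validates the date (it rejects month 13, day 32, and so on)
    born = datetime.strptime(person['dob'], '%d-%m-%Y')
    aggregate[born.year][born.month].append(person['name'])

for year, months in sorted(aggregate.items()):
    for month, names in sorted(months.items()):
        print(year, month, names)
```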

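Depending on how the data will be used, I would actually consider not using complex nesting in the aggregate and instead opt for something like aggregate[(year, month)] = [name1, name2, ...]. I find that the more nested my data is, the more confusing it is to work with.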
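As a rough illustration of that flatter layout, the sketch below keys a single `defaultdict` by `(year, month)` tuples. The variable name `flat` and the inlined sample `data` (taken from the question) are only for illustration.

```python
from collections import defaultdict
from calendar import month_abbr

# Sample data from the question, inlined so the sketch is self-contained
data = {"persons": [
    {"city": "Seattle", "name": "Brian", "dob": "19-03-1980"},
    {"city": "Amsterdam", "name": "David", "dob": "19-09-1979"},
    {"city": "London", "name": "Joe", "dob": "19-01-1980"},
    {"city": "Kathmandu", "name": "Brian", "dob": "19-03-1980"},
]}

flat = defaultdict(list)   # (year, month) -> list of names
for person in data['persons']:
    day, month, year = map(int, person['dob'].split('-'))
    flat[(year, month)].append(person['name'])

# Tuples sort by year first, then month, so sorted() yields chronological order
for (year, month), names in sorted(flat.items()):
    print('{} {}: {} born, {} unique'.format(
        month_abbr[month], year, len(names), len(set(names))))
```

With one flat dictionary, the per-month report is a single loop; per-year totals would need a second pass or a separate counter.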

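EDIT: Alternatively, you can create several structures on the first pass, which simplifies the printing step. Again, I'm using defaultdict to clean up all the provisioning.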

agg_years = defaultdict(lambda:defaultdict(int))   # [year][month] = counter
agg_years_total = defaultdict(int)   # [year] = counter
agg_months_names = defaultdict(set)   # [(year, month)] = set(name1, name2...)

for person in data['persons']:
    day, month, year = map(int, person['dob'].split('-'))

    agg_years[year][month] += 1
    agg_years_total[year] += 1
    agg_months_names[(year, month)].add(person['name'])


for year, months in sorted(agg_years.items()):
    print('{}: {}'.format(year, agg_years_total[year]))
    for month, quant in sorted(months.items()):
        print('--{}: {}'.format(month_abbr[month], quant))

for (year, month), names in sorted(agg_months_names.items()):
    print('{} {}: {}'.format(month_abbr[month], year, len(names)))
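The question also asks for months in which nobody was born to show up as 0. The loops above only visit months that occur in the data, so those months are not printed. One possible way to include them, sketched here with the question's sample data inlined, is to iterate over all twelve month numbers and rely on `Counter` returning 0 for missing keys. The names `per_month` and `report` are hypothetical.

```python
from collections import Counter
from calendar import month_abbr

# Sample data from the question, inlined so the sketch is self-contained
data = {"persons": [
    {"city": "Seattle", "name": "Brian", "dob": "19-03-1980"},
    {"city": "Amsterdam", "name": "David", "dob": "19-09-1979"},
    {"city": "London", "name": "Joe", "dob": "19-01-1980"},
    {"city": "Kathmandu", "name": "Brian", "dob": "19-03-1980"},
]}

per_month = Counter()   # (year, month) -> number of births
for person in data['persons']:
    day, month, year = map(int, person['dob'].split('-'))
    per_month[(year, month)] += 1

def report(counter):
    """Return report lines covering Jan to Dec for every year, zero-filled."""
    lines = []
    for year in sorted({y for y, _ in counter}):
        total = sum(n for (y, _), n in counter.items() if y == year)
        lines.append('{}: {}'.format(year, total))
        for month in range(1, 13):
            # Counter returns 0 for unseen keys without inserting them
            lines.append('--{}: {}'.format(month_abbr[month], counter[(year, month)]))
    return lines

print('\n'.join(report(per_month)))
```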

