Python loop a dictionary with list values

I have the following dataframe:

data = {'state': ['Rome', 'Venice', 'NY', 'Boston', 'London', 'Bristol'],
        'year': [2000, 2001, 2002, 2001, 2003, 2003],
        'number': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}

df = pd.DataFrame(data)

and created a dictionary as shown below:

dic = {
    'it':['Rome', 'Venice'], 
    'UK':['London', 'Bristol'],
    'US':['NY', 'Boston']
}

Is there a way to iterate through the dictionary, group by year, find the mean number, and create new dataframes named after the keys in the dictionary?

I have tried something like this, but it's not working:

for x, y in dic.items():
    x = df[df['state'].isin(y)].groupby(['year'], as_index=False)['numer'].mean()

For example, the expected output for UK would be the below:

UK

    year    number
0   2003    3.05

Your code is almost correct. There is a typo ('numer' should be 'number'), and assigning to the loop variable x only rebinds it on each iteration, so nothing is kept. Store the results in a dictionary instead:

import pandas as pd

data = {'state': ['Rome', 'Venice', 'NY', 'Boston', 'London', 'Bristol'],
    'year': [2000, 2001, 2002, 2001, 2003, 2003],
    'number': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}

dic = {
    'it':['Rome', 'Venice'],
    'UK':['London', 'Bristol'],
    'US':['NY', 'Boston']
}

df = pd.DataFrame(data)

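# Collect one aggregated DataFrame per key of dic.
out = {}
for x, y in dic.items():
    out[x] = df[df['state'].isin(y)].groupby(['year'], as_index=False)['number'].mean()

# Use a separate name for the loop variable so the original df is not overwritten.
for country, result in out.items():
    print(country)
    print(result)
    print('-' * 80)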

Prints:

it
   year  number
0  2000     1.5
1  2001     1.7
--------------------------------------------------------------------------------
UK
   year  number
0  2003    3.05
--------------------------------------------------------------------------------
US
   year  number
0  2001     2.4
1  2002     3.6
--------------------------------------------------------------------------------
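
Once the results live in a dictionary, a single group can be looked up by its key rather than through a separately named variable. The following is a minimal sketch of that idea, assuming the same data and dic as in the question; the helper name mean_by_year is hypothetical and only used for illustration, and the dict comprehension is just a compact form of the loop above.

```python
import pandas as pd

data = {'state': ['Rome', 'Venice', 'NY', 'Boston', 'London', 'Bristol'],
        'year': [2000, 2001, 2002, 2001, 2003, 2003],
        'number': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}
dic = {'it': ['Rome', 'Venice'],
       'UK': ['London', 'Bristol'],
       'US': ['NY', 'Boston']}
df = pd.DataFrame(data)


def mean_by_year(frame, states):
    """Hypothetical helper: mean of 'number' per year for the given states."""
    subset = frame[frame['state'].isin(states)]
    return subset.groupby('year', as_index=False)['number'].mean()


# Dict comprehension equivalent of the loop: one DataFrame per key of dic.
out = {key: mean_by_year(df, states) for key, states in dic.items()}

# Look up one group by key.
uk = out['UK']
print(uk)
```

Keeping the frames in a dictionary avoids creating variables dynamically from strings, which is usually harder to read and debug.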

A simpler way is to create a mapping with each state as the key and its group as the value (the code calls this group 'continent', although the values are really country codes). Then apply the mapping to the state column to create a continent column. Finally, group by continent and year and output the mean of the number column:

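import pandas as pd

data = {'state': ['Rome', 'Venice', 'NY', 'Boston', 'London', 'Bristol'],
        'year': [2000, 2001, 2002, 2001, 2003, 2003],
        'number': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}

# Map each state to the group it belongs to.
mapping = {
    'Rome': 'it',
    'Venice': 'it',
    'London': 'UK',
    'Bristol': 'UK',
    'NY': 'US',
    'Boston': 'US'
}

df = pd.DataFrame(data)
# States found in the mapping are replaced; any others are left unchanged.
df['continent'] = df['state'].replace(mapping)
print(df.head())
# Mean of 'number' for every (continent, year) pair.
print(df.groupby(['continent', 'year'])['number'].mean())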
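
The state-to-group mapping does not have to be typed out by hand; it can be derived from the dic in the question by inverting it. This is a sketch under that assumption, and it swaps in Series.map for replace, which is a substitution on my part: map yields NaN for states missing from the mapping, whereas replace leaves them unchanged. The name result is chosen only for illustration.

```python
import pandas as pd

data = {'state': ['Rome', 'Venice', 'NY', 'Boston', 'London', 'Bristol'],
        'year': [2000, 2001, 2002, 2001, 2003, 2003],
        'number': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}
dic = {'it': ['Rome', 'Venice'],
       'UK': ['London', 'Bristol'],
       'US': ['NY', 'Boston']}

# Invert {group: [states]} into {state: group}.
mapping = {state: group for group, states in dic.items() for state in states}

df = pd.DataFrame(data)
# Unlike replace, map gives NaN for any state that is not in the mapping.
df['continent'] = df['state'].map(mapping)

# One row per (continent, year) pair with the mean of 'number'.
result = df.groupby(['continent', 'year'], as_index=False)['number'].mean()
print(result)
```

If separate frames per group are still wanted, iterating over df.groupby('continent') yields one (key, sub-frame) pair per group.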

