pandas groupby using dictionary values, applying sum
I have a defaultdict:
from collections import defaultdict

dd = defaultdict(list,
                 {'Tech': ['AAPL', 'GOOGL'],
                  'Disc': ['AMZN', 'NKE']})
and a DataFrame that looks like this:
        AAPL  AMZN  GOOGL  NKE
1/1/10   100   200    500  200
1/2/10   100   200    500  200
1/3/10   100   200    500  200
The output I'd like is the DataFrame summed according to the values of the dictionary, with the keys as the columns:
        TECH  DISC
1/1/10   600   400
1/2/10   600   400
1/3/10   600   400
The pandas groupby documentation says it does this if you pass a dictionary, but all I end up with is an empty df using this code:
df.groupby(by=dd).sum()  # returns empty df
Create the dict the right way, mapping each column name to its group, and then you can use by with axis=1:
# map each company to industry
dd_rev = {w: k for k, v in dd.items() for w in v}
# {'AAPL': 'Tech', 'GOOGL': 'Tech', 'AMZN': 'Disc', 'NKE': 'Disc'}
# group along columns
df.groupby(by=dd_rev, axis=1).sum()
Out[160]:
        Disc  Tech
1/1/10   400   600
1/2/10   400   600
1/3/10   400   600
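A note for newer pandas versions: as far as I know, passing axis=1 to DataFrame.groupby has been deprecated since pandas 2.1, so the same grouping can be written by transposing first. The sketch below rebuilds the sample frame from the question (how df is constructed is my own assumption, since the question only shows its printed form). It also shows why the original call comes back empty: with the default axis, the dict is looked up against the row labels (the dates), and none of those are keys of dd.

```python
from collections import defaultdict

import pandas as pd

dd = defaultdict(list, {'Tech': ['AAPL', 'GOOGL'], 'Disc': ['AMZN', 'NKE']})

# Sample frame from the question (construction assumed for illustration)
df = pd.DataFrame(
    {'AAPL': [100] * 3, 'AMZN': [200] * 3, 'GOOGL': [500] * 3, 'NKE': [200] * 3},
    index=['1/1/10', '1/2/10', '1/3/10'],
)

# Default axis: the dict is matched against the index (dates), so no row gets a group
empty = df.groupby(by=dd).sum()

# Invert the mapping: ticker -> industry
dd_rev = {w: k for k, v in dd.items() for w in v}

# Transpose so tickers become row labels, group them, then transpose back
out = df.T.groupby(dd_rev).sum().T
print(out)
```

The double transpose is cheap for a frame this size; for very wide or mixed-dtype frames the comprehension approach in the next answer may be the better fit.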
You can create a new DataFrame from the defaultdict with a dictionary comprehension in one line:
pd.DataFrame({x: df[dd[x]].sum(axis=1) for x in dd})
# output:
        Disc  Tech
1/1/10   400   600
1/2/10   400   600
1/3/10   400   600
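For anyone who wants to run this end to end, here is a self-contained sketch of the same one-liner. The construction of df is assumed from the printed frame in the question. One thing to be aware of: on current Python and pandas the columns should come out in the dict's insertion order (Tech, then Disc), not in the sorted order shown in the output above, which likely came from an older version.

```python
from collections import defaultdict

import pandas as pd

dd = defaultdict(list, {'Tech': ['AAPL', 'GOOGL'], 'Disc': ['AMZN', 'NKE']})

# Sample frame from the question (construction assumed for illustration)
df = pd.DataFrame(
    {'AAPL': [100] * 3, 'AMZN': [200] * 3, 'GOOGL': [500] * 3, 'NKE': [200] * 3},
    index=['1/1/10', '1/2/10', '1/3/10'],
)

# For each industry, select its ticker columns and sum across them row by row
result = pd.DataFrame({x: df[dd[x]].sum(axis=1) for x in dd})
print(result)
```

This avoids groupby entirely, so it does not depend on the axis argument, but it will raise a KeyError if any ticker listed in dd is missing from df.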