Pandas df to dictionary with values as python lists aggregated from a df column

I have a pandas df containing 'features' for stocks, which looks like this:

[Image: stock features before training the neural network]

I am now trying to create a dictionary with each unique sector as the key and a python list of the tickers in that sector as the value, so that I end up with something that looks like this:

{'consumer_discretionary': ['AAP',
  'AMZN',
  'AN',
  'AZO',
  'BBBY',
  'BBY',
  'BWA',
  'KMX',
  'CCL',
  'CBS',
  'CHTR',
  'CMG',

etc.

I could iterate over the pandas df rows to create the dictionary, but I would prefer a more pythonic solution. Thus far, this code is a partial solution:

df.set_index('sector')['ticker'].to_dict()
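A minimal sketch of why this line is only a partial solution, assuming a toy frame with hypothetical tickers and sectors (not the real data): repeated index labels collapse into a single dictionary key, so each sector ends up with one ticker string rather than a list.

```python
import pandas as pd

# Hypothetical toy data: two tickers share the sector 'sb'.
df = pd.DataFrame({
    'ticker': ['t1', 't2', 't3'],
    'sector': ['sa', 'sb', 'sb'],
})

# Duplicate index labels become one dict key, so only a single
# ticker per sector survives instead of a list of tickers.
partial = df.set_index('sector')['ticker'].to_dict()
print(partial)
```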

Any feedback is appreciated.

UPDATE:

The solution by @wrwrwr

df.set_index('ticker').groupby('sector').groups

partially works, but it returns a pandas series as the value instead of a python list. Any ideas about how to transform the pandas series into a python list in the same line, without having to iterate over the dictionary?

Wouldn't f.set_index('ticker').groupby('sector').groups be what you want?

For example:

from pandas import DataFrame

f = DataFrame({
        'ticker': ('t1', 't2', 't3'),
        'sector': ('sa', 'sb', 'sb'),
        'name': ('n1', 'n2', 'n3')})

groups = f.set_index('ticker').groupby('sector').groups
# {'sa': Index(['t1']), 'sb': Index(['t2', 't3'])}

To ensure that the values have the type you want:

{k: list(v) for k, v in f.set_index('ticker').groupby('sector').groups.items()}

or:

f.set_index('ticker').groupby('sector').apply(lambda g: list(g.index)).to_dict()
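Another common idiom, not part of the answer above and shown only as a sketch, is to group by sector, select the ticker column, and collect each group into a list. The frame below is hypothetical toy data, and `sector_to_tickers` is an illustrative name.

```python
import pandas as pd

# Hypothetical toy data mirroring the example above.
df = pd.DataFrame({
    'ticker': ['t1', 't2', 't3'],
    'sector': ['sa', 'sb', 'sb'],
})

# Group the ticker column by sector, turn each group into a plain
# python list, then convert the resulting Series into a dict.
sector_to_tickers = df.groupby('sector')['ticker'].apply(list).to_dict()
print(sector_to_tickers)  # {'sa': ['t1'], 'sb': ['t2', 't3']}
```

This avoids moving 'ticker' into the index first, and the values are plain python lists rather than pandas objects.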
