
How to create multiple lists from a data frame column value

df1

    Ticker          Category
0      XOM           Group 1
1      CVX           Group 1
2  RDSA-GB           Group 2
3    BP-GB  Group 1, Group 2
4  EQNR-NO           Group 3
5    FP-FR           Group 4
6   ENI-IT  Group 3, Group 4
7      COP           Group 5

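My desired result is a list of "Ticker" values for each value in the "Category" column, with each list named after its "Category" value and spaces replaced by "_".

Secondly, if there is an instance where Category has two values, e.g. "US Major, Euro Major", how do I make sure the "Ticker" ends up in both Category lists?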

Group_1 = ['XOM','CVX','BP-GB']
Group_2 = ['RDSA-GB','BP-GB']
Group_3 = ['EQNR-NO','ENI-IT']
Group_4 = ['FP-FR','ENI-IT']
Group_5 = ['COP']

Thanks!

Well, you say lists, but I think you mean something like a dictionary? If that is the case, try this:

import pandas as pd

df =  pd.DataFrame([["XOM","US Major"],
["CVX","US Major"],
["RDSA-GB","Euro Major"],
["BP-GB","Euro Major"],
["EQNR-NO","Euro Major"]],columns=["Ticker","Category"])

df_to_lists = df.groupby("Category")["Ticker"].apply(list)
lists_to_dict = dict(df_to_lists)
print(lists_to_dict)

output:

{'Euro Major': ['RDSA-GB', 'BP-GB', 'EQNR-NO'], 'US Major': ['XOM', 'CVX']}

If you don't want a dictionary, df_to_lists itself outputs:

Category
Euro Major    [RDSA-GB, BP-GB, EQNR-NO]
US Major                     [XOM, CVX]
Name: Ticker, dtype: object
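
The question also asks for names with "_" in place of spaces. A minimal sketch of that step on top of the groupby result, assuming the same sample df as above (the name `groups` is only illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    [["XOM", "US Major"], ["CVX", "US Major"], ["RDSA-GB", "Euro Major"],
     ["BP-GB", "Euro Major"], ["EQNR-NO", "Euro Major"]],
    columns=["Ticker", "Category"],
)
df_to_lists = df.groupby("Category")["Ticker"].apply(list)

# Build the dictionary with spaces in the category names replaced by underscores
groups = {name.replace(" ", "_"): tickers for name, tickers in df_to_lists.items()}

print(groups["US_Major"])    # ['XOM', 'CVX']
print(groups["Euro_Major"])  # ['RDSA-GB', 'BP-GB', 'EQNR-NO']
```

Looking a list up by key like this is usually easier to work with than creating one separate variable per category.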

You can also use the power of loops, like this (I'm assuming my df is your df1):

lists_with_unique_vals = dict()
for cat in df.Category.unique():
    lists_with_unique_vals[cat.replace(' ', '_')] = list(df[df['Category']==cat]['Ticker'].unique())

The result looks like this:

>> print(lists_with_unique_vals)
{'US_Major': ['XOM', 'CVX'], 'Euro_Major': ['RDSA-GB', 'BP-GB', 'EQNR-NO']}

Following up on @nassiam's code to handle the case where a row may have multiple categories:

import pandas as pd

df =  pd.DataFrame([["XOM","US Major"],
["CVX","US Major"],
["RDSA-GB","Euro Major"],
["BP-GB","Euro Major"],
["EQNR-NO","Euro Major"],
["ABC-XYZ", "Euro Major, US Major"],
["DEF-GHI", "Euro Major, US Major"]], columns=["Ticker","Category"])

df_to_lists = df.groupby("Category")["Ticker"].apply(list)
lists_to_dict = dict(df_to_lists)
print(lists_to_dict)

# Till here it is the same code as @nassiam pointed out

# To handle multiple-valued category
# Iterate over a snapshot of the keys, because the loop adds and removes keys
for key in list(lists_to_dict):
    categories = [x.strip() for x in key.split(',')]
    if len(categories) > 1:
        for cat in categories:
            if cat in lists_to_dict:
                lists_to_dict[cat] += lists_to_dict[key]
            else:
                # Copy so that two categories never share one list object
                lists_to_dict[cat] = list(lists_to_dict[key])
        lists_to_dict.pop(key, None)

# To replace space with underscore
# Build a new dict instead of renaming keys while iterating over them
lists_to_dict = {key.replace(" ", "_"): value for key, value in lists_to_dict.items()}

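This assumes that the first column, Ticker, has unique values. Otherwise, use a set when appending the lists to make them unique. I hope this helps.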
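
As an alternative, the multi-category case can also be sketched by splitting the Category strings and giving each category its own row with `DataFrame.explode` (pandas 0.25 or later). This is not taken from the answers above. It assumes the df1 from the question and that categories are always separated by commas. The `drop_duplicates` call covers the uniqueness concern when a Ticker might repeat.

```python
import pandas as pd

# Sample data copied from the question's df1
df1 = pd.DataFrame(
    [
        ["XOM", "Group 1"],
        ["CVX", "Group 1"],
        ["RDSA-GB", "Group 2"],
        ["BP-GB", "Group 1, Group 2"],
        ["EQNR-NO", "Group 3"],
        ["FP-FR", "Group 4"],
        ["ENI-IT", "Group 3, Group 4"],
        ["COP", "Group 5"],
    ],
    columns=["Ticker", "Category"],
)

# Split "Group 1, Group 2" into a list, then give each category its own row
# so that one ticker can land in several groups
exploded = (
    df1.assign(Category=df1["Category"].str.split(","))
    .explode("Category")
    .reset_index(drop=True)
)

# Trim the whitespace left by the split and replace spaces with underscores
exploded["Category"] = (
    exploded["Category"].str.strip().str.replace(" ", "_", regex=False)
)

# Guard against a ticker being listed twice within the same category
exploded = exploded.drop_duplicates(subset=["Ticker", "Category"])

groups = exploded.groupby("Category")["Ticker"].apply(list).to_dict()
print(groups["Group_1"])  # ['XOM', 'CVX', 'BP-GB']
print(groups["Group_4"])  # ['FP-FR', 'ENI-IT']
```

Because the split happens before grouping, no combined keys such as "Group 1, Group 2" ever reach the dictionary, so nothing has to be merged or popped afterwards.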

