Python Pandas to group values in 2 columns

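I have a dataframe like the one shown below. The names in column B fall into 5 groups, linked by the common value in column A.

I want to group the names. I tried: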

import pandas as pd

data = {'A': ["James","James","James","Edward","Edward","Thomas","Thomas","Jason","Jason","Jason","Brian","Brian"], 
'B' : ["John","Michael","William","David","Joseph","Christopher","Daniel","George","Kenneth","Steven","Ronald","Anthony"]}
df = pd.DataFrame(data)

df_1 = df.groupby('A')['B'].apply(list)
df_1 = df_1.to_frame().reset_index()

for index, row in df_1.iterrows():
    print((row['A'], row['B']))  # print the pair as a tuple

The output is:

('Brian', ['Ronald', 'Anthony'])
('Edward', ['David', 'Joseph'])
('James', ['John', 'Michael', 'William'])
('Jason', ['George', 'Kenneth', 'Steven'])
('Thomas', ['Christopher', 'Daniel'])

But I would like a single list per group (even better if there is an automatic way to assign each list to a variable), for example:

['Brian', 'Ronald', 'Anthony']
['Edward', 'David', 'Joseph']
['James', 'John', 'Michael', 'William']
['Jason', 'George', 'Kenneth', 'Steven']
['Thomas', 'Christopher', 'Daniel']

I tried row['B'].append(row['A']), but it returns None.

What is the right way to group them? Thank you.

You can use the .name attribute inside GroupBy.apply to prepend the value of the grouping column A to each list:

s = df.groupby('A')['B'].apply(lambda x: [x.name] + list(x))
print (s)
A
Brian             [Brian, Ronald, Anthony]
Edward             [Edward, David, Joseph]
James      [James, John, Michael, William]
Jason     [Jason, George, Kenneth, Steven]
Thomas       [Thomas, Christopher, Daniel]
Name: B, dtype: object
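
The result above is a Series of lists, ordered alphabetically by group key. As a minimal sketch, assuming you would rather have plain Python lists in the order the groups first appear in the data, you can pass sort=False to groupby and call tolist() on the result. The variable name groups is only illustrative.

```python
import pandas as pd

data = {'A': ["James","James","James","Edward","Edward","Thomas","Thomas","Jason","Jason","Jason","Brian","Brian"],
        'B': ["John","Michael","William","David","Joseph","Christopher","Daniel","George","Kenneth","Steven","Ronald","Anthony"]}
df = pd.DataFrame(data)

# sort=False keeps the groups in order of first appearance instead of sorting the keys
s = df.groupby('A', sort=False)['B'].apply(lambda x: [x.name] + list(x))

# Convert the Series of lists into a plain list of lists
groups = s.tolist()
for group in groups:
    print(group)
```

With the sample data the first list printed is ['James', 'John', 'Michael', 'William'], because James is the first key in column A.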

You can try this, using pd.Series.tolist():

for k,g in df.groupby('A')['B']:
    print([k]+g.tolist())

['Brian', 'Ronald', 'Anthony']
['Edward', 'David', 'Joseph']
['James', 'John', 'Michael', 'William']
['Jason', 'George', 'Kenneth', 'Steven']
['Thomas', 'Christopher', 'Daniel']
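
The question also asks about assigning each list to its own variable. One possible approach, sketched here as a suggestion rather than something the answers above propose, is to collect the lists in a dictionary keyed by the group name instead of creating separate variables. The name groups is hypothetical.

```python
import pandas as pd

data = {'A': ["James","James","James","Edward","Edward","Thomas","Thomas","Jason","Jason","Jason","Brian","Brian"],
        'B': ["John","Michael","William","David","Joseph","Christopher","Daniel","George","Kenneth","Steven","Ronald","Anthony"]}
df = pd.DataFrame(data)

# Iterating over the grouped column yields (key, Series) pairs;
# build one list per key and store it under that key.
groups = {k: [k] + g.tolist() for k, g in df.groupby('A')['B']}

print(groups['Jason'])
```

Each list can then be looked up by name, for example groups['Jason'] gives ['Jason', 'George', 'Kenneth', 'Steven'].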

The reason you get None is that list.append returns None: it modifies the list in place instead of returning a new list.

Try the following:
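
A small illustration of that behaviour, using made-up variable names: append changes the existing list and hands back None, whereas concatenating with + builds a new list and lets you put the key first.

```python
names = ['Ronald', 'Anthony']

# append mutates names in place and returns None
result = names.append('Brian')
print(result)   # None
print(names)    # ['Ronald', 'Anthony', 'Brian']

# Concatenation returns a new list, with the key in front
combined = ['Brian'] + ['Ronald', 'Anthony']
print(combined)  # ['Brian', 'Ronald', 'Anthony']
```

Note that append also adds the item at the end, so even the in-place result would not match the desired order with the group name first.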

    import pandas as pd

    data = {'A': ["James","James","James","Edward","Edward","Thomas","Thomas","Jason","Jason","Jason","Brian","Brian"], 
    'B' : ["John","Michael","William","David","Joseph","Christopher","Daniel","George","Kenneth","Steven","Ronald","Anthony"]}
    df = pd.DataFrame(data)
    #display(df)
    df_1 = df.groupby('A')['B'].apply(list)
    df_1 = df_1.to_frame().reset_index()

    for index, row in df_1.iterrows():
        # The value in column A is a plain string, not a list,
        # so wrap it in a list before concatenating it with column B.
        print([row['A']] + row['B'])

Output:

['Brian', 'Ronald', 'Anthony']
['Edward', 'David', 'Joseph']
['James', 'John', 'Michael', 'William']
['Jason', 'George', 'Kenneth', 'Steven']
['Thomas', 'Christopher', 'Daniel']
