Merging entries in pandas

Question

Suppose I have the following table.

import pandas as pd
sales = {'Account': ['Jones LLC', 'Alpha Co', 'Blue Inc'],
         '1': ['a', 'b', 'c'],
         '2': ['', 'e, g', 'f, h'],
         '3': ['a', 'g', 'h']}
df = pd.DataFrame.from_dict(sales).set_index('Account')
df

Output:

               1  2     3
    Account
    Jones LLC  a        a
    Alpha Co   b  e, g  g
    Blue Inc   c  f, h  h

I want to create another column '4' whose values combine columns 1, 2 and 3:

           1  2     3   4
Account
Jones LLC  a        a   a
Alpha Co   b  e, g  g   b, e, g
Blue Inc   c  f, h  h   c, f, h

I tried the following:

df['4'] = [', '.join([df['1'][x],df['2'][x],df['3'][x]]) for x in range(df.shape[0])]

Output:

           1  2     3   4
Account
Jones LLC  a        a   , a
Alpha Co   b  e, g  g   b, e, g, g
Blue Inc   c  f, h  h   c, f, h, h

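The problems are:

In the first row, the result is ", a" instead of "a".
Duplicates: "b, e, g, g" instead of "b, e, g".
I have to write out df['1'][x], df['2'][x], df['3'][x] explicitly instead of defining a list ['1','2','3'] and iterating over it.

Is there a fast way to do this without using df.iterrows(), one that checks whether any entries are empty and then combines them as needed?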

Answer 1

It looks like you need to skip the empty cells and remove duplicates.

Code:

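# For each row: split every non-empty cell on commas, strip whitespace,
# then de-duplicate and sort the tokens before joining them.
# .iloc makes the positional lookup explicit (the index holds account names).
df['4'] = [', '.join(sorted(set(sum(
    [[y.strip() for y in df[c].iloc[x].split(',')]
     for c in '123' if df[c].iloc[x].strip()], []))))
    for x in range(df.shape[0])]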

Test code:

import pandas as pd
sales = {'Account': ['Jones LLC', 'Alpha Co', 'Blue Inc'],
         '1': ['a', 'b', 'c'],
         '2': ['', 'e, g', 'f, h'],
         '3': ['a', 'g', 'h']}
df = pd.DataFrame.from_dict(sales).set_index('Account')

# .iloc makes the positional lookup explicit (the index holds account names).
df['4'] = [', '.join(sorted(set(sum(
    [[y.strip() for y in df[c].iloc[x].split(',')]
     for c in '123' if df[c].iloc[x].strip()], []))))
    for x in range(df.shape[0])]

Result:

           1     2  3        4
Account                       
Jones LLC  a        a        a
Alpha Co   b  e, g  g  b, e, g
Blue Inc   c  f, h  h  c, f, h
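
If the nested comprehension is hard to read, the same idea (skip empty cells, split on commas, drop duplicates) can be sketched with a small helper applied row by row. This is only an illustration, not the answer's original code: the helper name combine_unique and the cols list are made up for the example. Unlike the version above, it keeps tokens in first-seen order rather than sorting them, and it iterates over a list of column names, which was one of the things the question asked for.

```python
import pandas as pd

def combine_unique(row):
    """Join the distinct, non-empty comma-separated tokens of one row."""
    tokens = []
    for cell in row:
        for token in cell.split(','):
            token = token.strip()
            # Skip empty strings and anything already collected.
            if token and token not in tokens:
                tokens.append(token)
    return ', '.join(tokens)

sales = {'Account': ['Jones LLC', 'Alpha Co', 'Blue Inc'],
         '1': ['a', 'b', 'c'],
         '2': ['', 'e, g', 'f, h'],
         '3': ['a', 'g', 'h']}
df = pd.DataFrame(sales).set_index('Account')

cols = ['1', '2', '3']  # columns to combine (hypothetical variable name)
df['4'] = df[cols].apply(combine_unique, axis=1)
print(df)
```

apply with axis=1 still calls Python code once per row, so this is about readability rather than speed.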

Answer 2

An alternative solution:

In [59]: df[4] = (df.replace(r'[\s,]*','',regex=True)
    ...:            .sum(1)
    ...:            .str.extractall(r'(.)')
    ...:            .unstack()
    ...:            .apply(lambda x: ','.join(set(x.dropna())), axis=1))
    ...:

In [60]: df
Out[60]:
           1     2  3      4
Account
Jones LLC  a        a      a
Alpha Co   b  e, g  g  b,e,g
Blue Inc   c  f, h  h  c,f,h
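
Two caveats about the approach above: extractall(r'(.)') treats every single character as a separate entry, so it only works while each entry is one character long, and set() does not guarantee any particular order in the joined string. As a rough sketch of a variant that handles multi-character entries and keeps first-seen order, one could split on commas and use explode. It assumes pandas 0.25 or later (for Series.explode) and unique index labels; the names cols, combined and tokens are illustrative only.

```python
import pandas as pd

sales = {'Account': ['Jones LLC', 'Alpha Co', 'Blue Inc'],
         '1': ['a', 'b', 'c'],
         '2': ['', 'e, g', 'f, h'],
         '3': ['a', 'g', 'h']}
df = pd.DataFrame(sales).set_index('Account')

cols = ['1', '2', '3']

# Concatenate the columns row-wise with a comma between cells.
combined = df[cols[0]].str.cat(df[cols[1:]], sep=',')

# One token per row of the Series; the account name is repeated in the index.
tokens = combined.str.split(',').explode().str.strip()
tokens = tokens[tokens != '']  # drop the empty pieces left by blank cells

# Rebuild one string per account; dict.fromkeys drops duplicates
# while preserving the order in which tokens first appeared.
df['4'] = tokens.groupby(level=0, sort=False).agg(
    lambda s: ', '.join(dict.fromkeys(s)))
print(df)
```

A row whose cells are all empty would have no tokens left after the filter and would end up as NaN in column '4'; add a fillna('') if that case can occur.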
