Merging column values in a DataFrame in Pandas / Python

Question

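I am trying to merge the values of two columns (B and C) in the same DataFrame. B and C sometimes hold the same value. Some values in B are present in C, and some values in C are present in B. The final result should show a single column that combines the two.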

Initial data:

A        B       C        D
Apple    Canada  ''       RED
Bananas  ''      Germany  BLUE
Carrot   US      US       GREEN
Dorito   ''      ''       INDIGO

Expected data:

A        B        C
Apple    Canada   RED
Bananas  Germany  BLUE
Carrot   US       GREEN
Dorito   ''       INDIGO

Answer 1

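IIUC (if I understand correctly), you can treat the '' placeholders as missing values and back-fill across the two columns:

import numpy as np
# '' becomes NaN, then bfill along each row copies C into an empty B
df['B'] = df[['B', 'C']].replace("''", np.nan).bfill(axis=1).loc[:, 'B']
df = df.drop(columns='C').rename(columns={'D': 'C'})
df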
Out[102]: 
         A        B       C
0    Apple   Canada     RED
1  Bananas  Germany    BLUE
2   Carrot       US   GREEN
3   Dorito      NaN  INDIGO
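
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same back-fill idea. It assumes the '' cells in the question are real empty strings (not the two-character string "''"), and the variable names `filled` and `result_bfill` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question; '' is assumed to mean an empty string
df = pd.DataFrame({
    "A": ["Apple", "Bananas", "Carrot", "Dorito"],
    "B": ["Canada", "", "US", ""],
    "C": ["", "Germany", "US", ""],
    "D": ["RED", "BLUE", "GREEN", "INDIGO"],
})

# Turn empty strings into NaN, back-fill along each row, keep the first column
filled = df[["B", "C"]].replace("", np.nan).bfill(axis=1).iloc[:, 0]

# Write the merged values into B, drop C, and rename D to C
result_bfill = df.assign(B=filled).drop(columns="C").rename(columns={"D": "C"})
print(result_bfill)
```

Rows where both B and C are empty end up as NaN; append `.fillna("")` to `filled` if an empty string is preferred there.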

Answer 2

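You can sort the strings in each row and take the last one:

# An empty value sorts before any name, so the last item is the filled-in one
df['B'] = df[['B', 'C']].apply(lambda x: x.sort_values().iloc[-1], axis=1)

df = df.drop(columns='C').rename(columns={'D': 'C'})
print(df)

Output: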

         A        B       C
0    Apple   Canada     RED
1  Bananas  Germany    BLUE
2   Carrot       US   GREEN
3   Dorito       ''  INDIGO
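
Note that the sorting trick picks the alphabetically later value whenever B and C are both filled and differ. If B should win in that case, one possible alternative (a different technique from the one above, not the answerer's) is `Series.where`. This sketch again assumes '' means an empty string; `preferred` and `result_where` are illustrative names.

```python
import pandas as pd

# Sample frame rebuilt from the question; '' is assumed to mean an empty string
df = pd.DataFrame({
    "A": ["Apple", "Bananas", "Carrot", "Dorito"],
    "B": ["Canada", "", "US", ""],
    "C": ["", "Germany", "US", ""],
    "D": ["RED", "BLUE", "GREEN", "INDIGO"],
})

# Keep B where it is non-empty, otherwise fall back to the value in C
preferred = df["B"].where(df["B"].ne(""), df["C"])

result_where = df.assign(B=preferred).drop(columns="C").rename(columns={"D": "C"})
print(result_where)
```

This is vectorized, so it avoids the row-wise `apply` and should scale better on large frames.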

Answer 3

Another approach is to make clever use of a list comprehension:

# Make sets of the column B and C combined to get rid of duplicates
k = [set(b.strip() for b in a) for a in zip(df['B'], df['C'])]

# Flatten sets to strings
k = [''.join(x) for x in k]

# Create desired column
df['B'] = k
df.drop('C', axis=1, inplace=True)

print(df)
         A        B       D
0    Apple   Canada     RED
1  Bananas  Germany    BLUE
2   Carrot       US   GREEN
3   Dorito           INDIGO
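
One caveat with the set-based version: if B and C ever hold two different non-empty values, `''.join` glues them together with no separator and in arbitrary set order. A hedged variant that keeps the original order and inserts a separator might look like the sketch below. The helper `combine` and the "/" separator are assumptions made for illustration, not part of the answer.

```python
import pandas as pd

def combine(b, c, sep="/"):
    """Join the distinct non-empty values of b and c, preserving their order."""
    parts = [v.strip() for v in (b, c) if v.strip()]
    # dict.fromkeys removes duplicates while keeping first-seen order
    return sep.join(dict.fromkeys(parts))

# Sample frame rebuilt from the question; '' is assumed to mean an empty string
df = pd.DataFrame({
    "A": ["Apple", "Bananas", "Carrot", "Dorito"],
    "B": ["Canada", "", "US", ""],
    "C": ["", "Germany", "US", ""],
    "D": ["RED", "BLUE", "GREEN", "INDIGO"],
})

df["B"] = [combine(b, c) for b, c in zip(df["B"], df["C"])]
result_join = df.drop(columns="C")
print(result_join)
```

On the sample data this gives the same B column as the set-based version; the behaviour only differs for rows where the two columns disagree.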

3 answers

Answer 1
2 2019-05-07 17:27:10

Answer 2
1 accepted 2019-05-07 18:38:30

Answer 3
0 2019-05-07 18:03:39
