Add another column based on the values of two columns

Question

I am trying to add another column based on the values of two columns. Here is a mini version of my DataFrame.

import numpy as np
import pandas as pd

data = {'current_pair': ['"["StimusNeu/2357.jpg","StimusNeu/5731.jpg"]"', '"["StimusEmo/6350.jpg","StimusEmo/3230.jpg"]"', '"["StimusEmo/3215.jpg","StimusEmo/9570.jpg"]"','"["StimusNeu/7020.jpg","StimusNeu/7547.jpg"]"', '"["StimusNeu/7080.jpg","StimusNeu/7179.jpg"]"'],
        'B': [1, 0, 1, 1, 0]
        }
df = pd.DataFrame(data)
df

                                    current_pair    B
0   "["StimusNeu/2357.jpg","StimusNeu/5731.jpg"]"   1
1   "["StimusEmo/6350.jpg","StimusEmo/3230.jpg"]"   0
2   "["StimusEmo/3215.jpg","StimusEmo/9570.jpg"]"   1
3   "["StimusNeu/7020.jpg","StimusNeu/7547.jpg"]"   1
4   "["StimusNeu/7080.jpg","StimusNeu/7179.jpg"]"   0

I want the result to be:

                                    current_pair    B   C
0   "["StimusNeu/2357.jpg","StimusNeu/5731.jpg"]"   1   1
1   "["StimusEmo/6350.jpg","StimusEmo/3230.jpg"]"   0   2
2   "["StimusEmo/3215.jpg","StimusEmo/9570.jpg"]"   1   0
3   "["StimusNeu/7020.jpg","StimusNeu/7547.jpg"]"   1   1
4   "["StimusNeu/7080.jpg","StimusNeu/7179.jpg"]"   0   2

I used numpy's select function:

conditions=[(data['B']==1 & data['current_pair'].str.contains('Emo/', na=False)),
            (data['B']==1 & data['current_pair'].str.contains('Neu/', na=False)),
            data['B']==0]
choices = [0, 1, 2]
data['C'] = np.select(conditions, choices, default=np.nan)

Unfortunately, it gave me the following DataFrame, in which no row of column "C" is assigned the value 1.

                                    current_pair    B   C
0   "["StimusNeu/2357.jpg","StimusNeu/5731.jpg"]"   1   0
1   "["StimusEmo/6350.jpg","StimusEmo/3230.jpg"]"   0   2
2   "["StimusEmo/3215.jpg","StimusEmo/9570.jpg"]"   1   0
3   "["StimusNeu/7020.jpg","StimusNeu/7547.jpg"]"   1   0
4   "["StimusNeu/7080.jpg","StimusNeu/7179.jpg"]"   0   2

Any help is appreciated! Thanks a lot.

Answer 1

I think there are some logic errors here; this works:

df.assign(C=np.select([df.B==0, df.current_pair.str.contains('Emo/'), df.current_pair.str.contains('Neu/')], [2,0,1]))
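The reordering above relies on np.select using the first condition that matches for each element, so once the B == 0 rows are claimed, the remaining rows can be classified by the string alone (assuming B only ever holds 0 or 1). A small sketch of that first-match behaviour, using a made-up array x that is not part of the question:

```python
import numpy as np

# Hypothetical data, only to illustrate condition ordering.
x = np.array([0, 3, 7])

# Both conditions are true for 7; the first one listed wins.
first = np.select([x > 0, x > 5], [10, 20], default=-1)

# Swapping the order changes which choice 7 receives.
second = np.select([x > 5, x > 0], [20, 10], default=-1)

print(first)   # [-1 10 10]
print(second)  # [-1 10 20]
```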

Answer 2

The problem is operator precedence: & binds more tightly than ==, so the == 1 comparison needs its own parentheses:

conditions=[(data['B']==1) & data['current_pair'].str.contains('Emo/', na=False),
            (data['B']==1) & data['current_pair'].str.contains('Neu/', na=False),
             data['B']==0]
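To make the precedence point concrete, here is a minimal sketch. It first shows the parsing difference with plain integers, then applies the parenthesized conditions to a rebuilt copy of the sample frame. It assumes the conditions are meant to run against the DataFrame (named df here, rather than the dict data), and the outer quotes in the strings are dropped for readability.

```python
import numpy as np
import pandas as pd

# Precedence with plain integers: & is evaluated before ==.
no_parens = 2 == 2 & 1      # parsed as 2 == (2 & 1), i.e. 2 == 0
with_parens = (2 == 2) & 1  # True & 1

# Rebuilt sample data (an approximation of the frame in the question).
df = pd.DataFrame({
    'current_pair': [
        '["StimusNeu/2357.jpg","StimusNeu/5731.jpg"]',
        '["StimusEmo/6350.jpg","StimusEmo/3230.jpg"]',
        '["StimusEmo/3215.jpg","StimusEmo/9570.jpg"]',
        '["StimusNeu/7020.jpg","StimusNeu/7547.jpg"]',
        '["StimusNeu/7080.jpg","StimusNeu/7179.jpg"]',
    ],
    'B': [1, 0, 1, 1, 0],
})

is_emo = df['current_pair'].str.contains('Emo/', na=False)
is_neu = df['current_pair'].str.contains('Neu/', na=False)

# Each comparison is wrapped in parentheses before combining with &.
conditions = [
    (df['B'] == 1) & is_emo,
    (df['B'] == 1) & is_neu,
    (df['B'] == 0),
]
choices = [0, 1, 2]
df['C'] = np.select(conditions, choices, default=np.nan)

print(no_parens, with_parens)
print(df[['B', 'C']])
```

With the parentheses in place, column C matches the desired output from the question (1, 2, 0, 1, 2), stored as floats because the default is NaN.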

Answer 3

Here is a slightly more general suggestion that is easy to adapt to more complex cases. However, you should keep an eye on execution speed:

import pandas as pd
df = pd.DataFrame({'col_1': ['Abc', 'Xcd', 'Afs', 'Xtf', 'Aky'], 'col_2': [1, 2, 3, 4, 5]})
def someLogic(col_1, col_2):
    if 'A' in col_1 and col_2 == 1:
        return 111
    elif "X" in col_1 and col_2 == 4:
        return 999
    return 888
df['NewCol'] = df.apply(lambda row: someLogic(row.col_1, row.col_2), axis=1, result_type="expand")
print(df)
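Regarding the speed caveat: a row-wise apply calls the Python function once per row, which tends to be slow on large frames. When the rules can be written as boolean masks, a vectorized version is one possible alternative. The following is only a sketch of the same three rules expressed with np.select, not a drop-in replacement for arbitrary logic:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col_1': ['Abc', 'Xcd', 'Afs', 'Xtf', 'Aky'], 'col_2': [1, 2, 3, 4, 5]})

# Same rules as someLogic, written as column-wide boolean masks.
# regex=False makes str.contains do a plain substring test, like "in".
conditions = [
    df['col_1'].str.contains('A', regex=False) & (df['col_2'] == 1),
    df['col_1'].str.contains('X', regex=False) & (df['col_2'] == 4),
]
choices = [111, 999]

# Rows matching neither condition fall back to 888.
df['NewCol'] = np.select(conditions, choices, default=888)
print(df)
```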

3 answers

Answer 1
0 2021-11-02 08:49:48

Answer 2
0 Accepted 2021-11-02 08:53:41

Answer 3
0 2021-11-02 09:24:46
