Check array values and add the resulting array as a column to a pandas DataFrame

Question

I need to add an array as a column to a DataFrame:

results['TEST'] = results.apply(lambda x: results_02, axis=1)

As a result I get a DataFrame like this:

ID TEST
1  [1,2,3,4,5,6,7,8,9,10]
2  [1,2,3,4,5,6,7,8,9,10]
3  [1,2,3,4,5,6,7,8,9,10]
4  [1,2,3,4,5,6,7,8,9,10]
5  [1,2,3,4,5,6,7,8,9,10]
6  [1,2,3,4,5,6,7,8,9,10]

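But I want to add a condition that checks whether results['ID'] is in results_02 and, if it is, adds all values except that existing one to the row. I need to do this for every row.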

So the resulting DataFrame needs to look like this:

ID TEST
1  [2,3,4,5,6,7,8,9,10]
2  [1,3,4,5,6,7,8,9,10]
3  [1,2,4,5,6,7,8,9,10]
4  [1,2,3,5,6,7,8,9,10]
5  [1,2,3,4,6,7,8,9,10]
6  [1,2,3,4,5,7,8,9,10]

I thought I could use:

results['TEST'] = results.apply(lambda x: results_02[:10] if x not in results_02[:10] else results_02.remove(x)[:10], axis=1)

But I get the error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What is the best and most efficient way to solve this?

EDIT_1: DF

data = {'ID': [250274, 244473, 240274, 247178, 248667]}

df = pd.DataFrame(data)
results_02 = [250274, 244473, 240274, 247178, 248667]

Answer 1

You can try this:

import numpy as np
import pandas as pd

data = {'ID': [250274, 244473, 240274, 247178, 248667]}

results = pd.DataFrame(data)
result_02 = np.array([250274, 244473, 240274, 247178, 248667])

mask = results.values != result_02
results['TEST'] = [result_02[mask_row] for mask_row in mask]
results

----------------------------------------------
    ID       TEST
0   250274  [244473, 240274, 247178, 248667]
1   244473  [250274, 240274, 247178, 248667]
2   240274  [250274, 244473, 247178, 248667]
3   247178  [250274, 244473, 240274, 248667]
4   248667  [250274, 244473, 240274, 247178]
----------------------------------------------
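
To see why the comparison above yields one mask row per DataFrame row, here is a minimal sketch of the broadcasting step. The small values and the names `ids` and `pool` are made up for illustration and are not part of the question's data.

```python
import numpy as np

# Hypothetical small values, chosen only to make the mask easy to read.
ids = np.array([[1], [2], [3]])   # shape (3, 1), like results.values for a single column
pool = np.array([1, 2, 3])        # shape (3,), like result_02

# Broadcasting compares every row ID against every pool value: shape (3, 3).
mask = ids != pool

# Each mask row is False only where the pool value equals that row's ID,
# so indexing with it drops exactly that value.
filtered = [pool[mask_row].tolist() for mask_row in mask]
print(mask)
print(filtered)
```

Each row of `mask` acts as a boolean index into the pool, which is what the list comprehension in the answer relies on.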

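If your DataFrame contains multiple columns and you are only interested in the ID column, you have to build the mask from that column alone by reshaping the ID array into a column vector.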

import numpy as np
import pandas as pd

data = {'ID': [250274, 244473, 240274, 247178, 248667], 'some_col': ['A', 'B', 'C', 'D', 'E']}

results = pd.DataFrame(data)
result_02 = np.array([250274, 244473, 240274, 247178, 248667])

mask = results.ID.values.reshape(-1, 1) != result_02
results['TEST'] = [result_02[mask_row] for mask_row in mask]
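
For reference, the same idea can probably be written with `to_numpy()` and `[:, None]` instead of `.values.reshape(-1, 1)`. This is only an equivalent sketch under the assumption that the IDs are plain integers. The name `id_column` is introduced here for illustration, and the arrays are converted to lists purely to make the output easier to read.

```python
import numpy as np
import pandas as pd

results = pd.DataFrame({
    'ID': [250274, 244473, 240274, 247178, 248667],
    'some_col': ['A', 'B', 'C', 'D', 'E'],
})
result_02 = np.array([250274, 244473, 240274, 247178, 248667])

# Column vector of IDs with shape (5, 1); only the ID column takes part in the comparison.
id_column = results['ID'].to_numpy()[:, None]
mask = id_column != result_02   # broadcasts to shape (5, 5)

# One filtered list per row; some_col is left untouched.
results['TEST'] = [result_02[mask_row].tolist() for mask_row in mask]
print(results)
```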

Edit

I'm not sure what you mean by your comment. I think you want something like this?

import numpy as np
import pandas as pd

data = {
    'ID1': [250274, 244473, 240274, 247178, 248667],
    'ID2': [244473, 240274, 247178, 248667, 250274],
}



results = pd.DataFrame(data)
result_02 = np.array([250274, 244473, 240274, 247178, 248667])

# np.isin replaces np.in1d, which is deprecated as of NumPy 2.0
results['TEST'] = [result_02[~np.isin(result_02, row)] for row in results.values]

------------------------------------------------
    ID1     ID2     TEST
0   250274  244473  [240274, 247178, 248667]
1   244473  240274  [250274, 247178, 248667]
2   240274  247178  [250274, 244473, 248667]
3   247178  248667  [250274, 244473, 240274]
4   248667  250274  [244473, 240274, 247178]
------------------------------------------------

If not, please make your comment more precise.
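
The membership test in the edit above can be checked on a single row. The sketch below uses `np.isin`, the current NumPy function for this element-wise test, and takes the first row of the ID1/ID2 example as its input. The names `row`, `present` and `remaining` are illustrative.

```python
import numpy as np

result_02 = np.array([250274, 244473, 240274, 247178, 248667])
row = np.array([250274, 244473])   # first row of the ID1/ID2 example

# True where a value of result_02 appears anywhere in the row.
present = np.isin(result_02, row)

# Negating the mask keeps only the values that are absent from the row.
remaining = result_02[~present]
print(present)
print(remaining)
```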

Answer 2

I used this solution:

results['RESULTS'] = results['ID'].apply(lambda x: [i for i in result_02 if x!=i])
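
As a self-contained sketch of this approach, using the data from the question's edit: it also seems to explain why the original attempt failed. With `results.apply(..., axis=1)` the lambda receives a whole row as a Series, so `x not in some_list` has to evaluate the truth value of a Series comparison, which raises the ValueError. In addition, `list.remove` mutates the list in place and returns None, so it cannot be sliced.

```python
import pandas as pd

results = pd.DataFrame({'ID': [250274, 244473, 240274, 247178, 248667]})
result_02 = [250274, 244473, 240274, 247178, 248667]

# Series.apply passes one scalar ID at a time, so x != i is a plain boolean.
# The list comprehension builds a new list and leaves result_02 unchanged.
results['RESULTS'] = results['ID'].apply(lambda x: [i for i in result_02 if x != i])
print(results)
```

For large frames the NumPy mask from Answer 1 may be faster, but this version is easy to read and works with a plain Python list.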
