Finding the closest value in Pandas
X A B C D
10 10.2 16 28 30
20 27 15 14 16
30 30.4 34 35 45
I want to create a new column that shows the value from A, B, C, or D whenever that value is within 2% of X.
Expected result:
X A B C D result
10 10.2 16 28 30 10.2
20 27 15 14 16 NaN
30 30.4 34 35 45 30.4
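I can identify the rows where a value is close to X with the following code, but I am not sure how to pull the matching value out of columns A, B, C, and D: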
cond = np.isclose(df.X, df['A'], rtol=0.02) | np.isclose(df.X, df['B'], rtol=0.02) | np.isclose(df.X, df['C'], rtol=0.02) | np.isclose(df.X, df['D'], rtol=0.02)
df['result'] = np.where(cond,#See note,np.nan)
#note: how do I put in the value from the column that meets the condition?
Any help would be appreciated. Thanks!
Edit:
ID OP X Vl Ch A B C D result
10 10.2 16 28 30 10.2
20 27 15 14 16 NaN
30 30.4 34 35 45 30.4
The cells shown blank above do contain data. ID is a string; all the other columns are floats.
In [1]: import pandas as pd
In [2]: df = pd.DataFrame([[10, 10.2, 16, 28, 30], [20, 27, 15, 14, 16], [30, 30.4, 34, 25, 35]], columns=["X", "A", "B", "C", "D"])
In [3]: def f(x):
   ...:     # Return the first value after X that lies within 2% of X (on either side).
   ...:     # If multiple values can meet the condition, modify here: collect them in a list instead.
   ...:     c = None  # []
   ...:     for i in x.iloc[1:]:
   ...:         if abs(i - x.iloc[0]) <= 0.02 * abs(x.iloc[0]):
   ...:             c = i
   ...:             break
   ...:             # c.append(i)
   ...:     return c
   ...:
In [4]: df['result'] = df.apply(f, axis=1)
In [5]: df
Out[5]:
    X     A   B   C   D  result
0  10  10.2  16  28  30    10.2
1  20  27.0  15  14  16     NaN
2  30  30.4  34  25  35    30.4
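The commented-out `[]` and `append` lines in `f` hint at a variant that keeps every match rather than only the first. A minimal sketch of that variant follows. The helper name `matches_within`, the `matches` column, and the `30.5` in the last row are illustrative assumptions, added so that one row has two hits; they are not part of the original answer.

```python
import pandas as pd

# Same layout as the answer's frame; the 30.5 in the last row is an
# illustrative change so that one row has more than one match.
df = pd.DataFrame(
    [[10, 10.2, 16, 28, 30], [20, 27, 15, 14, 16], [30, 30.4, 30.5, 25, 35]],
    columns=["X", "A", "B", "C", "D"],
)


def matches_within(row, tol=0.02):
    """Return every value after X in the row that lies within tol * |X| of X."""
    x = row.iloc[0]
    return [v for v in row.iloc[1:] if abs(v - x) <= tol * abs(x)]


# Each row gets a (possibly empty) list of matching values.
df["matches"] = df.apply(matches_within, axis=1)
print(df)
```

Rows without any match get an empty list rather than NaN, so downstream code can test the list length instead of calling `isna()`.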
import pandas as pd
val = [[10, 10.2, 16, 28, 30], [20, 27, 15, 14, 16], [30, 30.4, 34, 35, 45]]
df = pd.DataFrame(val, columns=list('XABCD'))
# split df for convenience:
target_X = df['X']
df2 = df.drop('X', axis=1)
tol = 0.02
# keep only the values within tol * X of X, then take the smallest of them per row
# (rows with no value inside the tolerance become NaN):
df['Result'] = df2[df2.subtract(target_X, axis=0).abs().le(target_X * tol, axis=0)].min(axis=1)
print(df)
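A row-wise minimum returns the smallest value inside the tolerance band, which is not necessarily the one nearest to X when several columns qualify. The sketch below builds on the `np.isclose` idea from the question and picks the nearest qualifying value instead. It assumes the candidate columns are named explicitly (the `candidates` list is an assumption), which also means extra columns such as ID, OP, Vl, and Ch in the edited frame would simply be ignored.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[10, 10.2, 16, 28, 30], [20, 27, 15, 14, 16], [30, 30.4, 34, 35, 45]],
    columns=list("XABCD"),
)

candidates = ["A", "B", "C", "D"]  # assumed: the columns to compare against X
tol = 0.02

vals = df[candidates].to_numpy(dtype=float)  # shape (n_rows, n_candidates)
x = df[["X"]].to_numpy(dtype=float)          # shape (n_rows, 1), broadcasts per row

# True where |value - X| <= tol * |X|; atol=0 keeps the test purely relative.
mask = np.isclose(vals, x, rtol=tol, atol=0)

# Distance to X for qualifying cells, +inf elsewhere so they never win argmin.
dist = np.where(mask, np.abs(vals - x), np.inf)
best = dist.argmin(axis=1)
rows = np.arange(len(df))

# Nearest qualifying value per row, NaN where no column is within tolerance.
df["result"] = np.where(mask.any(axis=1), vals[rows, best], np.nan)
print(df)
```

Using `argmin` over a distance array avoids calling `idxmin` on rows that are entirely NaN, which behaves differently across pandas versions.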