
Finding the closest value in Pandas

X     A    B   C    D 
10  10.2  16   28  30 
20  27    15   14  16
30  30.4  34   35  45

I want to create a new column that shows the value from A, B, C or D when it is within 2% of 'X'.

Intended result:

X     A    B   C    D   result
10  10.2  16   28  30   10.2
20  27    15   14  16   NaN
30  30.4  34   35  45   30.4

I can identify values that are close to "X" with the following code, but I am not sure how to pull out the matching values from columns A, B, C and D:

cond = np.isclose(df.X, df['A'], rtol=0.02) | np.isclose(df.X, df['B'], rtol=0.02) | np.isclose(df.X, df['C'], rtol=0.02) | np.isclose(df.X, df['D'], rtol=0.02)

df['result'] = np.where(cond,#See note,np.nan)

#note: how do I put in the column value that meets the criteria?

Any help will be appreciated. Thanks!

Edit:

ID  OP     X   Vl  Ch  A    B   C    D   result
          10           10.2  16   28  30   10.2
          20           27    15   14  16   NaN
          30           30.4  34   35  45   30.4

The cells shown blank above actually contain data. ID is a string; the other columns are floats.

In [1]: df = pd.DataFrame([[10, 10.2, 16, 28, 30], [20, 27, 15, 14, 16], [30, 30.4, 34, 25, 35]], columns=["X", "A", "B", "C", "D"])

In [2]: def f(x):
     ...:     # x is one row: x.iloc[0] is X, the remaining entries are the candidates.
     ...:     # If there are multiple values that meet the condition, modify here
     ...:     c = None  # []
     ...:     for i in x.iloc[1:]:
     ...:         # accept values within 2% of X, above or below
     ...:         if abs(i - x.iloc[0]) <= 0.02 * abs(x.iloc[0]):
     ...:             c = i
     ...:             break
     ...:             # c.append(i)
     ...:     return c
     ...:

In [3]: df['result'] = df.apply(f, axis=1)

In [4]: df
Out[4]:
    X     A   B   C   D  result
0  10  10.2  16  28  30    10.2
1  20  27.0  15  14  16     NaN
2  30  30.4  34  25  35    30.4
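For comparison, here is a minimal vectorized sketch that finishes the np.isclose idea from the question instead of looping over rows. It is only one possible approach, not the answerer's method. The `candidates` list is an assumption: it names the columns to compare with X, so extra columns such as ID or OP from the edit are ignored. Note that np.isclose(a, b, rtol=r) tests |a - b| <= atol + r * |b|, so X is passed second to make the tolerance relative to X.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[10, 10.2, 16, 28, 30], [20, 27, 15, 14, 16], [30, 30.4, 34, 35, 45]],
    columns=list("XABCD"),
)

# Assumption: only these columns are compared with X.
candidates = ["A", "B", "C", "D"]

x = df["X"].to_numpy(dtype=float)[:, None]      # shape (n, 1), broadcasts per row
vals = df[candidates].to_numpy(dtype=float)     # shape (n, 4)

# True where a candidate is within 2% of X (tolerance relative to X).
mask = np.isclose(vals, x, rtol=0.02, atol=0.0)

# Blank out non-matching cells, then take the first match in each row.
matches = pd.DataFrame(np.where(mask, vals, np.nan), index=df.index, columns=candidates)
df["result"] = matches.bfill(axis=1).iloc[:, 0]
print(df)
```

Rows with no candidate inside the tolerance end up as NaN. If a row has several matches, this picks the leftmost one, the same as the `break` in the loop above.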

import pandas as pd

val = [[10, 10.2, 16, 28, 30], [20, 27, 15, 14, 16], [30, 30.4, 34, 35, 45]]
df = pd.DataFrame(val, columns=list('XABCD'))

# split df for convenience:
target_X = df['X']
df2 = df.drop('X', axis=1)
tol = 0.02

# keep only the values within tol of X (the others become NaN),
# then take the smallest surviving value in each row:
within_tol = df2.sub(target_X, axis=0).abs().le(target_X * tol, axis=0)
df['Result'] = df2[within_tol].min(axis=1)
print(df)
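Taking the row minimum returns the smallest value inside the tolerance, which is not always the one nearest to X when several candidates qualify. A rough sketch of picking the nearest one instead is shown below. The fourth row of data and the column name `closest` are made up for illustration, and the code assumes the candidate columns contain no NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        [10, 10.2, 16, 28, 30],
        [20, 27, 15, 14, 16],
        [30, 30.4, 34, 35, 45],
        [10, 9.8, 10.1, 28, 30],  # illustrative row: two candidates within 2%
    ],
    columns=list("XABCD"),
)
candidates = ["A", "B", "C", "D"]  # assumption: columns to compare with X
tol = 0.02

x = df["X"].to_numpy(dtype=float)
vals = df[candidates].to_numpy(dtype=float)

# Absolute distance of every candidate from X, row by row.
dist = np.abs(vals - x[:, None])

# Position and value of the nearest candidate in each row.
pos = dist.argmin(axis=1)
nearest = vals[np.arange(len(df)), pos]

# Keep the nearest candidate only if it lies within tol of X.
within = dist.min(axis=1) <= tol * np.abs(x)
df["closest"] = np.where(within, nearest, np.nan)
print(df)
```

In the illustrative fourth row both 9.8 and 10.1 are within 2% of 10; the row minimum would return 9.8, while this sketch returns 10.1.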
