Look up data from a row in a Pandas DataFrame based on a calculated value?

Question

As an extension of my previous question, I would like to take a DataFrame like the one below and, for each row, find the correct row from which to pull data from column C and place it into column D, based on the following criteria:

B_new = 2*A_old - B_old, i.e. the new row needs to have a B equal to the following result from the old row: 2*A - B.
A is the same, i.e. A in the new row should have the same value as in the old row.
Any values not found should use a NaN result.

Code:

import pandas as pd
a = [2,2,2,3,3,3,3]
b = [1,2,3,1,3,4,5]
c = [0,1,2,3,4,5,6]

df = pd.DataFrame({'A': a , 'B': b, 'C':c})
print(df)

   A  B  C
0  2  1  0
1  2  2  1
2  2  3  2
3  3  1  3
4  3  3  4
5  3  4  5
6  3  5  6

Desired output:

   A  B  C    D
0  2  1  0  2.0
1  2  2  1  1.0
2  2  3  2  0.0
3  3  1  3  6.0
4  3  3  4  4.0
5  3  4  5  NaN
6  3  5  6  3.0
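
The criteria above can be traced by hand before looking for a fast solution. The following is only a minimal sketch of that check, not part of the original question; the names target_b and wanted are introduced here for illustration. It computes the B value each row has to find and shows why row 5 ends up as NaN.

```python
import pandas as pd

df = pd.DataFrame({'A': [2, 2, 2, 3, 3, 3, 3],
                   'B': [1, 2, 3, 1, 3, 4, 5],
                   'C': [0, 1, 2, 3, 4, 5, 6]})

# B value the matching row must have (first criterion: B_new = 2*A_old - B_old)
target_b = 2 * df['A'] - df['B']
print(target_b.tolist())   # [3, 2, 1, 5, 3, 2, 1]

# Row 5 has A=3, B=4, so it needs a row with A=3 and B=2.
# No such row exists, which is why D is NaN for row 5.
wanted = (df['A'] == df.loc[5, 'A']) & (df['B'] == target_b[5])
print(wanted.any())        # False
```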

Based on the solutions to my previous question, I've come up with a method that uses a for loop to move through each unique value of A:

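for i in df.A.unique():
    mask = df.A == i
    # Map each B in this group to its C
    mapping = dict(df.loc[mask, ['B', 'C']].values)
    # Look up C at B_new = 2*A - B; keys with no match become NaN
    df.loc[mask, 'D'] = (2 * df.loc[mask, 'A'] - df.loc[mask, 'B']).map(mapping)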

However, this seems clunky, and I suspect there is a better way that doesn't use for loops, which in my experience tend to be slow.

Question: What's the fastest way to accomplish this transfer of data within the DataFrame?

Answer 1

You could do this with a left merge on the computed key:

In [370]: (df[['A', 'C']].assign(B=2*df.A - df.B)
           .merge(df, how='left', on=['A', 'B'])
           .assign(B=df.B)
           .rename(columns={'C_x': 'C', 'C_y': 'D'}) )
Out[370]:
   A  C  B    D
0  2  0  1  2.0
1  2  1  2  1.0
2  2  2  3  0.0
3  3  3  1  6.0
4  3  4  3  4.0
5  3  5  4  NaN
6  3  6  5  3.0
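
The merge above returns the columns in the order A, C, B, D. If the goal is to add D to the existing frame and keep the A, B, C order, one possible variation is sketched below. It is not the answer's exact code: the names keys and right are introduced here, and it assumes every (A, B) pair occurs only once in df, which validate='many_to_one' checks.

```python
import pandas as pd

df = pd.DataFrame({'A': [2, 2, 2, 3, 3, 3, 3],
                   'B': [1, 2, 3, 1, 3, 4, 5],
                   'C': [0, 1, 2, 3, 4, 5, 6]})

# Left side: the (A, B_new) pair each row wants to look up
keys = df.assign(B=2 * df['A'] - df['B'])[['A', 'B']]

# Right side: the original rows, with C renamed to the output column D
right = df.rename(columns={'C': 'D'})

# A left merge keeps the row order of keys; unmatched pairs give NaN.
# validate raises an error if (A, B) is not unique on the right side.
df['D'] = keys.merge(right, how='left', on=['A', 'B'],
                     validate='many_to_one')['D'].to_numpy()
print(df)
```

Without the uniqueness assumption a left merge can return more rows than df has, and the assignment to df['D'] would then fail with a length mismatch.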

Details:

In [372]: df[['A', 'C']].assign(B=2*df.A - df.B)
Out[372]:
   A  C  B
0  2  0  3
1  2  1  2
2  2  2  1
3  3  3  5
4  3  4  3
5  3  5  2
6  3  6  1

In [373]: df[['A', 'C']].assign(B=2*df.A - df.B).merge(df, how='left', on=['A', 'B'])
Out[373]:
   A  C_x  B  C_y
0  2    0  3  2.0
1  2    1  2  1.0
2  2    2  1  0.0
3  3    3  5  6.0
4  3    4  3  4.0
5  3    5  2  NaN
6  3    6  1  3.0
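
To see which computed keys had no partner row, the same merge can be run with the indicator option of merge, which adds a _merge column. This is a small sketch added for illustration, not part of the original answer; the names left, merged and unmatched are introduced here.

```python
import pandas as pd

df = pd.DataFrame({'A': [2, 2, 2, 3, 3, 3, 3],
                   'B': [1, 2, 3, 1, 3, 4, 5],
                   'C': [0, 1, 2, 3, 4, 5, 6]})

# Same first step as in the details above: replace B with 2*A - B
left = df[['A', 'C']].assign(B=2 * df['A'] - df['B'])

# indicator=True marks each row as 'both' (match found) or 'left_only' (no match)
merged = left.merge(df, how='left', on=['A', 'B'], indicator=True)

# Rows whose (A, B) key does not exist in df; these are the NaN rows
unmatched = merged.loc[merged['_merge'] == 'left_only', ['A', 'B']]
print(unmatched)
```

For the sample data only row 5 is unmatched, since it looks for A=3, B=2.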

Solution 1
1 2017-08-30 19:54:19
